From jlam at iunknown.com Wed Dec 1 00:50:24 1999
From: jlam at iunknown.com (John Lam)
Date: Mon Jun 7 17:18:07 2004
Subject: XML4J EA2 --> Xerces-J 1.0
Message-ID: <1B79E83E7849174A813044A2E56F78040C09@AROD.iunknown.com>

Will IBM continue development of XML4J independently of Xerces-J? Or will
Xerces-J be the "official" version of that source code base?

-John

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
Mike Pogue
Sent: Tuesday, November 30, 1999 12:38 PM
To: xml-dev@ic.ac.uk
Subject: Re: XML4J EA2 --> Xerces-J 1.0

Eric Ulevik wrote:

>From: Mike Pogue
>> The Xerces-J parser (the Apache name for what IBM calls XML4J EA2) is
>> both compliant (including passing one test that we disagree with your
>> interpretation of the spec on), and is freely available. Both source
>> code and binaries for Xerces-J version 1 are available at
>> http://xml.apache.org, with updates done frequently.

>I haven't seen any updates. Just the original release. Am I mistaken?

The *very* latest source (including new functionality, and some late
breaking performance and memory enhancements) is available via anonymous
cvs (see http://xml.apache.org for details).

We'll be bundling the latest source code up into a more formal release
(zip file, tarball) shortly (it's being tested right now).

Mike

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From lindsey at diac.com Wed Dec 1 00:58:34 1999
From: lindsey at diac.com (William Lindsey)
Date: Mon Jun 7 17:18:07 2004
Subject: XFM (or something similar)
In-Reply-To: <3.0.32.19991130114807.01475710@pop.intergate.ca>
Message-ID:

Sean McGrath wrote:
> >Do xml-dev'ers think XFM is a good idea?

Tim Bray replied:
> I think having a way for an instance to promise it references no external
> entities is a no-brainer.

[ ... snip ... ]

Should we invent yet another way for the instance to tell us about
itself? We already have the BOM, the XML declaration, the Document Type
declaration, and the XML-Stylesheet PI. I guess it hasn't been decided
how an instance is associated with a W3C Schema.

Maybe we should investigate a more general way to specify all
this stuff externally. It seems to fit within the scope of
the problem Tim outlines in "Related-Resource Discovery for XML" [1].
Is there a W3C XML packaging activity?

Best,
Bill

[1] http://www.textuality.com/xml/why-pkg.html

From tbray at textuality.com Wed Dec 1 01:17:31 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:18:07 2004
Subject: XFM (or something similar)
Message-ID: <3.0.32.19991130171720.014ac9e0@pop.intergate.ca>

At 05:57 PM 11/30/99 -0700, William Lindsey wrote:
>Maybe we should investigate a more general way to specify all
>this stuff externally. It seems to fit within the scope of
>the problem Tim outlines in "Related-Resource Discovery for XML" [1].
>Is there a W3C XML packaging activity?

Working on it. I think it's important. Will know more at the end
of next week. -Tim

From jtauber at jtauber.com Wed Dec 1 01:55:06 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:18:07 2004
Subject: XML processing instruction survey
References: <000b01bf3b87$b13c5c50$0f36a8c0@quokka.com>
Message-ID: <016c01bf3b9f$19a5fa50$eb020a0a@bowstreet.com>

> I'm interested in the extent to which people are actually using the XML
> processing instruction (<?xml ...?>) and the extent to which
> they find it useful.

You mean the XML Declaration?

It is clearly useful if you use a different character encoding to UTF-8
(hence also US-ASCII) or UTF-16.

James

From david at megginson.com Wed Dec 1 02:41:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:07 2004
Subject: XML processing instruction survey
In-Reply-To: "James Tauber"'s message of "Tue, 30 Nov 1999 20:55:07 -0500"
References: <000b01bf3b87$b13c5c50$0f36a8c0@quokka.com>
 <016c01bf3b9f$19a5fa50$eb020a0a@bowstreet.com>
Message-ID:

"James Tauber" writes:

> > I'm interested in the extent to which people are actually using the XML
> > processing instruction (<?xml ...?>) and the extent to which
> > they find it useful.
>
> You mean the XML Declaration?
>
> It is clearly useful if you use a different character encoding to UTF-8
> (hence also US-ASCII) or UTF-16.

The XML Declaration (not a PI) is also very useful for
forwards-compatibility, so that in the future XML 1.1 (etc.) parsers
will know that they are dealing with XML 1.0 and can either [a] apply
the appropriate rules or [b] die gracefully, depending on the
requirements of future XML specs, if any.

The bad news is that level-three browsers do ugly things when they see
the XML declaration, but they do ugly things with lots of XML syntax
(since they're not actually XML browsers).

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/
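A minimal sketch of the encoding point raised in this thread, assuming Python's standard-library `xml.etree.ElementTree` as a stand-in parser (the thread names no particular tool) and a made-up sample document. It only shows that a declaration naming a non-UTF-8 encoding lets the parser decode bytes it would otherwise reject.

```python
import xml.etree.ElementTree as ET

# "André" encoded as ISO-8859-1: the é becomes the single byte 0xE9,
# which is not a valid UTF-8 sequence on its own.
payload = "<name>André</name>".encode("iso-8859-1")

# With an XML declaration naming the encoding, the parser can decode it.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + payload
name_with_declaration = ET.fromstring(declared).text

# Without a declaration (and with no byte order mark) the parser falls
# back to UTF-8, so the same bytes are rejected as not well-formed.
try:
    ET.fromstring(payload)
    rejected_without_declaration = False
except ET.ParseError:
    rejected_without_declaration = True
```

The version pseudo-attribute is carried in the same declaration, which is the forwards-compatibility hook mentioned above; the sketch does not exercise it.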
From jtauber at jtauber.com Wed Dec 1 02:42:56 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:18:08 2004
Subject: joe stephenson
Message-ID: <026501bf3ba5$c7551db0$eb020a0a@bowstreet.com>

I thought Joe Stephenson was supposed to be back today.

James :-)

From tbray at textuality.com Wed Dec 1 02:55:57 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:18:08 2004
Subject: XML processing instruction survey
Message-ID: <3.0.32.19991130185652.01516ec0@pop.intergate.ca>

At 03:07 PM 11/30/99 -0800, Jeffrey E. Sussna wrote:
>I'm interested in the extent to which people are actually using the XML
>processing instruction (<?xml ...?>) and the extent to which
>they find it useful.

It's not really designed for people. It's mostly designed for use
by the XML processor to help figure out the encoding and make sure that
this is really XML.

I'd think that using it at the application level would be not only
uncommon but probably unwise. I'd be interested to hear any positive
responses to the query. -T.

From martind at netfolder.com Wed Dec 1 03:24:58 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:18:08 2004
Subject: SML - a vote against
In-Reply-To: <3.0.32.19991130100957.01cb0330@nexus.webmethods.com>
Message-ID:

Hi Joe,

Joe said:
Instead, what I'd like to see is the codification of subsets and
recommendations for domains of use of these subsets. For example, in the
domain of XML for business messaging, if not for all of XML-for-data, I'd
like to see a formal recommendation to avoid both entity declarations and
mixed content. I'd like to make it easy for someone who knows their domain
of use to identify exactly what they need to learn about XML and to learn
just those pieces. Applications could advertise conformance with various
recommendations to ease both learning to use the application and
integrating with the application.

Didier reply:
Independently of SML, there is a need for a messaging convention;
otherwise this convention is defined by a manufacturer, as Microsoft is
trying with Biztalk. The whole thread I tried to bring out about meta
data and messages is about this. Biztalk, after all, only adds some meta
data to an XML document: for instance, for what/whom is this message,
from what/whom is this message, which process is this message part of,
etc. All these things are in fact meta information about the document
transported from A to B. However, to realize this, what is missing now
in the XML framework is:

a) the ability to validate a document fragment or the whole document as
an aggregation of fragments.

If we get that, then it will be possible to build a message that would
include the meta data about a document and the document itself in the
same text "package". Of course, this is a need for e-commerce
transactions and probably not something that could be applied to other
kinds of documents.

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

From jlapp at webmethods.com Wed Dec 1 05:49:02 1999
From: jlapp at webmethods.com (Joe Lapp)
Date: Mon Jun 7 17:18:08 2004
Subject: XML processing instruction survey
In-Reply-To: <3.0.32.19991130185652.01516ec0@pop.intergate.ca>
Message-ID: <199912010548.VAA24908@hawk.prod.itd.earthlink.net>

We parse both XML and HTML, and you can configure whether to use the
presence of the declaration to make the distinction.

I've always dreamt of having an indicator in this declaration that tells
me whether the document includes any GE references besides refs to the
predefined GEs. I can get better throughput when I know they aren't
there, and right now you have to configure the behavior up front. I'd
really like to autodetect on a per document basis.

... always pushing to squeeze through a few more docs per sec.

At 06:56 PM 11/30/1999 -0800, Tim Bray wrote:
>At 03:07 PM 11/30/99 -0800, Jeffrey E. Sussna wrote:
>>I'm interested in the extent to which people are actually using the XML
>>processing instruction (<?xml ...?>) and the extent to which
>>they find it useful.
>
>It's not really designed for people. It's mostly designed for use
>by the XML processor to help figure out the encoding and make sure that
>this is really XML.
>
>I'd think that using it at the application level would be not only
>uncommon but probably unwise. I'd be interested to hear any positive
>responses to the query. -T.

--
Joe Lapp                (Looking for some good people to help design
Senior Engineer          and build the Internet's business-to-business
webMethods, Inc.         XML infrastructure. We are 100% Java.)
jlapp@webMethods.com    http://www.webMethods.com

From varun at chennai.tcs.co.in Wed Dec 1 09:37:13 1999
From: varun at chennai.tcs.co.in (V Arun Kumar)
Date: Mon Jun 7 17:18:08 2004
Subject: creating a DOM tree from streams of tagged data?????????
Message-ID: <6525683A.0034B3B9.00@MAILSERVER2.chennai.tcs.co.in>

I'm using Sun's parser.
I am able to form a DOM tree provided I read from an XML file using the
parser.

My problem goes like this:
I have an HTML page in the front end.
Upon submitting, a stream of XML data is sent to a servlet.
Now, how can I construct a DOM tree from this string of XML data?
Any help would be appreciated.
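One hedged illustration of building a DOM tree from XML that is already in memory rather than in a file. The thread concerns Sun's Java parser; this sketch instead assumes Python's standard-library `xml.dom.minidom`, and the sample document is invented. The general idea is the same in either setting: hand the parser a string or stream instead of a filename.

```python
import io
from xml.dom import minidom

# Hypothetical XML text, standing in for data received in a request body.
xml_text = '<order id="42"><item>widget</item></order>'

# Parse straight from the in-memory string; no file is involved.
doc = minidom.parseString(xml_text)

# Equivalent route through a byte stream, for data that arrives as a stream.
doc_from_stream = minidom.parse(io.BytesIO(xml_text.encode("utf-8")))

# Walk the resulting DOM tree.
root = doc.documentElement
item_text = root.getElementsByTagName("item")[0].firstChild.data
```

Here `root.tagName` is the document element's name and `item_text` is the character data of the first `item` element.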
From philipnye at freenet.co.uk Wed Dec 1 11:13:15 1999
From: philipnye at freenet.co.uk (Philip Nye)
Date: Mon Jun 7 17:18:08 2004
Subject: SML - an alternative
References: <013601bf3b5c$d5b1e9e0$e9d9f2cc@omicron.com>
Message-ID: <384502F0.5D7EA2E0@freenet.co.uk>

"Stephen T. Mohr" wrote:
>
> People who have built parsers claim the alleged complexity of XML isn't a
> problem for them *as XML stands*. Not having built one of my own, I'll take
> their word for it.

Then why do most XML parsers:

a. choose to exclude an arbitrary set of features e.g. PIs, external
   entities etc.?
b. consistently fail conformity tests?

This is the foundation on which a huge superstructure is rapidly being
built willy-nilly.

Philip
--
Philip Nye

Engineering Arts
72 Herberton Road ~ Bournemouth BH6 5HZ ~ UK
tel +44 (0)1202 418236 ~ fax +44 (0)1202 418676
mailto:philipnye@freenet.co.uk

From IndrajitC at catsglobal.co.in Wed Dec 1 12:46:14 1999
From: IndrajitC at catsglobal.co.in (Indrajit Chaudhuri)
Date: Mon Jun 7 17:18:08 2004
Subject: creating a DOM tree from streams of tagged data?????????
References: <6525683A.0034B3B9.00@MAILSERVER2.chennai.tcs.co.in>
Message-ID: <384518C0.9DCA0F5E@catsglobal.co.in>

First create an InputSource object with the byte/character stream and
then pass it to the parser. Examples are in the API docs that come with
the parser.

Thanks,
Indrajit

V Arun Kumar wrote:
>
> I'm using Sun's parser.
> I am able to form a DOM tree provided I read from an XML file using the
> parser.
>
> My problem goes like this:
> I have an HTML page in the front end.
> Upon submitting, a stream of XML data is sent to a servlet.
> Now, how can I construct a DOM tree from this string of XML data?
> Any help would be appreciated.

From tshw at capitalmarketscompany.com Wed Dec 1 14:23:51 1999
From: tshw at capitalmarketscompany.com (Shaw Tim)
Date: Mon Jun 7 17:18:08 2004
Subject: NOT SML but XML-for-data and business
Message-ID:

I am an Application tool/framework developer/integrator. I sit (sort of)
half-way between XML Parser-writers and XML tool-vendors.

I developed a pseudo-DOM system, using the DocumentHandler mechanism of
any available SAX parser, which creates 'lightweight' DOM objects
specifically designed for data-processing. The DOM objects implement the
required interface, but not all of the methods 'do' things. Another set
of interfaces provides a 'data-oriented' view on the DOM structures (and
it is this which the lwDOM is optimised for). This allows me to use
DOM-based tools (XSLT etc), but removes the necessity for programmers to
learn/use the DOM directly.

I see XML Schema (among other things) as providing great opportunities
in this domain - data-types/constraints/ranges etc., but I don't have a
clear view of how this will be integrated.

I have 3 immediate questions (not all of which need to be answered at
once :-) :

1) how are the (real) XML developers approaching XML Schema? Will it be
transparent to applications, and just constrain the data as per DTD's or
will the meta-data be available - if so, in what form?

2) given the fragmentation of DTD definitions in any given market (mine
particularly with FpML, FIXML, FinML, BizTalk etc.) is anyone addressing
general tools for mapping between these - and what are the issues that
are being tackled?

3) are there any discussions going on about how to map between different
formats programmatically? I can imagine RDF being used, but then there
needs to be agreement on meta-meta-data(!) - and I can't imagine (see 2
above) people agreeing on anything much when it's a core business
differentiator (read tie-in/revenue opportunity) to have a proprietary
format.

Thanks

tim

From martind at netfolder.com Wed Dec 1 15:13:50 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:18:08 2004
Subject: NOT SML but XML-for-data and business
In-Reply-To:
Message-ID:

Hi Shaw,

Shaw said:
1) how are the (real) XML developers approaching XML Schema? Will it be
transparent to applications, and just constrain the data as per DTD's or
will the meta-data be available - if so, in what form?

Didier reply:
Actually we have a problem. There are no recommendations about how to
link a document/fragment with its schema, nor any recommendation about
the validation rules for documents aggregating fragments (each fragment
having its own schema). I am anxious to see what the XML schema group
will present at XML 99 as an answer to these questions or to these
unfulfilled needs.

Shaw said:
2) given the fragmentation of DTD definitions in any given market (mine
particularly with FpML, FIXML, FinML, BizTalk etc.) is anyone addressing
general tools for mapping between these - and what are the issues that
are being tackled?

Didier reply:
We are using XSLT to translate from one to the other. We are currently
working on a meta model that could be mapped to these different
languages. However, this is a lot of work even for a specific domain
(the finance domain). But if we reach our goal, it will be easier to
translate from this meta model (or meta language, which is XML based) to
any other particular XML based language with XSLT.

Shaw said:
3) are there any discussions going on about how to map between different
formats programmatically? I can imagine RDF being used, but then there
needs to be agreement on meta-meta-data(!) - and I can't imagine (see 2
above) people agreeing on anything much when it's a core business
differentiator (read tie-in/revenue opportunity) to have a proprietary
format.

Didier reply:
We tried RDF, but RDF is a completely different data model and we
discovered that creating a meta data model in RDF is simply not
practical. We discovered that it is a lot easier if the meta domain
language is simply an element hierarchy. Not necessarily easier for
machines but definitely easier for humans. And believe me, reducing
errors is quite important. The problem is that errors will be introduced
by humans. If the system is too complex, we have errors. So we
re-discovered what Ben Shneiderman discovered several years ago. Thus,
from the software psychology point of view, we discovered that using RDF
for translation is error-prone and that using a master schema translated
into other schemas is less error-prone. However, this experience is
limited to our group and it would be interesting to see what others got
as a result. Of course, this is possible only if they keep track of
their process and have in place a learning mechanism to fine tune these
processes.

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com
From rick at activated.com Wed Dec 1 15:37:25 1999
From: rick at activated.com (Rick Ross)
Date: Mon Jun 7 17:18:08 2004
Subject: [ANNOUNCE] EZ/X - XML/XSL Processor Preview Now Available
Message-ID: <384540A5.7486C26C@activated.com>

***********************************************************
***********************************************************
EZ/X - XML/XSL Processor Preview Now Available

Download EZ/X now:

web =======> http://www.activated.com/download/ezx.zip
             http://www.activated.com/download/ezx.tar.gz

ftp =======> ftp://ftp.activated.com/ezx.zip
             ftp://ftp.activated.com/ezx.tar.gz

feedback ==> mailto:ezx-feedback@activated.com
***********************************************************
***********************************************************

Activated Intelligence (http://activated.com) invites you to preview our
EZ/X suite of core XML tools for Java. EZ/X combines world-class XML
parsing and XSL processing in a compact, pure Java package. We're
seeking a major industry partner who can leverage EZ/X as part of its
XML leadership strategy (and hopefully make it FREE to you!) We welcome
your insights about how Activated should move forward with this product.

EZ/X has been in production use for over a year at the JavaLobby
(http://javalobby.org) - which was probably the world's first 100%
dynamically generated XML/XSL site. EZ/X has delivered consistently
there under grueling circumstances and extremely heavy loads. We've
worked hard to make EZ/X fast, reliable, and conformant to prevailing
standards.

Preliminary testing by Activated and third parties suggests that EZ/X
should give a great performance boost to your mission-critical XML
projects. XSL processing with EZ/X is usually 2-3 times faster than
Lotus/IBM/Apache or Oracle, and even faster than that when dealing with
complex XML/XSL.

We hope EZ/X works as well for you as it has for us, and we hope it
helps propel your success with XML. We look forward to your comments,
ideas and suggestions, and we urge you to send them to
ezx-feedback@activated.com (mailto:ezx-feedback@activated.com). If you
like what you see, then please let us know - your support makes all the
difference.

Best regards,

The Activated EZ/X Team
------------------------
Activated Intelligence
http://www.activated.com
(919) 678-0300

From Joel_Sherriff at compuware.com Wed Dec 1 15:33:39 1999
From: Joel_Sherriff at compuware.com (Sherriff, Joel)
Date: Mon Jun 7 17:18:09 2004
Subject: Some questions
Message-ID:

I've posted a variation of this to comp.text.xml also, so if it looks
familiar, that's where you saw it...

I'm writing an xml document analysis tool and would appreciate any links
to xml that contain binary entities, complicated stylesheets (css or
xsl) that contain external dependencies, or any other use of external
dependencies. I've poked around www.xmltree.com and, though there are
quite a few xml links, I've yet to find any that are more than text. The
purpose of the analysis tool is to list any links within the xml to
outside documents and any URLs that must be read to fully process the
xml (i.e., external DTD, stylesheet, GIF images, etc.).

Can somebody explain, in a nutshell, the purpose of RDF. After reading the
spec, I don't feel any more educated than before.

Because I need the analysis tool to be lightweight and fast (it'll be a
component in a load-testing tool), and need to support as many different
standards as possible, I've implemented it using lex, as opposed to one
of the available parsers. However, after looking at a few RDF examples
it appears that RDF can be used to "construct" URLs, which looks to be a
trouble spot for me. Can any of the experts on the list think of any
other potential trouble spots?

From beavis at proteometrics.com Wed Dec 1 16:11:02 1999
From: beavis at proteometrics.com (Ronald Beavis)
Date: Mon Jun 7 17:18:09 2004
Subject: No subject
Message-ID: <004901bf3c17$93d64230$8c3770cc@pmc2>

unsubscribe

From tbray at textuality.com Wed Dec 1 16:18:28 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:18:09 2004
Subject: Some questions
Message-ID: <3.0.32.19991201080509.0132f800@pop.intergate.ca>

At 10:34 AM 12/1/99 -0500, Sherriff, Joel wrote:
>Can somebody explain, in a nutshell, the purpose of RDF. After reading the
>spec, I don't feel any more educated than before.
My attempt to explain RDF is at http://www.xml.com/xml/pub/98/06/rdf.html -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From RDaniel at DATAFUSION.net Wed Dec 1 20:23:04 1999 From: RDaniel at DATAFUSION.net (Ron Daniel) Date: Mon Jun 7 17:18:09 2004 Subject: Some questions Message-ID: <0D611E39F997D0119F9100A0C931315C52FB4A@datafusionnt1> Tim Bray said: At 10:34 AM 12/1/99 -0500, Sherriff, Joel wrote: >Can somebody explain, in a nutshell, the purpose of RDF. After reading the >spec, I don't feel any more educated than before. My attempt to explain RDF is at http://www.xml.com/xml/pub/98/06/rdf.html -Tim You may also want to take a look at Tim Berners-Lee's document on "Describing and Exchanging Data": http://www.w3.org/1999/04/WebData but here is my attempt to explain RDF's purpose in a nutshell: The purpose of RDF is to provide metadata (data about other data) in a manner that is very easy to process by machines. and here is my attempt to give an example of why RDF is useful: Assume you have a bunch of XML documents from a variety of sources, many of which contain an element, and your job is to build a simple card-catalog style database so that you can search by author and get the documents written by that person. Also assume you have an XML-aware version of a tool like grep that lets you search the documents for the element. Like grep, this tool prints the filename where a match was found. Unlike grep it prints the content of the matched element rather than a line. (This seems like a reasonable minimum functionality for an XML-aware grep-like tool). That tool should make the job easy. 
Search for elements, pipe the output to 'cut', and you can make a text file ready for import into your database. But there is one little hitch - you can't assume that the person identified in the element of file X is the author of file X. Maybe file X is saying that they are really the author of document Y. Without knowledge of the convention followed in each file you can't tell. And since the files came from a lot of sources, you are talking a lot of work to see what conventions are being followed. RDF does not leave this important information implicit. Each RDF statement has exactly three parts: Subject - the thing being talked about (the documents in the example above). Predicate - the type of statement being made about the subject (author in this example) Object - the value portion of the statement (the author name in this example). If the data was expressed in RDF, an RDF-aware grep-like tool would let you select all the RDF properties labeled "author", get the URIs of the resource and the name of the author, and plop that info into the database. There would be no ambiguity about the thing which was authored. This regularity in the form of expression is key to making the metadata easy to process by machines. Regards, Ron xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Dec 1 20:39:26 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:18:09 2004 Subject: Some questions Message-ID: <3.0.32.19991201124035.0153b920@pop.intergate.ca> At 12:22 PM 12/1/99 -0800, Ron Daniel wrote: >and here is my attempt to give an example of why RDF >is useful: Very good, Ron. Well said. 

And here is my attempt to explain why RDF hasn't been more successful:

  The syntax is hideously ugly and hard to understand, and the spec worries
  so hard about being correct and complete that it is pretty well 100%
  incomprehensible to ordinary people.

I probably just hurt some feelings, but I've already shouted this in private enough times that it won't be a surprise.

In my opinion RDF needs some serious sugar-coating and tutorializing if it is ever going to achieve its potential.

I think its potential is huge, dwarfing that of XML. -Tim

From LWatanab at JetForm.com Wed Dec 1 21:15:34 1999
From: LWatanab at JetForm.com (Larry Watanabe)
Date: Mon Jun 7 17:18:09 2004
Subject: Some questions
Message-ID: <111CF63B7D2ED211830000805F65A2FF01804961@OTTMAIL2>

Ron Daniel writes:
>RDF does not leave this important information implicit. Each
>RDF statement has exactly three parts:
>   Subject - the thing being talked about (the documents in
>     the example above).
>   Predicate - the type of statement being made about the subject
>     (author in this example)
>   Object - the value portion of the statement (the author name
>     in this example).
>If the data was expressed in RDF, an RDF-aware grep-like
>tool would let you select all the RDF properties labeled
>"author", get the URIs of the resource and the name of the
>author, and plop that info into the database. There would
>be no ambiguity about the thing which was authored.

This works fine for inherently binary relations, but for n-ary relations you end up reifying them by introducing a dummy node.
Matching against that dummy node will yield no matches, or only incorrect ones, since the names of the nodes are supposed to be new constants (or existentially quantified variables). To make that dummy node meaningful, you would have to match a wildcard against it and other relations. But then you're back to your original situation of not knowing what the relation means unless you have further knowledge of the semantics of the relations.

-Larry Watanabe
Jetform Corporation
lwatanab@jetform.com

From david at megginson.com Wed Dec 1 21:44:49 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:09 2004
Subject: Some questions
In-Reply-To: Tim Bray's message of "Wed, 01 Dec 1999 12:40:37 -0800"
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca>
Message-ID:

Tim Bray writes:
> And here is my attempt to explain why RDF hasn't been more successful:
>
> The syntax is hideously ugly and hard to understand, and the spec worries
> so hard about being correct and complete that it is pretty well 100%
> incomprehensible to ordinary people.
>
> I probably just hurt some feelings, but I've already shouted this in private
> enough times that it won't be a surprise.

I think Tim has shouted it in public as well.

It's a shame, because RDF is very nice for exchanging object-oriented information among loosely-coupled systems, and there's some good Perl and Java support for it already available (I'm sure the Python people will get in there quickly).

The problem is that the RDF-Syntax spec confounds even its bravest readers by trying to do two things at once:

a) define a model and syntax for exchanging object-oriented information in XML; and
b) apply the model and syntax to the problem domain of representing knowledge about Web pages.

Neither of those two things is brain-dead simple, but either alone could have been presented clearly and straightforwardly to an intelligent reader who knew the domain. Let this be a warning to us all to write our specs in clean, simple layers.

> In my opinion RDF needs some serious sugar-coating and tutorializing
> if it is ever going to achieve its potential.

And lots of software.

> I think its potential is huge, dwarfing that of XML. -Tim

Agreed. XML is just syntax, and as Tim (I think) has said, syntax is boring: XML simply represents a low-level syntactic layer that we all had to agree on and get out of the way so that we could move on to the tasty stuff. XML was never supposed to be the point of the whole exercise, any more than IP or TCP was supposed to be the point of the Internet or the Web.

RDF is much closer to that tasty stuff. The ability to exchange object-oriented information seamlessly among heterogeneous systems is very, very exciting -- it's something that CORBA promised and failed to deliver outside the enterprise, and now RDF (and XML) can take a shot at it.

All the best,
David

--
David Megginson david@megginson.com
http://www.megginson.com/

From wunder at infoseek.com Wed Dec 1 22:04:32 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:18:09 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201124035.0153b920@pop.intergate.ca>
Message-ID: <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com>

At 12:40 PM 12/1/99 -0800, Tim Bray wrote:
>
>And here is my attempt to explain why RDF hasn't been more successful:
>
> The syntax is hideously ugly and hard to understand, and the spec worries
> so hard about being correct and complete that it is pretty well 100%
> incomprehensible to ordinary people.

Agree. I've written a product that used MCF (RDF's predecessor) and written schemas for OODBs, and I can't make much sense of the RDF spec. Maybe it is semi-obvious to anyone with a background in knowledge representation, but it needs to be explained differently for the other 99.99% of us.

>I think its potential is huge, dwarfing that of XML. -Tim

I disagree on this one. It's rare that metacontent is more valuable than content, long-term. I'll bet on the books over the card catalog, every time.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/
http://www.best.com/~wunder/
1-408-543-6946

From matthew at praxis.cz Wed Dec 1 22:20:07 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:18:09 2004
Subject: Some questions
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca>
Message-ID: <38459FA4.DAA00E35@praxis.cz>

David Megginson wrote:
> It's a shame, because RDF is very nice for exchanging object-oriented
> information among loosely-coupled systems, and there's some good Perl
> and Java support for it already available (I'm sure the Python people
> will get in there quickly).

I can see the enormous interest in having a text-based format for exchanging object-oriented data. But can't this be done with a good object-oriented XML schema language, of which the current W3C seems to be a very good start? I read Tim's paper on xml.com, and he argues against the use of XML (without an additional syntax) for representing metadata with two arguments, both based on scalability. But if metadata are attached in a separate document, as RDF metadata would presumably be, they could be expressed just as concisely (and probably more so) by using an XML instance based on a simple (but extensible) XML schema. The latter would be preferable because an extensible schema language is an important tool in its own right.

Matthew

From jes at kuantech.com Wed Dec 1 22:37:34 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:18:09 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201124035.0153b920@pop.intergate.ca>
Message-ID: <000301bf3c4c$85bde920$0f36a8c0@quokka.com>

As an RDF user, I agree with all comments about the complexity of its syntax and the specification itself. I spent about a month reading and rereading the RDF spec before I concluded it really was as conceptually simple as it had appeared on first reading.

On the subject of its potential, I partly agree and partly disagree. Yes, RDF does move things up the semantic food chain. Yes, XML is kind of like a good orthogonal machine instruction set, which needs 3G, 4G, and 5G languages on top of it. But I still see RDF as being useful for metadata, not for every kind of object-oriented conversation you'd want to have. I wouldn't consider RDF at the same level as CORBA, but perhaps part of an overall solution.

Jeff

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk
> [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Tim Bray
> Sent: Wednesday, December 01, 1999 12:41 PM
> To: 'xml-dev@ic.ac.uk'
> Subject: RE: Some questions
>
> At 12:22 PM 12/1/99 -0800, Ron Daniel wrote:
> >and here is my attempt to give an example of why RDF
> >is useful:
>
> Very good, Ron. Well said.
>
> And here is my attempt to explain why RDF hasn't been more successful:
>
> The syntax is hideously ugly and hard to understand, and the spec worries
> so hard about being correct and complete that it is pretty well 100%
> incomprehensible to ordinary people.
>
> I probably just hurt some feelings, but I've already shouted this in private
> enough times that it won't be a surprise.
>
> In my opinion RDF needs some serious sugar-coating and tutorializing
> if it is ever going to achieve its potential.
>
> I think its potential is huge, dwarfing that of XML. -Tim

From jes at kuantech.com Wed Dec 1 22:39:55 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:18:09 2004
Subject: Some Questions
Message-ID: <000401bf3c4c$e033dea0$0f36a8c0@quokka.com>

I think the important thing to remember about RDF is that it is not XML. It is fundamentally an abstract model for expressing metadata. It happens to be representable using XML, but it is different from XML. Unfortunately, this distinction is part of what makes the spec hard to read, but it's important.

Jeff Sussna

From david at megginson.com Wed Dec 1 22:40:39 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:09 2004
Subject: Some questions
In-Reply-To: Walter Underwood's message of "Wed, 01 Dec 1999 14:02:47 -0800"
References: <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com>
Message-ID:

Walter Underwood writes:
> >I think its potential is huge, dwarfing that of XML. -Tim
>
> I disagree on this one. It's rare that metacontent is more
> valuable than content, long-term. I'll bet on the books over
> the card catalog, every time.

That's just the problem with the spec -- if you forget the word/prefix "meta" completely, RDF is just an XML format for object exchange; it just happens that one possible application of those objects is metadata, and the RDF-Syntax spec mixes the two together.

All the best,
David

--
David Megginson david@megginson.com
http://www.megginson.com/

From martind at netfolder.com Wed Dec 1 22:40:44 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
In-Reply-To:
Message-ID:

Hi David,

David said:
The problem is that the RDF-Syntax spec confounds even its bravest readers by trying to do two things at once: a) define a model and syntax for exchanging object-oriented information in XML; and b) apply the model and syntax to the problem domain of representing knowledge about Web pages.

Didier reply:
I guess that what is causing the confusion right at the beginning is the triad stuff. Instead, it would have been more useful to present the concept or the atomic unit as a record or an object without the methods. But, contrary to RDB records, there is an inheritance relationship between the RDF entities. So, instead of a model based on the triad "object property value" as an atom, it would have been a lot easier to say "object as a collection of properties/values". A schema is like a template; an object is just this template with its slots filled (values added to properties). A template can inherit from another template. But, from the RDF document point of view, what we always see is the objects and their associated collections of properties/values.

It's funny: one of the ancestors of RDF is MCF (not from Netscape but from Apple research/Talva; ref http://www.netfolder.com/SDK/MCF.htm and http://www.netfolder.com/SDK/MCF11.htm). This ancestor language was designed as a simple set of units, each unit having a collection of properties/values. It seems that instead of being simplified it just became more obscure.
It's sad; it is so easy to use when well explained and understood. Obviously the choice of words like "about" and "description" leads one to think of data about something instead of the data being _the_ something. This is why I use a structure like this:

http://www.netfolder.com ... etc...

What are the gains?
a) The object is location independent.
b) Its location is just another property (and in fact it is a property).
c) It is then simply an object without any reference; what gives references is the properties.
d) I can relate the object to others with properties.
e) It's easier to remember and understand.
f) This is the object, not an object about another object.

I discovered that in some cases I want to express certain relationships between objects, for instance a hierarchy. Let's say that I want to transfer the content of a directory service from one place to another; then in that case:

http://www.netfolder.com ... etc...

That way, all objects are transported as a small independent hierarchy. The hierarchical relationship is expressed with the string in the about attribute. And because we are used to expressing relative position in a hierarchy with "/", I use it. I didn't use it for other kinds of data structures.

I do not know what went wrong. Probably OCCAM was on vacation :-))

Cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences: Web Boston (http://www.mfweb.com)
             Markup 99 (http://www.gca.com)
Book to come soon: XML Pro published by Wrox Press
Products http://www.netfolder.com

From david at megginson.com Wed Dec 1 22:45:57 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
In-Reply-To: Matthew Gertner's message of "Wed, 01 Dec 1999 23:22:28 +0100"
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <38459FA4.DAA00E35@praxis.cz>
Message-ID:

Matthew Gertner writes:
> I can see the enormous interest in having a text-based format for
> exchanging object-oriented data. But can't this be done with a good
> object-oriented XML schema language, of which the current W3C
> seems to be a very good start?

Perhaps I'm a little confused, but I cannot see how the fact that a schema language itself happens to be object oriented allows you to do object exchange in XML (it doesn't hurt, but how can it help?). I've been confused before, so there's no need for panic.

All the best,
David

--
David Megginson david@megginson.com
http://www.megginson.com/

From elm at east.sun.com Wed Dec 1 22:58:47 1999
From: elm at east.sun.com (Eve L.
Maler)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
In-Reply-To:
References: <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com>
Message-ID: <4.2.0.58.19991201175459.00baea50@abnaki>

At 05:58 AM 10/30/99 -0400, David Megginson wrote:
>That's just the problem with the spec -- if you forget the word/prefix
>"meta" completely, RDF is just an XML format for object exchange; it
>just happens that one possible application of those objects is
>metadata, and the RDF-Syntax spec mixes the two together.

When people talk about RDF, the "meta" part is what I have trouble with in general. In what way is markup not metadata? In what way are element content and attribute values not also metadata (depending on what you do with them)? It feels weird for one particular data model to claim to have cornered the metadata market.

Eve

From jlapp at webMethods.com Wed Dec 1 23:21:52 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
Message-ID: <3.0.32.19991201182252.0283eec0@nexus.webmethods.com>

Hi Eve! Actually, I've brought a similar issue up with RDF group members on a few occasions. I've asked for help understanding why I'd choose the RDF syntax instead of inventing my own XML document type to represent the desired metadata.

Maybe someone on this list can help me with that.

At 05:59 PM 12/1/99 -0500, Eve L. Maler wrote:
>When people talk about RDF, the "meta" part is what I have trouble with in
>general. In what way is markup not metadata?
>In what way are element
>content and attribute values not also metadata (depending on what you do
>with them)? It feels weird for one particular data model to claim to have
>cornered the metadata market.
>
> Eve

--
Joe Lapp                    (Looking for some good people to
Senior Engineer              help create XML technologies that
http://www.webMethods.com    connect businesses to businesses
jlapp@webMethods.com         over the web.)

From tbray at textuality.com Wed Dec 1 23:45:32 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
Message-ID: <3.0.32.19991201154526.014c3710@pop.intergate.ca>

At 02:02 PM 12/1/99 -0800, Walter Underwood wrote:
>I disagree on this one. It's rare that metacontent is more
>valuable than content, long-term. I'll bet on the books over
>the card catalog, every time.

Wow, that's a profoundly deep and strong statement, and I think at the core of the argument that *should* be happening about how to make the Web a better place.

In fairness, it should be said that Walter works for a company whose search engine does the equivalent of reading all the pages of all the books on all the shelves, and trying to guess what the books mean. I used to be in that business myself.

But I think metadata wins. If you count hits on Internet search engines, the Yahoo and ODP directories, which are both human-constructed metadata, absolutely wipe out any fulltext search engine you can name, even though they have orders of magnitude fewer sites and a lower volume of information about each.
Because human-constructed metadata wins on the net just like it did in the library.

RDF is important because it can facilitate the interchange of, and a certain number of the common uses of, this kind of metadata.

Anyhow, just because you have a card catalogue doesn't mean you throw the library books away. -Tim

From tbray at textuality.com Wed Dec 1 23:45:29 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
Message-ID: <3.0.32.19991201154632.0152f350@pop.intergate.ca>

At 06:22 PM 12/1/99 -0500, Joe Lapp wrote:
>Hi Eve! Actually, I've brought a similar issue up with RDF group members
>on a few occasions. I've asked for help understanding why I'd choose the
>RDF syntax instead of inventing my own XML document type to represent the
>desired metadata.
>
>Maybe someone on this list can help me with that.

Because the same data structures and usage patterns keep coming back across wide ranges of metadata applications, even though the world isn't about to agree on common vocabularies. So there are huge gains to be had from a common data model and transfer syntax. -Tim

From jlapp at webMethods.com Wed Dec 1 23:52:16 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
Message-ID: <3.0.32.19991201185319.0191d680@nexus.webmethods.com>

At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
>Because the same data structures and usage patterns keep coming back across
>wide ranges of metadata applications, even though the world isn't about
>to agree on common vocabularies. So there are huge gains to be had from
>a common data model and transfer syntax. -Tim

That's a very strong motivation. But we have to balance that with another very strong motivation: making the documents easy to understand by the people who need to work with them. By designing your own doctype you can tailor the structure and the language to suit the target audience.

RDF may be simple at heart, but is it reasonable to ask the average user to figure it out, to expect that the average user of metadata will even be able to grok the abstractions? I may be reiterating your earlier sentiment, but I worry that the abstractions are as much an impediment as the spec and the syntax.

--
Joe Lapp                    (Looking for some good people to
Senior Engineer              help create XML technologies that
http://www.webMethods.com    connect businesses to businesses
jlapp@webMethods.com         over the web.)

From elm at east.sun.com Thu Dec 2 00:00:26 1999
From: elm at east.sun.com (Eve L. Maler)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201154632.0152f350@pop.intergate.ca>
Message-ID: <4.2.0.58.19991201185138.009f5a90@abnaki>

At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
>Because the same data structures and usage patterns keep coming back across
>wide ranges of metadata applications, even though the world isn't about
>to agree on common vocabularies. So there are huge gains to be had from
>a common data model and transfer syntax. -Tim

Not that I don't respect RDF's power, but personally, I think the key *is* common vocabularies. We may have to start small, and they may just be hub formats that get mapped to/from a lot, but agreeing on semantics is the pill that has to be swallowed. Even RDF depends on this, particularly on an open system such as the Web where you can't really control or influence the habits of content creators. If you want to indicate that you are the author of a certain page, at the very least you have to refer to a widely understood "author" semantic in order for author-criterion searching to be of any use to your audience. Whether it's an RDF property or a well-known namespace or whatever doesn't seem to matter as much.

Eve
--
Eve Maler Sun Microsystems
elm @ east.sun.com +1 781 442 3190

From jes at kuantech.com Thu Dec 2 00:14:46 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201185319.0191d680@nexus.webmethods.com>
Message-ID: <000501bf3c5a$26970d60$0f36a8c0@quokka.com>

It is not reasonable to ask the user to figure it out. RDF, along with much of XML, is not really suited (or at least in RDF's case, intended) for direct human access. Remember that the goal of RDF is to make it easy for MACHINES to process metadata. Users should be able to use tools that hide the details of RDF.

Jeff

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk
> [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Joe Lapp
> Sent: Wednesday, December 01, 1999 3:53 PM
> To: xml-dev@ic.ac.uk
> Subject: Re: Some questions
>
> At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
> >Because the same data structures and usage patterns keep coming back across
> >wide ranges of metadata applications, even though the world isn't about
> >to agree on common vocabularies. So there are huge gains to be had from
> >a common data model and transfer syntax. -Tim
>
> That's a very strong motivation. But we have to balance that with another
> very strong motivation: making the documents easy to understand by the
> people who need to work with them. By designing your own doctype you can
> tailor the structure and the language to suit the target audience.
>
> RDF may be simple at heart, but is it reasonable to ask the average user to
> figure it out, to expect that the average user of metadata will even be
> able to grok the abstractions?
> I may be reiterating your earlier
> sentiment, but I worry that the abstractions are as much an impediment as
> the spec and the syntax.
>
> --
> Joe Lapp                    (Looking for some good people to
> Senior Engineer              help create XML technologies that
> http://www.webMethods.com    connect businesses to businesses
> jlapp@webMethods.com         over the web.)

From jes at kuantech.com Thu Dec 2 00:18:39 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:18:10 2004
Subject: Some questions
In-Reply-To: <4.2.0.58.19991201175459.00baea50@abnaki>
Message-ID: <000601bf3c5a$ab80c750$0f36a8c0@quokka.com>

It's funny: when you talk about RDF, it seems very complex, but when you use it, it seems very simple. I am using RDF as an interchange format for metadata about assets in a distributed publishing environment. I have, for example, a photo of a car racer. I need to know who the racer in the picture is, who took the picture, what format it was taken in (i.e., JPEG), what date it was taken on, and so forth. RDF works just gorgeously for this.
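
The photo example above can be written down as a handful of statements about one asset. The following is a minimal sketch in Python, not real RDF syntax: the asset URI, the property names (`depicts`, `photographer`, `format`, `dateTaken`) and all values are hypothetical placeholders, and the `describe` helper assumes each property has a single value per asset.

```python
from collections import defaultdict

# Hypothetical asset URI; no real publishing system is implied.
PHOTO = "http://example.org/assets/photo-0001.jpg"

# Each tuple is (subject, predicate, object). All values are placeholders.
TRIPLES = [
    (PHOTO, "depicts", "Some Racer"),
    (PHOTO, "photographer", "Some Photographer"),
    (PHOTO, "format", "JPEG"),
    (PHOTO, "dateTaken", "1999-11-30"),
]


def describe(triples):
    """Group statements by subject into one property/value record each.

    Assumes single-valued properties: a repeated predicate for the same
    subject overwrites the earlier value.
    """
    by_subject = defaultdict(dict)
    for subject, predicate, obj in triples:
        by_subject[subject][predicate] = obj
    return dict(by_subject)


records = describe(TRIPLES)
photo_record = records[PHOTO]
```

Grouping by subject gives the "object as a collection of properties/values" view discussed earlier in the thread, while the flat list of tuples stays easy for a machine to filter.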

As to whether it's suitable for all conceivable object interchange, I don't know and I'm not sure I care since I only try to use RDF for its intended purpose (metadata). I actually think the "meta" part is what makes the spec comprehensible, because I can always return to a specific purpose. I believe that RDF went too far with its syntax to try to make virtually every XML document valid RDF. If you want to define a metadata language, define one. If you want to define a general object interchange language, define that instead.

Jeff

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk
> [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Eve L. Maler
> Sent: Wednesday, December 01, 1999 2:59 PM
> To: 'xml-dev@ic.ac.uk'
> Subject: Re: Some questions
>
> At 05:58 AM 10/30/99 -0400, David Megginson wrote:
> >That's just the problem with the spec -- if you forget the word/prefix
> >"meta" completely, RDF is just an XML format for object exchange; it
> >just happens that one possible application of those objects is
> >metadata, and the RDF-Syntax spec mixes the two together.
>
> When people talk about RDF, the "meta" part is what I have trouble with in
> general. In what way is markup not metadata? In what way are element
> content and attribute values not also metadata (depending on what you do
> with them)? It feels weird for one particular data model to claim to have
> cornered the metadata market.
>
> Eve
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Thu Dec 2 00:21:12 1999 From: jes at kuantech.com (Jeffrey E. Sussna) Date: Mon Jun 7 17:18:10 2004 Subject: Some questions In-Reply-To: <4.2.0.58.19991201185138.009f5a90@abnaki> Message-ID: <000801bf3c5b$06a06a50$0f36a8c0@quokka.com> Well, you need both. You need the shared concept of "author" and the shared representation of an instance of that concept. XML specs of various kinds are trying to define shared representations at various semantic layers. Both vertical and horizontal vocabulary efforts (Dublin Core, BizTalk, etc.) are required to complete the equation. Jeff P.S. Please don't bash me for mentioning BizTalk. It was an arbitrary example. > -----Original Message----- > From: owner-xml-dev@ic.ac.uk > [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > Eve L. Maler > Sent: Wednesday, December 01, 1999 4:01 PM > To: xml-dev@ic.ac.uk > Subject: Re: Some questions > > > At 03:46 PM 12/1/99 -0800, Tim Bray wrote: > >Because the same data structures and usage patterns keep > coming back across > >wide ranges of metadata applications, even though the world > isn't about > >to agree on common vocabularies. So there are huge gains to > be had from > >a common data model and transfer syntax. -Tim > > Not that I don't respect RDF's power, but personally, I think > the key *is* > common vocabularies. We may have to start small, and they > may just be hub > formats that get mapped to/from a lot, but agreeing on > semantics is the > pill that has to be swallowed. 
Even RDF depends on this, > particularly on > an open system such as the Web where you can't really control > or influence > the habits of content creators. If you want to indicate that > you are the > author of a certain page, at the very least you have to refer > to a widely > understood "author" semantic in order for author-criterion > searching to be > of any use to your audience. Whether it's an RDF property or > a well-known > namespace or whatever doesn't seem to matter as much. > > Eve > -- > Eve Maler Sun Microsystems > elm @ east.sun.com +1 781 442 3190 > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Dec 2 00:28:12 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:10 2004 Subject: Some questions In-Reply-To: "Jeffrey E. Sussna"'s message of "Wed, 1 Dec 1999 14:36:33 -0800" References: <000301bf3c4c$85bde920$0f36a8c0@quokka.com> Message-ID: "Jeffrey E. Sussna" writes: > I wouldn't consider RDF at the same level as CORBA, but perhaps part > of an overall solution. Though I'm the one that brought CORBA into the discussion, I think that a better comparison would probably be XMI, since CORBA is a protocol rather than a format. 
RDF is simpler and less rigidly defined than XMI, and it is nicely extensible -- that makes it much more suitable, say, for distributing data in the decentralized, undisciplined environment of the Web, and much less suitable, say, for direct Java-to-C++ communication in a well-defined system with a single architecture. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Dec 2 00:35:58 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:10 2004 Subject: Some questions In-Reply-To: Joe Lapp's message of "Wed, 01 Dec 1999 18:22:52 -0500" References: <3.0.32.19991201182252.0283eec0@nexus.webmethods.com> Message-ID: Joe Lapp writes: > Hi Eve! Actually, I've brought a similar issue up with RDF group members > on a few occasions. I've asked for help understanding why I'd choose the > RDF syntax instead of inventing my own XML document type to represent the > desired metadata. The answer is the same as the answer to why you wouldn't invent your own markup language rather than XML -- that there is a network effect to using the same format as other people. 
In the case of RDF, it is a lot easier to do something like

  RDFCollection coll = new RDFCollection("http://www.foo.com/data.rdf");
  RDFResource res = coll.getResource("http://www.foo.com/ids/00001");
  System.out.println("The name is " +
                     res.getProperty("http://www.foo.com/ns#name"));

than it is to set up a SAX handler or walk through a DOM tree to try to get the information -- if RDF (or something like it) catches on, presumably we'll also get visual modelling tools, SQL-mapping tools, forms-generators, and lots of other nice COTS stuff that it's hard to write for XML in the general case.

There are two catches, however: a) a lot of people have to use the same standard; and b) there has to be a good software base. RDF hasn't fully met either criterion yet, though there's some improvement.

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Thu Dec 2 00:31:43 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:11 2004
Subject: Some questions
In-Reply-To: "Didier PH Martin"'s message of "Wed, 1 Dec 1999 17:35:52 -0500"
References:
Message-ID:

"Didier PH Martin" writes:

> Obviously the choice of word like "about", "description" lead to think of
> data about something instead of the data being _the_ something. This is why
> I use a structure like this:
>
>
> http://www.netfolder.com
> ... etc...
>
>
> What are the gains?
> a) the object is location independent.
Or, in programming terms, the ID is local rather than global, or in Web terms, it is relative rather than absolute (note that RDF allows ID as well). That's suitable for some applications, but entirely useless for others (it's often important to have single global identifiers for well-known people, places, and things). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Dec 2 00:38:37 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:11 2004 Subject: Some questions In-Reply-To: Joe Lapp's message of "Wed, 01 Dec 1999 18:53:20 -0500" References: <3.0.32.19991201185319.0191d680@nexus.webmethods.com> Message-ID: Joe Lapp writes: > RDF may be simple at heart, but is it reasonable to ask the average > user to figure it out, to expect that the average user of metadata > will even be able to grok the abstractions? I may be reiterating > your earlier sentiment, but I worry that the abstractions are as > much an impediment as the spec and the syntax. I don't think so -- the average user never even caught on to the HTML element. Personally, I'm much more interested in RDF for B2B data exchange than I am in convincing Jane User to stick RDF in her Web pages. Besides, B2B is where the money and excitement is right now. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Dec 2 00:52:27 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:11 2004 Subject: Some Questions In-Reply-To: "Jeffrey E. Sussna"'s message of "Wed, 1 Dec 1999 14:39:05 -0800" References: <000401bf3c4c$e033dea0$0f36a8c0@quokka.com> Message-ID: "Jeffrey E. Sussna" writes: > I think the important thing to remember about RDF is that it is not XML. It > is fundamentally an abstract model for expressing metadata. It happens to be > representable using XML, but it is different from XML. Unfortunately, this > distinction is part of what makes the spec hard to read, but it's important. That's a good point, and it's important to remember that it applies to almost *everything* that can be represented in XML. Even traditional document types like HTML or DocBook really have their own model (hence the HTML DOM). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Thu Dec 2 01:10:09 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:18:11 2004 Subject: Some questions In-Reply-To: <4.2.0.58.19991201185138.009f5a90@abnaki> Message-ID: On Wed, 1 Dec 1999, Eve L. 
Maler wrote:

> At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
> >Because the same data structures and usage patterns keep coming back across
> >wide ranges of metadata applications, even though the world isn't about
> >to agree on common vocabularies. So there are huge gains to be had from
> >a common data model and transfer syntax. -Tim
>
> Not that I don't respect RDF's power, but personally, I think the key *is*
> common vocabularies. We may have to start small, and they may just be hub
> formats that get mapped to/from a lot, but agreeing on semantics is the
> pill that has to be swallowed. Even RDF depends on this, particularly on
> an open system such as the Web where you can't really control or influence
> the habits of content creators. If you want to indicate that you are the
> author of a certain page, at the very least you have to refer to a widely
> understood "author" semantic in order for author-criterion searching to be
> of any use to your audience. Whether it's an RDF property or a well-known
> namespace or whatever doesn't seem to matter as much.

I don't disagree with any of this except the last claim; both matter IMHO.

What really matters above all is the use of unique identifiers (in Web context, URIs) both for the concepts/objects defined in a vocabulary and those named in our instance data. There is very little to RDF apart from this idea, ie. that simple stilted 3-part statements of the form:

  {peter, likes, mary}
  {peter, age, 7}
  {mary, livesIn, London}
  {peter, faveColor, red}

...are more useful when disambiguated with unique identifiers. Which 'peter', which 'London' and so forth.

We pay the price in verbosity, but when we move to URIs (eg. urn:xmeta:cities:canada:London or http://xmlns.com/cities/LondonUK) for these silly stilted sentences, there's another huge pay off: data aggregation.
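As a rough sketch of what aggregation by shared identifiers amounts to (not code from this thread): each statement is a 3-part record, and merging two sources is a set union in which nodes carrying the same identifier coincide. The class and method names (TripleMerge, triple, merge, about) and every example.org URI are hypothetical; only urn:xmeta:cities:canada:London comes from the message above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TripleMerge {
    /** Build one 3-part statement: subject, predicate, object. */
    public static List<String> triple(String subject, String predicate, String object) {
        return Arrays.asList(subject, predicate, object);
    }

    /** Aggregate two graphs as a set union: identical statements collapse,
     *  and statements that name the same identifier now sit in one graph. */
    public static Set<List<String>> merge(Set<List<String>> a, Set<List<String>> b) {
        Set<List<String>> out = new LinkedHashSet<>(a);
        out.addAll(b);
        return out;
    }

    /** Every statement that mentions the identifier as subject or object. */
    public static List<List<String>> about(Set<List<String>> graph, String id) {
        List<List<String>> hits = new ArrayList<>();
        for (List<String> t : graph) {
            if (t.get(0).equals(id) || t.get(2).equals(id)) {
                hits.add(t);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String london = "urn:xmeta:cities:canada:London";
        // Hypothetical URIs for the person, the country and the properties.
        Set<List<String>> first = new LinkedHashSet<>();
        first.add(triple("http://example.org/people#mary",
                         "http://example.org/ns#livesIn", london));
        Set<List<String>> second = new LinkedHashSet<>();
        second.add(triple(london, "http://example.org/ns#situatedIn",
                          "http://example.org/countries#Canada"));
        // Both statements surface, because they share the London identifier.
        for (List<String> t : about(merge(first, second), london)) {
            System.out.println(t);
        }
    }
}
```

The union only does useful work because both sources spell the node the same way; with bare strings like "London" the same code would happily fuse unrelated cities.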
Since the RDF information model is just stilted 3-part sentences mostly built from URIs, we can aggregate two RDF data graphs by joining nodes that share common identifiers.

If one piece of data tells us: (I'm switching to an ascii-art labelled graph representation here)

  [mary] --livesIn--> [London]
  [mary] --age--> "9"
  [peter] --livesNextDoorTo--> [mary]

and something else (say the CIA world fact book or X500) informs us that...

  [London] --numCommunists--> "10,000"
  [London] --situatedIn--> [Canada]

we can simply[*] join these two graphs on the common node London (or, rather, the unambiguous version ie [urn:xmeta:cities:canada:London]).

Whether this is 'data' or 'metadata' is of no interest to me whatsoever. Using URIs for Web data is just downright handy. We can take heaps of silly 3-part sentences from anywhere (that we trust...) on the Web, pour them into a common database and get something mostly intelligible.

Here's a bald claim:

  Aggregating unanticipated RDF data graphs into a useful common data structure is a feasible task; doing the same with unanticipated non-RDF XML data is, in the general case, much harder.

Maybe I'm wrong; perhaps someone has an algorithm for general purpose DOM-merging or SAX-stream aggregation that doesn't mangle data. If anyone has seen such a thing please post the URL...

(BTW I'm making loose use of undefined terms here. By 'unanticipated' I'm talking about a processor encountering instance data in a previously unseen vocabulary. By 'aggregation' I mean joining together relevant facts (or would-be facts) scattered across various XML documents and document-parts such that applications can make use of the pooled information.)

Let me emphasise that I'm not focussing on the use of RDF syntax here; that doesn't matter. The key thing IMHO to support Web data aggregation is for interchanged data to have a common URI-based graph interpretation.
We can do that with XSL or (hopefully) using annotations in XML Schemata or annotations on good old fashioned DTDs.

RDF is URIs URIs URIs and not a lot else. I'm willing to be persuaded that the syntax needs more thought, but the value of using unique identifiers in Web data interchange seems pretty uncontroversial...

Dan

[*] I'm glossing over some issues here (eg. relating to knowledge of cardinality/occurrence constraints to aid data aggregation apps); aggregation in RDF is still hard to do right, but is vastly easier than for arbitrary XML content.

-- 
daniel.brickley@bristol.ac.uk

From jwtodd at pacbell.net Thu Dec 2 03:42:09 1999
From: jwtodd at pacbell.net (James Todd)
Date: Mon Jun 7 17:18:11 2004
Subject: q: i'd like to merge two docs ...
Message-ID: <3845EAFA.341C3122@pacbell.net>

hi -

i could use a pointer or two, a recipe if you will, on how best to "modify and merge" two xml docs. the scenario:

  an inbound xml "fragment", a complete xml doc in its own right, is amended (eg. one new attribute is added)

  the result of which is appended, as a child node, to a "hosting" xml tree

i've got most of this working using the ProjectX [? Mr. Brownell ?] parser yet it fails during the appendChild() stating that the child node

  "That node doesn't belong in this document"

due to the fact, i believe, that it has a distinct OwnerDocument.
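The failure described above matches what the DOM specification calls WRONG_DOCUMENT_ERR: appendChild() rejects a node that was created by a different document. A minimal sketch of one way around it, assuming a parser that supports importNode() from DOM Level 2 (which may not have been available in the ProjectX parser at the time), is shown here using the JAXP classes bundled with current JDKs. The class name MergeDocs, the helper names, and the "status" attribute are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class MergeDocs {
    /** Parse an XML string into a DOM Document, wrapping checked exceptions. */
    public static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Amend the fragment's root element, then append a copy under the host's root. */
    public static Document merge(Document host, Document fragment) {
        Element fragRoot = fragment.getDocumentElement();
        fragRoot.setAttribute("status", "amended"); // hypothetical new attribute

        // importNode() returns a deep copy owned by `host`;
        // the fragment document itself is left untouched.
        Node copy = host.importNode(fragRoot, true);
        host.getDocumentElement().appendChild(copy);
        return host;
    }

    public static void main(String[] args) {
        Document host = parse("<host><item id='1'/></host>");
        Document fragment = parse("<item id='2'/>");
        merge(host, fragment);
        // Number of children under <host> after the merge.
        System.out.println(host.getDocumentElement().getChildNodes().getLength());
    }
}
```

importNode() copies rather than moves, so the inbound fragment stays intact; that costs memory for large fragments but avoids walking the tree by hand.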
my methodology to date is to create doms for both the inbound "fragment" and the destination xml docs, after which i'd like to modify the fragment (hence going the dom route) and finally add the results to the destination doc via appendChild().

i had hoped to bypass walking the tree in order to create a "document ownerless" copy to work with. is there a better/preferred means by which to accomplish this task?

any/all comments and suggestions welcomed.

thx much,

- james

From orchard at pacificspirit.com Thu Dec 2 04:02:28 1999
From: orchard at pacificspirit.com (David Orchard)
Date: Mon Jun 7 17:18:11 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201154526.014c3710@pop.intergate.ca>
Message-ID: <000401bf3c7a$064c3ec0$e930e620@n54wntw.vancouver.can.ibm.com>

As well, I fall into the context is king category, not content.

The best metric we have for that is company market caps and revenues. TV Guide makes more than CBS, NBC, ABC and Fox put together.

Yahoo wins because the context is human created rather than generated. Human context or Point of View is always more usable to humans than machine POV.

If you argue that Point of View and Context are actually content not metadata, then there's no such thing as metadata. It's all just data. Which is what I actually believe.
Cheers, Dave Orchard > -----Original Message----- > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > Tim Bray > Sent: Wednesday, December 01, 1999 3:47 PM > To: Walter Underwood; 'xml-dev@ic.ac.uk' > Subject: RE: Some questions > > > At 02:02 PM 12/1/99 -0800, Walter Underwood wrote: > >I disagree on this one. It's rare that metacontent is more > >valuable than content, long-term. I'll bet on the books over > >the card catalog, every time. > > Wow, that's a profoundly deep and strong statement, and I think at the > core of the argument that *should* be happening about how to make the > Web a better place. In fairness, it should be said that Walter works for > a company whose search engine does the equivalent of reading all the > pages of all the books on all the shelves, and trying to guess what the > books mean. I used to be in that business myself. > > But I think metadata wins. If you count hits on Internet search engines, > the Yahoo and ODP directories, which are both human-constructed metadata, > absolutely wipe out any fulltext search engine you can name, even though > they have orders of magnitude less sites and a lower volume of information > about each. Because human-constructed metadata wins on the net just like > it did in the library. > > RDF is important because it can facilitate the interchange of, and > a certain number of the common uses of, this kind of metadata. > > Anyhow, just because you have a card catalogue doesn't mean you throw > the library books away. -Tim > > xml-dev: A list for W3C XML Developers. 
From smuench at us.oracle.com Thu Dec 2 04:44:15 1999
From: smuench at us.oracle.com (Steve Muench)
Date: Mon Jun 7 17:18:11 2004
Subject: i'd like to merge two docs ...
References: <3845EAFA.341C3122@pacbell.net>
Message-ID: <005001bf3c6e$ef8d57b0$5a672382@us.oracle.com>

Assuming you have XML DOM Documents "one" and "two" and that "oneElement" is the element in doc "one" to which you'd like to append the entire content of "two"...

You should be able to do:

  Element twoDocElt = two.getDocumentElement();
  two.removeChild(twoDocElt);
  oneElement.appendChild(twoDocElt);

_________________________________________________________
Steve Muench, Consulting Product Manager & XML Evangelist
Business Components for Java Development Team
http://technet.oracle.com/tech/java
http://technet.oracle.com/tech/xml

----- Original Message -----
From: James Todd
To:
Sent: Wednesday, December 01, 1999 9:43 PM
Subject: q: i'd like to merge two docs ...

|
| hi -
|
| i could use a pointer or two, a recipe if you will, on how best to
| "modify and merge" two xml docs. the scenario:
|
| an inbound xml "fragment", a complete xml doc in it's own
| right, is amended (eg.
one new attribute is added) | | the results of which is appended, as a child node, to a | "hosting" xml tree | | i've got most of this working using the ProjectX [? Mr. Brownell ?] | parser yet it fails during the appendChild() stating that the child | node | | "That node doesn't belong in this document" | | due to the fact, i believe, that it has a distinct OwnerDocument. | | my methodology to date is to create dom's for both the inbound | "fragment" and the destination xml docs afterwhich i'd like to | modify | the fragment (hence going the dom route) and finally add the results | | to the destination doc via appendChild(). | | i had hoped to bypass walking the tree in order to create an | "document ownerless" copy with which to work with. is there | a better/preferred means by which to accomplish this task? | | any/all comments and suggestions welcomed. | | thx much, | | - james | | | xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk | Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 | To unsubscribe, mailto:majordomo@ic.ac.uk the following message; | unsubscribe xml-dev | To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; | subscribe xml-dev-digest | List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) | | xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jwtodd at pacbell.net Thu Dec 2 06:46:19 1999 From: jwtodd at pacbell.net (James Todd) Date: Mon Jun 7 17:18:11 2004 Subject: i'd like to merge two docs ... 
References: <3845EAFA.341C3122@pacbell.net> <005001bf3c6e$ef8d57b0$5a672382@us.oracle.com> Message-ID: <38461554.61917EBD@pacbell.net> hmmm ... this is pretty much what i did with the exception of the "removeChild()" step. my interpretation of this is that the removeChild step will disassociate/null the OwnerDocument so that it is effectively orphaned and can be added into the new hosting doc. i'll give it a whirl. thx much, - james Steve Muench wrote: > Assuming you have XML DOM Documents "one" and "two" > and that "oneElement" is the element in doc "one" > to which you'd like to append the entire content > of "two"... > > You should be able to do: > > Element twoDocElt = two.getDocumentElement(); > two.removeChild(twoDocElt); > oneElement.appendChild(twoDocElt); > > _________________________________________________________ > Steve Muench, Consulting Product Manager & XML Evangelist > Business Components for Java Development Team > http://technet.oracle.com/tech/java > http://technet.oracle.com/tech/xml > ----- Original Message ----- > From: James Todd > To: > Sent: Wednesday, December 01, 1999 9:43 PM > Subject: q: i'd like to merge two docs ... > > | > | hi - > | > | i could use a pointer or two, a recipe if you will, on how best to > | "modify and merge" two xml docs. the scenario: > | > | an inbound xml "fragment", a complete xml doc in it's own > | right, is amended (eg. one new attribute is added) > | > | the results of which is appended, as a child node, to a > | "hosting" xml tree > | > | i've got most of this working using the ProjectX [? Mr. Brownell ?] > | parser yet it fails during the appendChild() stating that the child > | node > | > | "That node doesn't belong in this document" > | > | due to the fact, i believe, that it has a distinct OwnerDocument. 
> | > | my methodology to date is to create dom's for both the inbound > | "fragment" and the destination xml docs afterwhich i'd like to > | modify > | the fragment (hence going the dom route) and finally add the results > | > | to the destination doc via appendChild(). > | > | i had hoped to bypass walking the tree in order to create an > | "document ownerless" copy with which to work with. is there > | a better/preferred means by which to accomplish this task? > | > | any/all comments and suggestions welcomed. > | > | thx much, > | > | - james > | > | > | xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > | Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > | To unsubscribe, mailto:majordomo@ic.ac.uk the following message; > | unsubscribe xml-dev > | To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > | subscribe xml-dev-digest > | List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > | > | > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To unsubscribe, mailto:majordomo@ic.ac.uk the following message; > unsubscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mikew at o3.co.uk Thu Dec 2 08:08:51 1999 From: mikew at o3.co.uk (Mike Williams) Date: Mon Jun 7 17:18:11 2004 Subject: Storing SAX Locator information in DOM tree Message-ID: I'm pre-parsing some XML-based web-page templates, and building them into DOM Documents. My template-processor takes Document+data as input, and generates SAX events. My problem is this: if I detect an error while processing the template (expected tags are missing, etc.), I have no way of relating this to a position in the original template-file. This would obviously be useful for my template-authors, so they don't have to re-check entire templates. I'd really like to store information against each node in the Document, recording what file it was built from, and where (line/column) the node started; the SAX Locator information, basically. Reasonable? Is there any way to implement this? -- Mike Williams xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthew at praxis.cz Thu Dec 2 09:06:12 1999 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:18:11 2004 Subject: RDF vs. 
standard vocabularies (Was Re: Some questions) References: <3.0.32.19991201154632.0152f350@pop.intergate.ca> Message-ID: <384635F2.7E975E13@praxis.cz> Tim Bray wrote: > Because the same data structures and usage patterns keep coming back across > wide ranges of metadata applications, even though the world isn't about > to agree on common vocabularies. So there are huge gains to be had from > a common data model and transfer syntax. -Tim But aren't common vocabularies needed for RDF as well? Matthew xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthew at praxis.cz Thu Dec 2 09:26:30 1999 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:18:11 2004 Subject: Object-oriented serialization (Was Re: Some questions) References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <38459FA4.DAA00E35@praxis.cz> Message-ID: <38463AB4.36C5292B@praxis.cz> David Megginson wrote: > Perhaps I'm a little confused, but I cannot see how the fact that a > schema language itself happens to be object oriented allows you to do > object exchange in XML (it doesn't hurt, but how can it help?). > > I've been confused before, so there's no need for panic. I'm not entirely sure whether or not I am confused myself. Let me give this a crack, and I'm sure someone will be happy to tell me why I am wrong. :-) Let's say I have an arbitrary object structure that I want to serialize and send down the pipe. 
Serializing a bunch of object attributes in XML is a no-brainer, and representing arbitrary references between objects is also fairly trivial if something like XLink is used (and we need XLink, there's surely no controversy about this). The aspects of object-oriented design that are missing are then inheritance and polymorphism. This is why an object-oriented schema language is needed: to do this properly I should be mapping each of my object classes to a specific element type, and I need to be able to say that a given element type extends a base type if this type of relationship is present in my original object schema. Rich data types are also needed although this doesn't have to do with object-orientation per se. Polymorphism is about behavior and should be implemented in schema-aware tools. I honestly feel that XML provides all the tools to do what RDF is trying to do, without an additional syntactic layer. What is missing from the picture is a mechanism for modelling object structures according to object-oriented principles, and this is why an OO schema language is necessary. The only other thing the RDF brings to the game is that it turns relationships into first-class objects that can be referenced as well, but I don't know any OO language that enables this without modelling it specifically (i.e. creating an object to represent a reference), and this can be done in an analogous way in XML as well. If I may, let me turn your question on its head: what about an XML Schema approach doesn't let you do object exchange in a satisfactory manner? Matthew xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Dec 2 11:31:25 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:11 2004 Subject: Some questions In-Reply-To: "David Orchard"'s message of "Wed, 1 Dec 1999 20:02:17 -0800" References: <000401bf3c7a$064c3ec0$e930e620@n54wntw.vancouver.can.ibm.com> Message-ID: "David Orchard" writes: > As well, I fall into the context is king category, not content. > > The best metric we have for that is company market caps and > revenues. TV Guide makes more than CBS, NBC, ABC and Fox put > together. > > Yahoo wins because the context is human created rather than > generated. Human context or Point of View is always more usable to > humans than machine POV. > > If you argue that Point of View and Context are actually content not > metadata, then there's no such thing as metadata. It's all just > data. Which is what I actually believe. Yup, me too -- I'm not a big fan of the "meta" word. After all, many companies have information about me in their databases, but they don't have me myself in them -- does that mean that all of their data are "metadata" as well? If so, then what are just plain data? BTW, RDF itself is being very heavily and successfully used in the Linux world right now -- it's the basis of the database for rpmfind, a utility that allows users to find new packages or upgrade existing ones, including dependencies. Of course, end users never have to see the RDF (they can look if they want to), but that's the way it should be. All the best, David p.s. 
The RDF used by rpmfind is not strictly conformant, since it uses a pre-REC version of the RDF Namespace URI, but otherwise it's fully correct. -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Dec 2 11:44:29 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:12 2004 Subject: Object-oriented serialization (Was Re: Some questions) In-Reply-To: Matthew Gertner's message of "Thu, 02 Dec 1999 10:24:04 +0100" References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <38459FA4.DAA00E35@praxis.cz> <38463AB4.36C5292B@praxis.cz> Message-ID: Matthew Gertner writes: > I honestly feel that XML provides all the tools to do what RDF is trying > to do, without an additional syntactic layer. What is missing from the > picture is a mechanism for modelling object structures according to > object-oriented principles, and this is why an OO schema language is > necessary. If you have a function loadXML(), you get a DOM tree or a bunch of SAX events or something similar; if you have a function loadRDF(), you get a collection of objects with attributes and relationships. In either case, a schema can tell you things like "element type/class B is a kind of element type/class A", but that's secondary information; the primary information is "element X is an object of class Y with identifier Z, while element A represents a relationship between this object and object C". If you're interested in a collection of objects in the first place, why should you have to see or know about XML elements and attributes at all? 
Or to put it a different way, why should people constantly have to redo the work of extracting objects from XML, when they're all trying to do the same thing? I think that reasonable people can argue that RDF is not the best solution to the problem of object exchange in XML, but I am somewhat surprised to hear people deny that the problem even exists: there is an enormous demand for exchanging objects in XML (businesses exchange a lot of structured data), and it's hard work to have to figure out over and over how to construct objects from a SAX stream or a DOM tree especially when programmers with XML knowledge are scarce and expensive. I have no doubt that we need an abstract object layer on top of XML. Right now, RDF is the best solution currently available (XMI also has its advocates), but I'm ready to listen about anything better. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From steven.livingstone at scotent.co.uk Thu Dec 2 11:54:43 1999 From: steven.livingstone at scotent.co.uk (Steven Livingstone, ITS, SENM) Date: Mon Jun 7 17:18:12 2004 Subject: Schema Question Message-ID: <8DCB90532FF7D211B34400805FD48853B363F5@SENMAIL3> Hi all - I've got a question on X-Schema for anyone who may be of help. I am creating an XML Schema for an XML document which is to be dynamically generated. I am ok with most of it, but there is one particular part where I may have any number of elements, but with the same property. So I may have red green .. .. .. The element properties could be called anything, but have the same type of value.
Is there a way to specify a variable for the element name, but set its type, say, to string so that any element created under could be called anything but follow predetermined validation? Beyond that, I will stick to the normal technique with validation which isn't really a problem, but the other would/could be useful!? Cheers Steven Steven Livingstone - http://www.citix.com 07771 957 280 or +447771957280 Professional Site Server 3, Wrox Press http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696 Professional Site Server 3.0 Commerce Edition, Wrox Press http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505 From matthew at praxis.cz Thu Dec 2 12:24:41 1999 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:18:12 2004 Subject: Object-oriented serialization (Was Re: Some questions) References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <38459FA4.DAA00E35@praxis.cz> <38463AB4.36C5292B@praxis.cz> Message-ID: <38466487.328D1CFA@praxis.cz> David Megginson wrote: > If you have a function loadXML(), you get a DOM tree or a bunch of SAX > events or something similar; if you have a function loadRDF(), you get > a collection of objects with attributes and relationships. In either > case, a schema can tell you things like "element type/class B is a > kind of element type/class A", but that's secondary information; the > primary information is "element X is an object of class Y with > identifier Z, while element A represents a relationship between this > object and object C". A schema gives you this information too.
The problem of how to attach a schema to an instance is not yet resolved, but it is a purely syntactic consideration and a satisfactory solution will be found. This then tells you what class a given instance belongs to. The identity can be specified using an ID attribute; this is exactly the way it is done in RDF. That an element represents a relationship is implicit in the content model of the element. SAX is great as far as it goes, but we seem to be agreeing that an additional layer is needed on top. This layer is not the DOM. One of the lessons that I learned from my time at POET Software is that, although we had an excellent generic API, the vast majority of our customers wanted to work with real C++ (and later Java) classes in their problem domain. But there is nothing to say that a loadXML() function must return a DOM tree. There are a variety of efforts to create domain-specific objects automatically from XML objects. I don't have a list at the tips of my fingers, but if anyone does it would be a great resource. They are out there because I keep bumping into them. > If you're interested in a collection of objects in the first place, > why should you have to see or know about XML elements and attributes > at all? Or to put it a different way, why should people constantly > have to redo the work of extracting objects from XML, when they're all > trying to do the same thing? Once again, there are already tools that provide this functionality across applications (i.e. they can be plugged in and used without additional development). The interest of XML is essentially as a way to serialize objects and send them across a network, as you also stated.
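[Editor's sketch] To make the point about a loadXML() that returns domain objects rather than a DOM tree concrete, here is a minimal Python illustration. It is only a sketch under stated assumptions: the `Person` class, the `load_people` function, and the element names (`people`, `person`, `name`, `email`) are hypothetical and are not taken from any tool mentioned in this thread. It relies only on the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical domain class; not from any tool named in this thread."""
    id: str
    name: str
    email: str


def load_people(xml_text):
    """Parse XML text and return Person objects rather than a generic tree."""
    root = ET.fromstring(xml_text)
    people = []
    # Each <person> element is mapped to one Person instance.
    for elem in root.findall("person"):
        people.append(
            Person(
                id=elem.get("id", ""),
                name=elem.findtext("name", default=""),
                email=elem.findtext("email", default=""),
            )
        )
    return people


# Hypothetical sample document, invented for this illustration.
SAMPLE = """<people>
  <person id="p1">
    <name>Ada</name>
    <email>ada@example.org</email>
  </person>
</people>"""

people = load_people(SAMPLE)
```

The caller works with `Person` instances and never touches elements or attributes; the mapping is hand-written here, whereas the tools alluded to above would presumably derive it from a schema.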
> I think that reasonable people can argue that RDF is not the best > solution to the problem of object exchange in XML, but I am somewhat > surprised to hear people deny that the problem even exists: there is > an enormous demand for exchanging objects in XML (businesses exchange > a lot of structured data), and it's hard work to have to figure out > over and over how to construct objects from a SAX stream or a DOM tree > especially when programmers with XML knowledge are scarce and > expensive. > > I have no doubt that we need an abstract object layer on top of XML. > Right now, RDF is the best solution currently available (XMI also has > its advocates), but I'm ready to listen about anything better. In no way do I doubt the importance of being able to exchange objects in XML, but I do have serious reservations about RDF as the way to do this, and they have nothing to do with the hairy syntax or hard-to-understand spec. What is lacking right now is an overarching approach to using XML in real-world applications. To be quite blunt it seems a shame that a lot of really great work is being put into the RDF effort (including a very valuable vocabulary for collection classes, just to name one) instead of being integrated more tightly into the overall XML architecture. This is especially so because there isn't an overall XML architecture yet, and the effort and thought that are being put into RDF could bring us a long way towards this. I don't agree with the conclusions of the Cambridge Communique. I think that if the work being done on RDF were refocused to making sure that XML Schemas do everything that the RDF advocates are rightly claiming is necessary, that we will see a clear win in terms of pushing the whole XML effort from a theoretical effort into a major paradigm shift with extensive real-world implications. As things stand, this work is being diluted because we are asking people to read about, grasp and implement two things instead of just one.
Cheers, Matthew From martind at netfolder.com Thu Dec 2 12:40:02 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:18:12 2004 Subject: Some questions In-Reply-To: Message-ID: Hi David, David said: Or, in programming terms, the ID is local rather than global, or in Web terms, it is relative rather than absolute (note that RDF allows ID as well). That's suitable for some applications, but entirely useless for others (it's often important to have single global identifiers for well-known people, places, and things). Didier reply: But most of the RDF users set the description element "about" attribute's value with a URL. In fact, this is OK because the spec indicates that you are providing a description _about_ something and the about value may be its location. I discovered that using this form is, most of the time, bogus. Instead, I do what librarians discovered. Have the classification card (i.e. description element) be independent of any properties. So, instead of using a location in the description element, I use an ID. This is mainly because the object's location _is_ a property. So, the object's location is indicated by a "location" property. If there is no location I do not include a "location" property. See, this is very different. The object, this time, is a collection of properties. The description itself is uniquely identified in a description collection by an ID (so that, if this is needed, we can relate a description to another).
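[Editor's sketch] The approach described above, where a description is identified by a local ID and the location is just one optional property, can be illustrated with a few lines of Python. Every name here (`Description`, `catalog`, `add_description`, `location_of`, the `related` property, and the example URL) is hypothetical and invented for illustration; this is not RDF syntax and not code from the original post.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    """A 'classification card': identified by a local ID, not by a location."""
    id: str
    properties: dict = field(default_factory=dict)


# Hypothetical description collection, keyed by the local ID.
catalog = {}


def add_description(desc):
    """Store a description under its ID."""
    catalog[desc.id] = desc


def location_of(desc_id):
    """Return the 'location' property, or None when the object has none."""
    return catalog[desc_id].properties.get("location")


# One object has a location; the other has none and so omits the property.
add_description(Description("d1", {"title": "Some report",
                                   "location": "http://example.org/report"}))
add_description(Description("d2", {"title": "Unpublished draft",
                                   "related": "d1"}))
```

Because the ID only has meaning inside `catalog`, relating `d2` to `d1` works within this one collection; a second collection could reuse the same IDs for unrelated objects.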
I do not use a URL as the value of the "about" attribute; in fact I tend not to use "about" at all, but instead use the "id" attribute and include the location as a property in the description. So, now, the real challenge for data interchange is to agree on a particular schema or property set. Otherwise we only exchange data with our own tools :-) Cheers Didier PH Martin ---------------------------------------------- Email: martind@netfolder.com Conferences: Web Boston (http://www.mfweb.com) Markup 99 (http://www.gca.com) Book to come soon: XML Pro published by Wrox Press Products http://www.netfolder.com From martind at netfolder.com Thu Dec 2 12:50:12 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:18:12 2004 Subject: Some questions In-Reply-To: <000801bf3c5b$06a06a50$0f36a8c0@quokka.com> Message-ID: Hi Jeff, Jeff said: Well, you need both. You need the shared concept of "author" and the shared representation of an instance of that concept. XML specs of various kinds are trying to define shared representations at various semantic layers. Both vertical and horizontal vocabulary efforts (Dublin Core, BizTalk, etc.) are required to complete the equation. Jeff P.S. Please don't bash me for mentioning BizTalk. It was an arbitrary example. Didier reply: I won't bash you but will point out that the biztalk framework is more an envelope used to transport your document. In that sense, your document is transformed into a biztalk document's fragment. If we look closely enough, a biztalk framework is a collection of meta data about an XML document.
Meta data like: a) from whom/what is this document coming? b) to whom/what is this document sent? c) what is the purpose of this document? d) of which process is this document a part? So, a biztalk document is a set of meta data properties and of course includes your document, now transformed into a biztalk document's fragment. A biztalk document has about the same structure as an HTML document: a header part or meta data part, and a body part, which is where you insert your document. This is roughly equivalent to an HTML document structure: ... your headers here including meta data ...the HTML document body Something interesting to note here: the meta data are mainly "routing" meta data as you would find in workflow engines. Is a biztalk server a workflow engine? Does Microsoft now want to enter the workflow business? I'll let you draw your own conclusions. PS: my outlook spell checker still wants to replace the "biztalk" word by the "bestial" word :-))) Cheers Didier PH Martin ---------------------------------------------- Email: martind@netfolder.com Conferences: Web Boston (http://www.mfweb.com) Markup 99 (http://www.gca.com) Book to come soon: XML Pro published by Wrox Press Products http://www.netfolder.com From ricko at allette.com.au Thu Dec 2 13:55:20 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:18:12 2004 Subject: XML processing instruction survey Message-ID: <013401bf3cd0$660d6040$3ff96d8c@NT.JELLIFFE.COM.AU> From: Jeffrey E.
Sussna >I'm interested in the extent to which people are actually using the XML >processing instruction ( they find it useful. You probably should post this question (in Japanese) to a Japanese XML mailgroup, or (in Chinese) to a Chinese XML mail group (such as the one running from University of Milan), and so on. Asking an English-language list will only give you a survey of how many people work outside their only language: I am interested in this, but a lack of response would not provide evidence of anything much. Also, when dealing with CJK societies, with their strong Buddhist aversion to self-promotion coupled with a strong Confucian deference to authority (let alone the strong reluctance to embarrass themselves or others, or put themselves into conflict), you might be hard-pressed to get much response even there. For me, I use it every day. See http://www.ascc.net/xml for a bilingual website in UTF-8, Big5, and GB2312. Logfiles reveal that Chinese is accessed primarily through Big5 or GB. Rick Jelliffe From david at megginson.com Thu Dec 2 14:27:38 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:12 2004 Subject: Some questions In-Reply-To: References: Message-ID: <14406.33170.34794.500249@localhost.localdomain> Didier PH Martin writes: > Didier reply: > But most of the RDF users set the description element "about" > attribute's value with a URL. In fact, this is OK because the spec > indicates that you are providing a description _about_ something > and the about value may be its location.
The advantage is that the URL is an absolute identifier (whether it actually points to anything or not). For example, imagine that Amazon.com uses id p0809764 to refer to the person David Bowie, while Reuters uses the id p0809764 to refer to the person Bill Clinton. If I get some RDF 80% how do I know who I'm talking about? On the other hand, if I have 61% 80% then there's no room for confusion. Certainly, local IDs have their uses, but we're building a new environment where information has to be useful across systems, and to accomplish that, we need to use some kind of global identifiers, such as URLs or URNs (once the latter are ready for Prime Time); local IDs are of little value outside of closed systems. > I discovered that using this form, is, most of the time > bogus. Instead, I do what librarian discovered. Have the > classification card (i.e. description element) to be independent of > any properties. Yes, but with the Web, a better analogy would be that you're in Robarts Library in Toronto and have a card from the Bodleian in Oxford that says to get the third book on the fifth shelf in the eighteenth row. It would have been better to have given you the ISBN so that you could find it in any library. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Dec 2 14:40:46 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:12 2004 Subject: Object-oriented serialization (Was Re: Some questions) In-Reply-To: Matthew Gertner's message of "Thu, 02 Dec 1999 13:22:31 +0100" References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <38459FA4.DAA00E35@praxis.cz> <38463AB4.36C5292B@praxis.cz> <38466487.328D1CFA@praxis.cz> Message-ID: Matthew Gertner writes: > A schema gives you this information too. The problem of how to attach a > schema to an instance is not yet resolved, but it is a purely syntactic > consideration and a satisfactory solution will be found. This then tells > you what class a given instance belongs too. The identity can be > specified using an ID attribute; this is exactly the way it is done in > RDF. That an element represents a relationship is implicit in the > content model of the element. I still don't follow. Perhaps I need to reread the XML Schema spec, but given David Megginson How does the schema tell me that foo represents a container for a collection of objects, bar represents an object, and hack and flurb represent the object's properties? > SAX is great as far as it goes, but we seem to be agreeing that an > additional layer is needed on top. This layer is not the DOM. It can be. The DOM represents a domain-specific object layer that is useful for a wide subset of XML operations (especially document- and browser-oriented work). 
There need to be many layers on top of XML, one for each domain -- it happens that many of those layers will share the need to encode objects, so a standard object layer sandwiched between XML and the domain-specific layers can save a lot of work. > There are a variety of efforts to create > domain-specific objects automatically from XML objects. I don't have a > list at the tips of my fingers, but if anyone does it would be a great > resource. They are out there because I keep bumping into them. One example is RDF. > To be quite blunt it seems a shame that a lot of really great work is > being put into the RDF effort (including a very valuable vocabulary > for collection classes, just to name one) instead of being > integrated more tightly into the overall XML architecture. I disagree strongly with the last part of that statement. I'd argue the opposite -- higher-level layers should be as independent of XML as possible. That's the only way to build good, layered architectures. XML does one thing (represent a tree structure in a character stream) very well: it's an excellent layer to build other layers on top of, but XML itself should stay as simple as possible so that it's applicable widely to many different fields. > I think that if the work being done on RDF were refocused to making > sure that XML Schemas do everything that the RDF advocates are > rightly claiming is necessary, that we will see a clear win in terms > of pushing the whole XML effort from a theoretical effort into a > major paradigm shift with extensive real-world implications. That would be another serious mistake. Object exchange, while important, represents only one of many layers that can be built on top of XML, and if XML Schemas start trying to solve high-level problems for every specific domain, it will become an unimplementable mess. RDF already made a similar mistake by mixing together a spec for object encoding in XML with a spec for representing knowledge about Web pages.
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From LWatanab at JetForm.com Thu Dec 2 14:51:31 1999 From: LWatanab at JetForm.com (Larry Watanabe) Date: Mon Jun 7 17:18:12 2004 Subject: Some questions Message-ID: <111CF63B7D2ED211830000805F65A2FF01804962@OTTMAIL2> Eve Maler wrote > Not that I don't respect RDF's power, but personally, I think the key *is* > common vocabularies. We may have to start small, and they may just be hub > formats that get mapped to/from a lot, but agreeing on semantics is the > pill that has to be swallowed. Even RDF depends on this, particularly on > an open system such as the Web where you can't really control or influence > the habits of content creators. If you want to indicate that you are the > author of a certain page, at the very least you have to refer to a widely > understood "author" semantic in order for author-criterion searching to be > of any use to your audience. Whether it's an RDF property or a well-known > namespace or whatever doesn't seem to matter as much. I agree; if someone chooses to define "author" to be what someone else uses for "garage mechanic" then there is no advantage to common syntax. Even assuming we rely on common English usages, there are multiple representations and arbitrary decisions in mapping English to logic (which RDF is a disguised form of). For example, suppose we want to represent "John loves Mary". 
We could represent this as the triple {John, loves, Mary} or it could be represented as {person001, loves, person002} {person001, name, John} {person002, name, Mary}; both correctly represent the statement in RDF triples. It would be advantageous to have a common repository of vocabularies, so that people would agree on meanings and syntax (i.e. do we use love or Loves or LUV, and is the first person the lover or the lovee, etc.). This would serve a similar function to a namespace declaration, but would deal with the semantics of the expressions. From ht at cogsci.ed.ac.uk Thu Dec 2 15:01:54 1999 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 17:18:12 2004 Subject: CFP: W3C XML Activity chat before XML '99 Message-ID: You may have seen: "Upcoming Events: XML '99, 5-9 Dec '99 in Philadelphia (a GCA Conference) meet Judy Brewer, Bert Bos, Dan Connolly, Dave Raggett, Joseph Reagle, Chris Lilley, Michael Sperberg-McQueen from the W3C Team " -- http://www.w3.org/XML/ and, meanwhile, a lot of discussion in xml-dev and elsewhere about what W3C is doing with XML (and HTML and ...) and how and why it does all this stuff. I propose we get together over an IRC channel and chat: Who: everybody's welcome In addition to myself, Bert Bos, Ian Jacobs, Henry Thompson, and Daniel Veillard from the W3C Team plan to be there. When: Friday, 3 December at 1500Z (9am U.S. Central time) for about an hour. (Apologies to the parts of the world where that's inconvenient. The log will go online, and hopefully we'll have more chats at different times of day in the future.)
Where: irc://irc.openprojects.net/#w3c i.e. channel #w3c on irc.openprojects.net about this IRC network, see Open Projects Network - New User? http://openprojects.nu/about.html stay tuned to the XML home page http://www.w3.org/XML/ for other details. What: The W3C XML Activity: Who, What, How, and Why Recommended reading: W3C Extensible Markup Language (XML) Activity http://www.w3.org/XML/Activity HTML Working Group Roadmap 18 November 1999, Shane McCarron, Dave Raggett http://www.w3.org/TR/xhtml-roadmap Schemas coming of age: use them Tim Berners-Lee (timbl@w3.org) Tue, 9 Nov 1999 15:31:59 -0500 http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Nov-1999/0249.html Web Architecture from 50,000 feet http://www.w3.org/DesignIssues/Architecture Web Architecture: Describing and Exchanging Data W3C Note 7 June 1999 http://www.w3.org/1999/04/WebData Web Architecture: Extensible Languages 10 February 1998, Tim Berners-Lee, Dan Connolly http://www.w3.org/TR/NOTE-webarch-extlang xml-dev archives http://www.lists.ic.ac.uk/hypermail/xml-dev/ and news:comp.text.xml ht, on behalf of Dan Connolly, W3C http://www.w3.org/People/Connolly/ -- Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440 Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk URL: http://www.ltg.ed.ac.uk/~ht/ From Sophie.Mabilat at apitech.fr Thu Dec 2 15:35:06 1999 From: Sophie.Mabilat at apitech.fr (Sophie MABILAT) Date: Mon Jun 7 17:18:12 2004 Subject: DTDs and Schemas...
Message-ID: Does anyone know a tool which converts DTDs into Schemas and Schemas into DTDs? ------------------------------------------------------------------- Sophie MABILAT Sophie.Mabilat@apitech.fr ------------------------------------------------------------------- APITECH 113, rue Marietton 69009 Lyon FRANCE Tél. : 04 78 43 49 30 Fax : 04 78 83 47 86 ------------------------------------------------------------------- www.zipbee.com ------------------------------------------------------------------- From dhunter at Mobility.com Thu Dec 2 15:56:04 1999 From: dhunter at Mobility.com (Hunter, David) Date: Mon Jun 7 17:18:12 2004 Subject: XML processing instruction survey Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC0145@cc20exch2.mobility.com> From: Tim Bray [mailto:tbray@textuality.com] Sent: Tuesday, November 30, 1999 9:57 PM > > It's not really designed for people. It's mostly designed for use > by the XML processor to help figure out the encoding and make > sure that > this is really XML. > > I'd think that using it at the application level would be not only > uncommon but probably unwise. I'd be interested to hear any positive > responses to the query. -T. As would I. I'm currently writing YAXB (Yet Another XML Book), and I'm finding myself hard-pressed to come up with intelligent examples of where PIs might be useful. The XML Declaration I have no problem with, but PIs... David Hunter david.hunter@mobileq.com http://www.MobileQ.com
From smohr at voicenet.com Thu Dec 2 16:02:28 1999 From: smohr at voicenet.com (Stephen T. Mohr) Date: Mon Jun 7 17:18:12 2004 Subject: DTDs and Schemas... References: Message-ID: <01f901bf3cde$582c5590$e9d9f2cc@omicron.com> Extensibility's XML Authority will convert a DTD to a schema, but its compliance with the W3C XML Schema draft is necessarily a bit dated. It will also export a DTD to an XML-DR (i.e., Microsoft schema preview) schema. ----- Original Message ----- From: Sophie MABILAT To: ; Sent: Thursday, 2 December 1999 10:22 Subject: DTDs and Schemas... > Does anyone know a tool which converts DTDs into Schemas and Schemas into > DTDs? > > ------------------------------------------------------------------- > Sophie MABILAT > Sophie.Mabilat@apitech.fr > ------------------------------------------------------------------- > APITECH 113, rue Marietton 69009 Lyon FRANCE > Tél. : 04 78 43 49 30 Fax : 04 78 83 47 86 > ------------------------------------------------------------------- > www.zipbee.com > -------------------------------------------------------------------
From cox_andy at bah.com Thu Dec 2 16:13:43 1999 From: cox_andy at bah.com (Cox Andy) Date: Mon Jun 7 17:18:13 2004 Subject: XML processing instruction survey In-Reply-To: <805C62F55FFAD1118D0800805FBB428D02BC0145@cc20exch2.mobility.com> Message-ID: <001a01bf3ce0$7dae9ec0$20aa509c@bah.com> One example of "real-world" PI usage can be found in the W3C Recommendation "Associating Style Sheets with XML documents" [1]. I have also seen them used in the Apache Cocoon project [2]. Andy [1] http://www.w3.org/TR/xml-stylesheet/ [2] http://java.apache.org/cocoon/ (soon http://xml.apache.org/cocoon) > -----Original Message----- > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > Hunter, David > Sent: Thursday, 02 December 1999 10:56 AM > To: 'XML Dev' > Subject: RE: XML processing instruction survey > > > From: Tim Bray [mailto:tbray@textuality.com] > Sent: Tuesday, November 30, 1999 9:57 PM > > > > It's not really designed for people. It's mostly designed for use > > by the XML processor to help figure out the encoding and make > > sure that > > this is really XML. > > > > I'd think that using it at the application level would be not only > > uncommon but probably unwise. I'd be interested to hear any positive > > responses to the query. -T. > > As would I. I'm currently writing YAXB (Yet Another XML Book), and I'm > finding myself hard-pressed to come up with intelligent examples of where > PIs might be useful. > > The XML Declaration I have no problem with, but PIs... > > David Hunter > david.hunter@mobileq.com > http://www.MobileQ.com
From Curt.Arnold at hyprotech.com Thu Dec 2 16:36:39 1999 From: Curt.Arnold at hyprotech.com (Arnold, Curt) Date: Mon Jun 7 17:18:13 2004 Subject: Schema Question Message-ID: <61DAD58E8F4ED211AC8400A0C9B4687341553B@THOR> Steven Livingstone wrote >> >>red >>green >>Is there a way to specify a variable for the element name, but set its >>type, say, to string so that any element created under >>could be called anything but follow predetermined validation ? The concepts of archetypes in the W3C Schema draft were motivated (at least in my interpretation) by the desire to do something like what you suggested. The classic example would be to create an Address archetype and use it to define ShipTo and Billing elements that have the same content model. However, this does not allow a document author to make up a new element name and have the parser mystically figure out it should be an address (or whatever) and validate it. The list of all the acceptable elements must be enumerated in the schema. (I could be wrong in my interpretation of this, however.)
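The archetype idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the child names Street, City and PostalCode, the ELEMENT_ARCHETYPES table and the validate/check helpers are all hypothetical, and the sketch does not use the syntax of the W3C Schema draft. It shows ShipTo and Billing sharing one content model, and an element with an undeclared name being rejected even though its content would fit.

```python
import xml.etree.ElementTree as ET

# Hypothetical "archetype": an ordered list of required child element names.
ADDRESS = ["Street", "City", "PostalCode"]

# The schema has to enumerate every element name that uses the archetype.
ELEMENT_ARCHETYPES = {"ShipTo": ADDRESS, "Billing": ADDRESS}


def validate(element):
    """Return True if the element name is declared and its children match its archetype."""
    archetype = ELEMENT_ARCHETYPES.get(element.tag)
    if archetype is None:
        # An undeclared name is rejected, even if its content looks like an address.
        return False
    return [child.tag for child in element] == archetype


def check(xml_text):
    """Parse a document from a string and validate its root element."""
    return validate(ET.fromstring(xml_text))
```

With this table, a ShipTo or Billing element holding Street, City and PostalCode passes, while the same content under a made-up name such as Home fails, which is the limitation the paragraph above points out.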
I guess if you consider an archetype as being an element without a name, you could allow an archetype to appear in a content model and then any child element could be validated against the content model. However, choosing between two potential archetypes (say in a choice of archetypes or a sequence with optional archetypes) may require you to look at their content to determine what archetype applies. I think this adds too much complexity to schema validation for its value. If you really want to do this (and to validate), I think you should let "element_properties" have any content and then use XSLT (or something else) to determine whether the content of element_properties matches your pattern. From paul at prescod.net Thu Dec 2 16:39:19 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:18:13 2004 Subject: Some questions References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <38459FA4.DAA00E35@praxis.cz> Message-ID: <3846A0A9.5D8AB94D@prescod.net> Matthew Gertner wrote: > > I can see the enormous interest in having a text-based format for > exchanging object-oriented data. But can't this be done with a good > object-oriented XML schema language, of which the current W3C seems > to be a very good start? One of XML's few innovations was making the schema optional. People thought that was really important. If object representation requires a schema then we're back where we started -- at least in that domain. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific."
--Lily Tomlin From paul at prescod.net Thu Dec 2 16:44:44 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:18:13 2004 Subject: Object-oriented serialization (Was Re: Some questions) References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <38459FA4.DAA00E35@praxis.cz> <38463AB4.36C5292B@praxis.cz> <38466487.328D1CFA@praxis.cz> Message-ID: <3846A1E9.AFB43B7A@prescod.net> David Megginson wrote: > > How does the schema tell me that foo represents a container for a > collection of objects, bar represents an object, and hack and flurb > represent the object's properties? It probably doesn't, but Matthew is right that you could imagine a schema language that DOES. > Object exchange, while > important, represents only one of many layers that can be built on top > of XML, and if XML Schemas start trying to solve high-level problems > for every specific domain, it will become an unimplementable mess. I would argue that every domain, including documents, has a concept of "objects" and a concept of "properties." XML's inability to represent this is, in my opinion, a major flaw. It would be nice if schemas could work around that flaw but I still think that there is a place in the world for an instance-only syntax for objects and properties. > RDF already made a similar mistake by mixing together a spec for > object encoding in XML with a spec for representing knowledge about > Web pages. I agree that this was a mistake and it befuddled me for a while.
I see it as a different situation, however, because I can't imagine a problem domain that does NOT need to know about structured objects and their properties. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From paul at prescod.net Thu Dec 2 16:47:47 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:18:13 2004 Subject: Some questions References: <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com> Message-ID: <3846A2A4.5B795FA4@prescod.net> David Megginson wrote: > > That's just the problem with the spec -- if you forget the word/prefix > "meta" completely, RDF is just an XML format for object exchange; it > just happens that one possible application of those objects is > metadata, and the RDF-Syntax spec mixes the two together. Agreed. The RDF spec also mixes syntax and data model (while claiming that the latter is independent of the former). I think that the data model is useful enough on its own (especially in light of the problems with RDF syntax) to deserve a separate spec. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin
From begeddov at jfinity.com Thu Dec 2 16:47:58 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:18:13 2004 Subject: Object-oriented serialization (Was Re: Some questions) References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <38459FA4.DAA00E35@praxis.cz> <38463AB4.36C5292B@praxis.cz> Message-ID: <38469260.E7D18FA3@jfinity.com> Matthew Gertner wrote: > Let's say I have an arbitrary object structure that I want to serialize > and send down the pipe. Serializing a bunch of object attributes in XML > is a no-brainer, and representing arbitrary references between objects > is also fairly trivial if something like XLink is used (and we need > XLink, there's surely no controversy about this). XLink is explicitly intended to support hyperlinking rather than linking, i.e. you have an instance level title on each object reference :-! RDF is explicitly intended to support linking rather than hyperlinking. You can specify a title for your object reference but you do it at the class level rather than the instance level. Even if you try to use XLink for OO linking you will find that you end up with the equivalent of void* pointers. Let's call these kinds of links properties of the source object. RDF allows you to specify the type of the property using a URI and (using RDF Schema) specify the base type of the property value. This is what you would expect to be able to do for strongly typed pointers in OO interchange. XLink doesn't even allow you to use a namespace qualified name for the "role" (this may have been fixed but it will be done as a new attribute value type like qname).
It certainly doesn't touch being able to specify a type for the property value. The XML Schema group may end up supporting strongly typed references but I wouldn't be surprised if this fell off the plate. In short, RDF (and RDF Schema) support OO interchange in a pretty straightforward manner TODAY. David Megginson's work on the DATAX toolkit shows how straightforward it can be to use RDF. As part of my work at Rogue Wave, I participated in the development of several alternative C++/XML frameworks. We didn't use RDF because of the lack of tools. If I had to do it over again and choose between RDF + RDFSchema today and XML + XLink + XMLSchema tomorrow for OO interchange I know which way I would go. Cordially from Corvallis, Gabe Beged-Dov http://www.jfinity.com/gabe From paul at prescod.net Thu Dec 2 16:55:36 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:18:13 2004 Subject: Some questions References: <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com> <4.2.0.58.19991201175459.00baea50@abnaki> Message-ID: <3846A477.FEC9503C@prescod.net> "Eve L. Maler" wrote: > > When people talk about RDF, the "meta" part is what I have trouble with in > general. In what way is markup not metadata? In what way are element > content and attribute values not also metadata (depending on what you do > with them)? It feels weird for one particular data model to claim to have > cornered the metadata market. Here are definitions I use that are mostly free of the ambiguity people typically associate with the words content and metadata.
Metadata is property/value oriented so that you can ask questions in terms of "what is the value of this property". Content is list within list oriented so that you can ask: "what comes before this item, and what comes after it." RDF data is content if you look at the XML level (because the XML data model doesn't make the <TITLE> element addressable as a property) but it is metadata if you look at the RDF level (because RDF really WOULD make the <TITLE> element addressable as a TITLE property). -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin
From anderst at toolsmiths.se Thu Dec 2 17:16:55 1999 From: anderst at toolsmiths.se (Anders W. Tell) Date: Mon Jun 7 17:18:13 2004 Subject: Any XML Schemas validators out yet ? Message-ID: <3846A9E7.B14067F1@toolsmiths.se> Hi All I have just started to write a new RPC using XML as content transfer and want to use the new XML Schema proposal instead of DTDs. So I'm wondering if there are any tools that can validate XML Schemas themselves and maybe also validate XML documents using XML Schemas ? Regards Anders -- /_/_/_/_/_/_/_/_/_/_/_/_/_/_/ / Financial Toolsmiths AB / / Anders W. Tell / /_/_/_/_/_/_/_/_/_/_/_/_/_/_/
From steven.livingstone at scotent.co.uk Thu Dec 2 17:26:08 1999 From: steven.livingstone at scotent.co.uk (Steven Livingstone, ITS, SENM) Date: Mon Jun 7 17:18:13 2004 Subject: Any XML Schemas validators out yet ? Message-ID: <8DCB90532FF7D211B34400805FD48853B56DBA@SENMAIL3> Yep, XML Authority from extensibility.com cheers Steven Steven Livingstone - http://www.deltabiz.com 07771 957 280 or +447771957280 Professional Site Server 3, Wrox Press http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696 Professional Site Server 3.0 Commerce Edition, Wrox Press http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505 > -----Original Message----- > From: Anders W. Tell [SMTP:anderst@toolsmiths.se] > Sent: 2 December 1999 17:19 > To: xml-dev@ic.ac.uk > Subject: Any XML Schemas validators out yet ? > > Hi All > > I have just started to write a new RPC using XML as content transfer and > want to > use the new XML Schema proposal instead of DTDs. > > So I'm wondering if there are any tools that can validate XML Schemas > themselves > and maybe also validate XML documents using XML Schemas ? > > Regards Anders > -- > /_/_/_/_/_/_/_/_/_/_/_/_/_/_/ > / Financial Toolsmiths AB / > / Anders W. Tell / > /_/_/_/_/_/_/_/_/_/_/_/_/_/_/
From wunder at infoseek.com Thu Dec 2 17:27:08 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:18:13 2004 Subject: Some questions In-Reply-To: <3.0.32.19991201154526.014c3710@pop.intergate.ca> Message-ID: <3.0.5.32.19991202092403.00cc56f0@corp.infoseek.com> At 03:46 PM 12/1/99 -0800, Tim Bray wrote: > >But I think metadata wins. If you count hits on Internet search engines, >the Yahoo and ODP directories, which are both human-constructed metadata, >absolutely wipe out any fulltext search engine you can name, ... This is a subtle issue -- in aggregate, the metacontent is used more, but each user spends more time with content than with metacontent. The better the directory or search engine, the less time you need to spend with it (an interesting conflict when you are ad-supported). Here are stages of having the info (content) that you want, ordered in increasing amounts of wasted time. 1. I have the information. 2. I know where the information is. 3. I know it exists, but I don't know where it is. 4. I don't know if it exists. Only the last two need some sort of metacontent or finding aid. Organizing and indexing content is a time-saver, and sometimes that is essential.
Sometimes, the metacontent has the whole answer (which companies sell rhinestone tiaras), but most people really want to buy the tiara. So I still put my money on Jane Austen or the OED over the card catalog. Heck, I'll put my money on Fanny Burney or 40,000 Words over the card catalog. On the other hand, I strongly agree that metacontent should be interchangeable, both in syntax (RDF) and in data (e.g., AACR2 for author names). I just wish that the RDF spec was as clear as AACR2. wunder PS: My wife did need to buy a rhinestone tiara, and I was really impressed by the results for that search. Who would have guessed that the web has dozens of places to buy those? -- Walter R. Underwood Senior Staff Engineer Infoseek Software GO Network, part of The Walt Disney Company wunder@infoseek.com http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 From fscheng at netzero.net Thu Dec 2 17:27:38 1999 From: fscheng at netzero.net (Frank Biz) Date: Mon Jun 7 17:18:13 2004 Subject: How do embed carriage return/new line into the data? References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> Message-ID: <009601bf3cea$40e58260$644b8fcd@intervoice.com> I'm fairly new to this community. Please help me answer this very simple question. Thanks, Frank.
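For the question in the subject line, a minimal sketch in Python, assuming the data is written as ordinary element content. The characters < and & are escaped as &lt; and &amp;, a line feed can be written literally, and a carriage return is written as the character reference &#13; because an XML parser normalizes a literal CR or CR LF pair to a single LF. The helper names encode_text and round_trip and the element name data are hypothetical; escape comes from the standard xml.sax.saxutils module.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def encode_text(value):
    """Escape a string so it can be placed safely in XML element content."""
    # escape() replaces "&" first, then "<" and ">".
    text = escape(value)
    # A literal carriage return would be normalized away by the parser,
    # so write it as a character reference instead. This runs after
    # escape() so the "&" in "&#13;" is not escaped again.
    return text.replace("\r", "&#13;")


def round_trip(value):
    """Wrap the encoded text in a (hypothetical) <data> element and parse it back."""
    document = "<data>" + encode_text(value) + "</data>"
    return ET.fromstring(document).text
```

A CDATA section is another option for data containing many < and & characters, though a carriage return inside it is still subject to line-end normalization.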
----- Original Message ----- From: "David Megginson" <david@megginson.com> To: <xml-dev@ic.ac.uk> Sent: Thursday, December 02, 1999 8:39 AM Subject: Re: Object-oriented serialization (Was Re: Some questions) > Matthew Gertner <matthew@praxis.cz> writes: > > > A schema gives you this information too. The problem of how to attach a > > schema to an instance is not yet resolved, but it is a purely syntactic > > consideration and a satisfactory solution will be found. This then tells > > you what class a given instance belongs to. The identity can be > > specified using an ID attribute; this is exactly the way it is done in > > RDF. That an element represents a relationship is implicit in the > > content model of the element. > > I still don't follow. Perhaps I need to reread the XML Schema spec, > but given > > <foo> > <bar id="xxx"> > <hack>David</hack> > <flurb>Megginson</flurb> > </bar> > </foo> > > How does the schema tell me that foo represents a container for a > collection of objects, bar represents an object, and hack and flurb > represent the object's properties? > > > SAX is great as far as it goes, but we seem to be agreeing that an > > additional layer is needed on top. This layer is not the DOM. > > It can be. The DOM represents a domain-specific object layer that is > useful for a wide subset of XML operations (especially document- and > browser-oriented work). There need to be many layers on top of XML, > one for each domain -- it happens that many of those layers will share > the need to encode objects, so a standard object layer sandwiched > between XML and the domain-specific layers can save a lot of work. > > > There are a variety of efforts to create > > domain-specific objects automatically from XML objects. I don't have a > > list at the tips of my fingers, but if anyone does it would be a great > > resource. They are out there because I keep bumping into them. > > One example is RDF.
> > > To be quite blunt it seems a shame that a lot of really great work is > > being put into the RDF effort (including a very valuable vocabulary > > for collection classes, just to name one) instead of being > > integrated more tightly into the overall XML architecture. > > I disagree strongly with the last part of that statement. I'd argue > the opposite -- higher-level layers should be as independent of XML as > possible. That's the only way to build good, layered architectures. > XML does one thing (represent a tree structure in a character stream) > very well: it's an excellent layer to build other layers on top of, > but XML itself should stay as simple as possible so that it's > applicable widely to many different fields. > > > I think that if the work being done on RDF were refocused to making > > sure that XML Schemas do everything that the RDF advocates are > > rightly claiming is necessary, that we will see a clear win in terms > > of pushing the whole XML effort from a theoretical effort into a > > major paradigm shift with extensive real-world implications. > > That would be another serious mistake. Object exchange, while > important, represents only one of many layers that can be built on top > of XML, and if XML Schemas start trying to solve high-level problems > for every specific domain, it will become an unimplementable mess. > RDF already made a similar mistake by mixing together a spec for > object encoding in XML with a spec for representing knowledge about > Web pages. > > > All the best, > > > David > > -- > David Megginson david@megginson.com > http://www.megginson.com/
From fscheng at netzero.net Thu Dec 2 17:32:01 1999 From: fscheng at netzero.net (Franklin Cheng) Date: Mon Jun 7 17:18:13 2004 Subject: How to embed special characters (such as '<' , carriage return) into the data References: <3.0.32.19991130114807.01475710@pop.intergate.ca> Message-ID: <00b701bf3cea$f9efcf40$644b8fcd@intervoice.com> Thanks in advance. Frank. From david at megginson.com Thu Dec 2 18:15:44 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:13 2004 Subject: Content or Metadata?
In-Reply-To: Walter Underwood's message of "Thu, 02 Dec 1999 09:24:03 -0800" References: <3.0.5.32.19991202092403.00cc56f0@corp.infoseek.com> Message-ID: <m3bt898b1w.fsf@localhost.localdomain> Walter Underwood <wunder@infoseek.com> writes: > So I still put my money on Jane Austen or the OED over the > card catalog. Heck, I'll put my money on Fanny Burney or > 40,000 Words over the card catalog. The second example is an interesting choice. After all, the full OED would probably count as metadata to people who bother to make the distinction: it contains headwords and subheadwords, grammatical information, and definitions, but the bulk of the dictionary is made up of references to other printed works (word in context citations), just as the bulk of Yahoo! is made up of references to other Web sites. So, is the OED content or metadata? I dunno -- that's why I try to avoid the terms whenever I can. This is a long-standing problem though. In my former field, Medieval studies, there are numerous examples of originally marginal glosses and commentary (metadata?) becoming independently-distributed texts (content?). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From HZhou at HNTB.com Thu Dec 2 19:07:43 1999 From: HZhou at HNTB.com (Hao Zhou) Date: Mon Jun 7 17:18:13 2004 Subject: No subject Message-ID: <C623B85D4158D311A67D00805FEA6C794314CA@CBSEX1> unsubscribe xml-dev
From tbray at textuality.com Thu Dec 2 19:14:59 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:18:13 2004 Subject: Content or Metadata? Message-ID: <3.0.32.19991202111153.0150e870@pop.intergate.ca> At 01:14 PM 12/2/99 -0500, David Megginson wrote: >The second example is an interesting choice. After all, the full OED >would probably count as metadata to people who bother to make the >distinction: These are murky waters. But there are a couple of things that are incontrovertibly true: 1. All metadata is data. Given an aggregation of data items, each application can and will make its own decisions as to which is "data" and which "meta". Thus a common syntax for both, to the extent possible, is a good thing. 2. Not all data is metadata. Examples: this email message; Chopin's Nocturnes; Tuxedo.gif. Operationally, my experience suggests that in stuff that is not metadata, ordering matters. The converse is true; if ordering matters, it's probably not metadata. There are exceptions but you have to work pretty hard. -Tim From pandeng at telepath.com Thu Dec 2 19:20:42 1999 From: pandeng at telepath.com (Steve Schafer) Date: Mon Jun 7 17:18:14 2004 Subject: Content or Metadata?
In-Reply-To: <m3bt898b1w.fsf@localhost.localdomain> References: <3.0.5.32.19991202092403.00cc56f0@corp.infoseek.com> <m3bt898b1w.fsf@localhost.localdomain> Message-ID: <3854c62c.84062872@90.0.0.40> On 02 Dec 1999 13:14:35 -0500, David Megginson <david@megginson.com>u wrote: >So, is the OED content or metadata? I dunno -- that's why I try to >avoid the terms whenever I can. It's all a matter of context, no? There exist an infinite number of levels, each one "meta" to the one immediately below it. -Steve Schafer xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From spreitze at parc.xerox.com Thu Dec 2 19:42:27 1999 From: spreitze at parc.xerox.com (Mike Spreitzer) Date: Mon Jun 7 17:18:14 2004 Subject: Content or Metadata? In-Reply-To: <3.0.32.19991202111153.0150e870@pop.intergate.ca> Message-ID: <NCBBJANJAENGCPMNOIOCKEFHFLAA.spreitze@parc.xerox.com> > Operationally, my experience suggests that in stuff that is > not metadata, ordering matters. The converse is true; if ordering matters, > it's probably not metadata. There are exceptions but you have to > work pretty hard. -Tim What about the list of authors of a scholarly paper? Isn't that metadata for which order matters? Not sweating yet, Mike xml-dev: A list for W3C XML Developers. 
From david at megginson.com Thu Dec 2 19:52:05 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <3.0.32.19991202111153.0150e870@pop.intergate.ca>
References: <3.0.32.19991202111153.0150e870@pop.intergate.ca>
Message-ID: <14406.52640.332965.449524@localhost.localdomain>

Tim Bray writes:

 > At 01:14 PM 12/2/99 -0500, David Megginson wrote:
 > >The second example is an interesting choice. After all, the full OED
 > >would probably count as metadata to people who bother to make the
 > >distinction:
 >
 > These are murky waters. But there are a couple of things that are
 > incontrovertibly true:
 >
 > 1. All metadata is data. Given an aggregation of data items, each
 > application can and will make its own decisions as to which is "data"
 > and which "meta". Thus a common syntax for both, to the extent
 > possible, is a good thing.
 > 2. Not all data is metadata. Examples: this email message; Chopin's
 > Nocturnes; Tuxedo.gif.

Hmm -- see below.

 > Operationally, my experience suggests that in stuff that is
 > not metadata, ordering matters. The converse is true; if ordering matters,
 > it's probably not metadata. There are exceptions but you have to
 > work pretty hard. -Tim

How about ranked search results, or the top ten Web sites? I didn't
really have to work that hard -- that's why RDF has the horrible
kludge where the rdf:li property automatically changes into rdf:_1,
rdf:_2, etc.

Here's a trickier example: is a film review metadata or data? It's
prose and it's ordered, but I'm reading it only because I'm interested
in something else. I could even extend that to a picture of a tuxedo
and beyond, but I'll spare the readers for now.

The point is that the content/metadata distinction is not a property
of the data but a property of the data's actual use. If I use
something for its own sake, it's content; if I use something for
something else's sake, it's metadata. Tim is right that Chopin's
Nocturnes are much more likely to be used for their own sake in most
familiar contexts, but consider a collection of metadata about
influences behind a musical piece: even there, there is no crisp line.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From andrewl at microsoft.com Thu Dec 2 19:56:57 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:18:14 2004
Subject: XML RPC
Message-ID: <33D189919E89D311814C00805F1991F7F4A958@RED-MSG-08>

RE: "I have just started to write a new RPC using XML as content transfer..."

See also http://XMLRPC.com, http://XMLRPC.com and
http://news.cnet.com/news/0-1003-200-1474298.html .

Best wishes,
Andrew Layman
From greynolds at datalogics.com Thu Dec 2 19:57:15 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun 7 17:18:14 2004
Subject: Content or Metadata?
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B1700B@martinique>

That would be "paradata".
(http://www.amazon.com/exec/obidos/ASIN/0521424062/qid=944164377/sr=1-1/102-1972469-2427226)

-gregg

> -----Original Message-----
> From: Mike Spreitzer [mailto:spreitze@parc.xerox.com]
> Sent: Thursday, December 02, 1999 1:42 PM
> > Operationally, my experience suggests that in stuff that is
> > not metadata, ordering matters. The converse is true; if ordering matters,
> > it's probably not metadata. There are exceptions but you have to
> > work pretty hard. -Tim
>
> What about the list of authors of a scholarly paper? Isn't
> that metadata for which order matters?

From rev-bob at gotc.com Thu Dec 2 20:05:05 1999
From: rev-bob at gotc.com (rev-bob@gotc.com)
Date: Mon Jun 7 17:18:14 2004
Subject: Content or Metadata?
Message-ID: <199912021504116.SM01084@Unknown.>

> > Operationally, my experience suggests that in stuff that is
> > not metadata, ordering matters. The converse is true; if ordering matters,
> > it's probably not metadata. There are exceptions but you have to
> > work pretty hard. -Tim
>
> What about the list of authors of a scholarly paper? Isn't that metadata for which
> order matters?

Maybe it matters to them, but not to me. :)

Look, it's like I say on my site (if you catch the randomizer just right) -
reality is holographic. If you delve deep enough, any data you find will
eventually serve as metadata for something else. For instance (one of my
favorites), digging into the roots of the word "testify" will eventually
indicate that Greco-Roman society was pretty patriarchal in nature, even to
the point of codifying this bias in their legal structure.

(The full chain of connections? "Testify" comes from the same lexical root
as "testicle" - because in Greco-Roman courts, you swore your oath on the
family jewels. Women not having testicles, this at least implies that a
woman could not give testimony - which is an anti-woman bias in the legal
structure. Since you don't have such a thing in the court system without
some social impetus, the natural conclusion is that the society regarded
women as "less" than men - meaning that men ran things.)

Of course, this is far from relevant to XML, so I'll shut up about that
now. <g>

Perhaps this will get back to the thread at hand - has anyone yet figured
out a decent way to attach accurate PICS ratings (esp. RSACi) to dynamic
documents? I've got a hack going right now that prevents subordinate data
(random ads) from conflicting with the rating assigned to the primary data
(the article on which the ad spot appears) - but that requires the use of an
eight-field SQL query to select a conforming set of eligible ads (each of
which is labeled with a minimum and maximum rating), and a random record is
chosen from that set. While this works, it is somewhat less than elegant....

Rev. Robert L. Hood | http://rev-bob.gotc.com/
Get Off The Cross! | http://www.gotc.com/

Download NeoPlanet at http://www.neoplanet.com
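The ad-selection hack described above can be sketched outside SQL. What follows is only a rough illustration under stated assumptions, not the poster's actual query: it assumes the "eight fields" are a minimum and a maximum for each of the four RSACi categories (language, nudity, sex, violence), and that an ad conforms when the article's rating falls inside the ad's range in every category. The names Rating, Ad, isEligible, eligibleAds and pickAd are all made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// RSACi rates four categories; one integer level per category is assumed.
struct Rating
{
  int language, nudity, sex, violence;
};

// Hypothetical ad record: a minimum and a maximum rating per category,
// i.e. eight numeric fields in total.
struct Ad
{
  std::string name;
  Rating min, max;
};

static bool within (int lo, int value, int hi)
{
  return lo <= value && value <= hi;
}

// An ad conforms when the article's rating lies inside the ad's
// [min, max] range in all four categories (an assumed rule).
bool isEligible (const Ad &ad, const Rating &article)
{
  return within(ad.min.language, article.language, ad.max.language)
      && within(ad.min.nudity,   article.nudity,   ad.max.nudity)
      && within(ad.min.sex,      article.sex,      ad.max.sex)
      && within(ad.min.violence, article.violence, ad.max.violence);
}

// Equivalent of the SELECT: keep only the conforming ads.
std::vector<Ad> eligibleAds (const std::vector<Ad> &ads, const Rating &article)
{
  std::vector<Ad> result;
  for (std::size_t i = 0; i < ads.size(); ++i)
    if (isEligible(ads[i], article))
      result.push_back(ads[i]);
  return result;
}

// Picks one conforming ad at random; returns 0 when nothing conforms.
const Ad * pickAd (const std::vector<Ad> &eligible)
{
  if (eligible.empty())
    return 0;
  return &eligible[std::rand() % eligible.size()];
}
```

Keeping the filter separate from the random pick makes the conformance rule easy to check on its own; whether that is any more elegant than the SQL version is a matter of taste.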
From andrewl at microsoft.com Thu Dec 2 20:07:24 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:18:14 2004
Subject: How do embed carriage return/new line into the data?
Message-ID: <33D189919E89D311814C00805F1991F7F4A959@RED-MSG-08>

Regarding putting RDF features into XML, David Megginson wrote:

> I disagree strongly with the last part of that statement. I'd argue
> the opposite -- higher-level layers should be as independent of XML as
> possible. That's the only way to build good, layered architectures.
> XML does one thing (represent a tree structure in a character stream)
> very well: it's an excellent layer to build other layers on top of,
> but XML itself should stay as simple as possible so that it's
> applicable widely to many different fields.
>
> > I think that if the work being done on RDF were refocused to making
> > sure that XML Schemas do everything that the RDF advocates are
> > rightly claiming is necessary, that we will see a clear win in terms
> > of pushing the whole XML effort from a theoretical effort into a
> > major paradigm shift with extensive real-world implications.
>
> That would be another serious mistake. Object exchange, while
> important, represents only one of many layers that can be built on top
> of XML, and if XML Schemas start trying to solve high-level problems
> for every specific domain, it will become an unimplementable mess.
> RDF already made a similar mistake by mixing together a spec for
> object encoding in XML with a spec for representing knowledge about
> Web pages.

I agree with David on every one of the points he makes above. I am at
least as keen as the next person to use XML for transferring structured
data often originated or consumed by object systems, but it would be bad
design to make this the only use of XML schemas.

Best wishes,
Andrew Layman

From robin at isogen.com Thu Dec 2 20:16:27 1999
From: robin at isogen.com (Robin Cover)
Date: Mon Jun 7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <3.0.32.19991202111153.0150e870@pop.intergate.ca>
Message-ID: <Pine.GSO.3.96.991202140219.20104C-100000@grind>

WRT (especially):

> in stuff that is not metadata, ordering matters. The converse is
> true; if ordering matters, it's probably not metadata.

I don't think I agree, and it's not at all hard to find exceptions,
if I understand the question. I think the distinction is indeed
POV (point of view), and in some cases, as simple as "view"
(projection).

Imagine an entire book, encoded character by character, from
beginning to end. Which characters are "metadata" but not "data"?
Any? The book subunits (parts, chapters, sections, subsections) have
titles, which like the volume title, may be regarded as "metadata"
for the respective units, but they are also "data." For some
purposes (an analytical bibliographer), not only "order" is
significant - so are many other matters of spatial geometry with
respect to the "characters" (and other non-character properties);
for other analysts (e.g., enumerative bibliography, descriptive
cataloging), the "order" of some character strings (in relation to
others) is unimportant.

The distinction is rather like "content" (vs.) "not-content" --
fairly bogus, distracting, and confusing -- not to mention
problematic because it lies at the base of some bad markup
language designs.

My 2 cents.

-robin

------------------------------------------------------------------

On Thu, 2 Dec 1999, Tim Bray wrote:

> At 01:14 PM 12/2/99 -0500, David Megginson wrote:
> >The second example is an interesting choice. After all, the full OED
> >would probably count as metadata to people who bother to make the
> >distinction:
>
> These are murky waters. But there are a couple of things that are
> incontrovertibly true:
>
> 1. All metadata is data. Given an aggregation of data items, each
> application can and will make its own decisions as to which is "data"
> and which "meta". Thus a common syntax for both, to the extent
> possible, is a good thing.
> 2. Not all data is metadata. Examples: this email message; Chopin's
> Nocturnes; Tuxedo.gif.
>
> Operationally, my experience suggests that in stuff that is
> not metadata, ordering matters. The converse is true; if ordering matters,
> it's probably not metadata. There are exceptions but you have to
> work pretty hard. -Tim
From clark.evans at manhattanproject.com Thu Dec 2 20:50:04 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <3854c62c.84062872@90.0.0.40>
Message-ID: <Pine.LNX.4.10.9912020350400.15285-100000@cauchy.clarkevans.com>

On Thu, 2 Dec 1999, Steve Schafer wrote:
> It's all a matter of context, no? There exist an infinite number of
> levels, each one "meta" to the one immediately below it.

Yes!

From clark.evans at manhattanproject.com Thu Dec 2 21:06:13 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <3854c62c.84062872@90.0.0.40>
Message-ID: <Pine.LNX.4.10.9912020355570.15285-100000@cauchy.clarkevans.com>

On Thu, 2 Dec 1999, Steve Schafer wrote:
> It's all a matter of context, no? There exist an infinite number of
> levels, each one "meta" to the one immediately below it.

I believe that it is a binary recursive pattern:

                             meta-data ...
                            /
                   meta-data
                  /         \
         meta-data           data ...
        /         \
       /           data ...
context
       \           meta-data ...
        \         /
         data                meta-data ...
                  \         /
                   data
                            \
                             data ...

Thus, you are right on about it being a "matter of context".

Hope this perspective helps,

Clark

From clark.evans at manhattanproject.com Thu Dec 2 21:20:02 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <Pine.GSO.3.96.991202140219.20104C-100000@grind>
Message-ID: <Pine.LNX.4.10.9912020407571.15285-100000@cauchy.clarkevans.com>

On Thu, 2 Dec 1999, Robin Cover wrote:
> The distinction is rather like "content" (vs.) "not-content" --
> fairly bogus, distracting, and confusing -- not to mention
> problematic because it lies at the base of some bad markup
> language designs.

It's confusing and problematic when the context of the
document is not taken into consideration or when the
document is used in more than one context without the
proper (isomorphic) transformations to preserve meaning.

Furthermore, to add insult to injury, there is no such
thing as context independence...

Perhaps explicit user perspectives / use cases are
needed when doing document modeling. Maybe inserting
a few transformation steps between contexts would help?

;) Clark
From david at megginson.com Thu Dec 2 21:29:00 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:15 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <14406.58446.675568.388482@localhost.localdomain>

I think that there is a growing need for a common C++ SAX 1.0
interface as XML moves more and more into high-performance
environments. I have kept pointers that people sent to quite a few
existing attempts, but before I look those over, I'd like to try my
own off the top of my head.

I'll be posting three follow-up messages on SAX/C++ to stimulate
discussion:

1. Some C++-specific SAX design principles.
2. Implementation changes required or possible in C++.
3. My first stab at a core SAX 1.0 C++ interface.

I know that SAX2 is still being neglected, and I apologize.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Thu Dec 2 21:33:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:15 2004
Subject: SAX/C++: C++-specific design principles
Message-ID: <14406.58740.871829.541816@localhost.localdomain>

Here are the principles that I applied to creating my first draft
SAX/C++ interface:

1. Use references when there can never be a null value, pointers
   otherwise.

2. Pointers never change ownership -- if a Parser (for example) wants
   to own an InputSource, it needs to make its own copy. The app has
   to free everything that it allocates, and the SAX driver, likewise.

3. Callbacks cannot be const, since they often change the state of the
   client app.

4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
   with most existing C++ code.

5. Use char * rather than string, to avoid forcing a lot of allocation
   overhead on the SAX driver.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From robertl1 at home.com Thu Dec 2 21:37:46 1999
From: robertl1 at home.com (Robert La Quey)
Date: Mon Jun 7 17:18:15 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: <38466487.328D1CFA@praxis.cz>
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain>
Message-ID: <3.0.6.32.19991202133035.04315e60@mail.dt1.sdca.home.com>

At 01:22 PM 12/2/99 +0100, you wrote:
>David Megginson wrote:
>> If you have a function loadXML(), you get a DOM tree or a bunch of SAX
>> events or something similar; if you have a function loadRDF(), you get
>> a collection of objects with attributes and relationships. In either
>> case, a schema can tell you things like "element type/class B is a
>> kind of element type/class A", but that's secondary information; the
>> primary information is "element X is an object of class Y with
>> identifier Z, while element A represents a relationship between this
>> object and object C".
>
>A schema gives you this information too. The problem of how to attach a
>schema to an instance is not yet resolved, but it is a purely syntactic
>consideration and a satisfactory solution will be found. This then tells
>you what class a given instance belongs to. The identity can be
>specified using an ID attribute; this is exactly the way it is done in
>RDF. That an element represents a relationship is implicit in the
>content model of the element.
>
>SAX is great as far as it goes, but we seem to be agreeing that an
>additional layer is needed on top. This layer is not the DOM. One of the
>lessons that I learned from my time at POET Software is that, although
>we had an excellent generic API, the vast majority of our customers
>wanted to work with real C++ (and later Java) classes in their problem
>domain. But there is nothing to say that a loadXML() function must
>return a DOM tree. There are a variety of efforts to create
>domain-specific objects automatically from XML objects. I don't have a
>list at the tips of my fingers, but if anyone does it would be a great
>resource. They are out there because I keep bumping into them.
>
>> If you're interested in a collection of objects in the first place,
>> why should you have to see or know about XML elements and attributes
>> at all? Or to put it a different way, why should people constantly
>> have to redo the work of extracting objects from XML, when they're all
>> trying to do the same thing?
>
>Once again, there are already tools that provide this functionality
>across applications (i.e. they can be plugged in and used without
>additional development). The interest of XML is essentially as a way to
>serialize objects and send them across a network, as you also stated.
>
>> I think that reasonable people can argue that RDF is not the best
>> solution to the problem of object exchange in XML, but I am somewhat
>> surprised to hear people deny that the problem even exists: there is
>> an enormous demand for exchanging objects in XML (businesses exchange
>> a lot of structured data), and it's hard work to have to figure out
>> over and over how to construct objects from a SAX stream or a DOM tree
>> especially when programmers with XML knowledge are scarce and
>> expensive.
>>
>> I have no doubt that we need an abstract object layer on top of XML.
>> Right now, RDF is the best solution currently available (XMI also has
>> its advocates), but I'm ready to listen about anything better.
>
>In no way do I doubt the importance of being able to exchange objects in
>XML, but I do have serious reservations about RDF as the way to do this,
>and they have nothing to do with the hairy syntax or hard-to-understand
>spec. What is lacking right now is an overarching approach to using XML
>in real-world applications ...

uhh guys, the thread on Web Architecture, to which essentially
no one replied, was addressed to exactly these issues. Oh well,
it is good to see the issues raised ...

A small rewrite to fit this thread.

<synopsis>
Layer                            Purpose  Example/Description

3)  application                  e.g. [PICS], [OCS], [RSS]

2a) Resource Description         Dublin Core
    Framework                    Describes a particular choice of data
                                 structures (property lists) to be used
                                 by applications

2b) Other Application Oriented
    Data Structures (or objects)

2)  Object Definition            Standard way to represent objects in ML

1)  ML                           ML used for data serialization and
                                 transport and IDL
</synopsis>

I left out namespaces for the moment. The basic problem
remains a lack of a clearly articulated vision of what
the web of the future could/should be.

Bob La Quey
From david at megginson.com Thu Dec 2 21:39:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:15 2004
Subject: SAX/C++: Changes for C++
Message-ID: <14406.59075.218048.437305@localhost.localdomain>

Here are some of the differences between the SAX/Java interfaces and
the SAX/C++ interfaces:

- lots of const

- C++ const char * for Java String throughout (and, thus, UTF-8
  instead of UTF-16)

- InputSource doesn't have an equivalent of Java Reader (no getReader
  method)

- SAXException does not allow an embedded exception, because there's
  no need to tunnel exceptions in C++ (you can always throw any
  exception)

- DocumentHandler::characters and DocumentHandler::ignorableWhitespace
  don't need the 'start' argument, since they can be passed a pointer
  to the start position in an existing array (that's not possible in
  Java)

- HandlerBase omitted, since the classes can contain their own default
  implementations

- I haven't figured out what to do with Parser::setLocale yet

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
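The point about SAXException not needing an embedded exception can be shown with a small sketch. This is only an illustration under assumptions, not part of the proposed interface: DocumentHandler here is a trimmed stand-in with a one-argument startElement, and ConfigError, StrictHandler, fakeParse and parseAndReport are invented names. The stand-in "parser" simply does not catch anything, so an application exception thrown from a callback reaches the caller with its own type intact.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical application-level error, unrelated to any SAX class.
class ConfigError : public std::runtime_error
{
public:
  explicit ConfigError (const std::string &what) : std::runtime_error(what) {}
};

// Trimmed stand-in for a SAX document handler (one callback only).
class DocumentHandler
{
public:
  virtual ~DocumentHandler () {}
  virtual void startElement (const char *) {}
};

// Application handler that rejects any element other than "config".
class StrictHandler : public DocumentHandler
{
public:
  virtual void startElement (const char * name)
  {
    if (std::string(name) != "config")
      throw ConfigError(std::string("unexpected element: ") + name);
  }
};

// Stand-in for Parser::parse: it neither catches nor wraps exceptions,
// so whatever a callback throws propagates to the caller unchanged.
void fakeParse (DocumentHandler &handler, const char * rootName)
{
  assert(rootName != 0);
  handler.startElement(rootName);
}

// Returns the message of the ConfigError that escaped, or "" if none did.
std::string parseAndReport (const char * rootName)
{
  StrictHandler handler;
  try {
    fakeParse(handler, rootName);
  } catch (const ConfigError &e) {
    return e.what();
  }
  return "";
}
```

In Java the callback signatures only allow SAXException, which is why the Java interface lets one exception be wrapped inside another; the sketch shows why C++ callers would not need that.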
From david at megginson.com Thu Dec 2 21:41:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:15 2004
Subject: SAX/C++: First interface draft
Message-ID: <14406.59198.949047.2487@localhost.localdomain>

I have just drafted this interface, and haven't even run it through a
C++ compiler yet. For clarity, I've omitted constructors and
destructors, as well as most of what will be inline implementations.

Notes: I haven't looked at other C++ efforts yet, but I will try to do
so now. Eventually, this should be in a special C++ namespace.

sax.h
====================8<====================8<====================
#ifndef __SAX_HXX
#define __SAX_HXX

#include <stddef.h>             // for size_t
#include <istream>

class Locator;                  // forward declaration; the Locator
                                // interface itself is not in this draft

class InputSource
{
public:
  virtual const char * getPublicId (void) const;
  virtual void setPublicId (const char * publicId);
  virtual const char * getSystemId (void) const;
  virtual void setSystemId (const char * systemId);
  virtual std::istream * getInputStream (void) const;
  virtual void setInputStream (std::istream * in);
protected:
  const char * _publicId;
  const char * _systemId;
  std::istream * _in;
};

class AttributeList
{
public:
  virtual size_t getLength (void) const = 0;
  virtual const char * getName (size_t pos) const = 0;
  virtual const char * getType (size_t pos) const = 0;
  virtual const char * getValue (size_t pos) const = 0;
  virtual const char * getType (const char * name) const;
  virtual const char * getValue (const char * name) const;
};

class SAXException
{
public:
  virtual const char * getMessage (void) const;
protected:
  const char * _message;
};

class SAXParseException : public SAXException
{
public:
  virtual const char * getPublicId (void) const;
  virtual const char * getSystemId (void) const;
  virtual const size_t getLineNumber (void) const;
  virtual const size_t getColumnNumber (void) const;
protected:
  const char * _publicId;
  const char * _systemId;
  const size_t _lineNumber;
  const size_t _columnNumber;
};

class EntityResolver
{
public:
  virtual const InputSource * resolveEntity (const char * publicId,
                                             const char * systemId);
};

class DTDHandler
{
public:
  virtual void notationDecl (const char * name,
                             const char * publicId,
                             const char * systemId) {}
  virtual void unparsedEntityDecl (const char * name,
                                   const char * publicId,
                                   const char * systemId,
                                   const char * notationName) {}
};

class DocumentHandler
{
public:
  virtual void setDocumentLocator (const Locator &locator);
  virtual void startDocument (void) {}
  virtual void endDocument (void) {}
  virtual void startElement (const char * name,
                             const AttributeList &atts) {}
  virtual void endElement (const char * name) {}
  virtual void characters (const char * ch, size_t length) {}
  virtual void ignorableWhitespace (const char * ch, size_t length) {}
  virtual void processingInstruction (const char * target,
                                      const char * data) {}
protected:
  Locator * _locator;
};

class ErrorHandler
{
public:
  virtual void warning (const SAXParseException &e) {}
  virtual void error (const SAXParseException &e) {}
  virtual void fatalError (const SAXParseException &e) {}
};

class Parser
{
public:
  // setLocale??
  virtual void setEntityResolver (EntityResolver &resolver);
  virtual void setDTDHandler (DTDHandler &handler);
  virtual void setDocumentHandler (DocumentHandler &handler);
  virtual void setErrorHandler (ErrorHandler &handler);
  virtual void parse (const char * systemId);
  virtual void parse (const InputSource &input) = 0;
protected:
  EntityResolver * _resolver;
  DTDHandler * _dtdHandler;
  DocumentHandler * _documentHandler;
  ErrorHandler * _errorHandler;
};

#endif
====================8<====================8<====================

Comments?
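
For readers who want to see how an application might sit on top of the draft, here is a rough sketch. It is not from the original post and makes several assumptions: AttributeList and DocumentHandler are trimmed copies of the draft's classes (with virtual destructors added), while EmptyAttributeList, ElementCounter and replayEvents are invented names. No real SAX driver exists here, so replayEvents fires the events for one tiny document by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Trimmed copies of two interfaces from the draft, so this compiles alone.
class AttributeList
{
public:
  virtual ~AttributeList () {}
  virtual std::size_t getLength (void) const = 0;
  virtual const char * getName (std::size_t pos) const = 0;
  virtual const char * getValue (std::size_t pos) const = 0;
};

class DocumentHandler
{
public:
  virtual ~DocumentHandler () {}
  virtual void startDocument (void) {}
  virtual void endDocument (void) {}
  virtual void startElement (const char *, const AttributeList &) {}
  virtual void endElement (const char *) {}
  virtual void characters (const char *, std::size_t) {}
};

// Hypothetical helper: an attribute list that holds no attributes.
class EmptyAttributeList : public AttributeList
{
public:
  virtual std::size_t getLength (void) const { return 0; }
  virtual const char * getName (std::size_t) const { return 0; }
  virtual const char * getValue (std::size_t) const { return 0; }
};

// Hypothetical application handler: counts elements and collects text.
class ElementCounter : public DocumentHandler
{
public:
  ElementCounter () : elements(0) {}
  virtual void startElement (const char *, const AttributeList &)
  {
    ++elements;
  }
  // The driver keeps ownership of its buffer, so copy what must outlive
  // the callback.
  virtual void characters (const char * ch, std::size_t length)
  {
    assert(ch != 0);
    text.append(ch, length);
  }
  std::size_t elements;
  std::string text;
};

// Stand-in for a SAX driver: replays the events for <doc><p>Hello</p></doc>.
void replayEvents (DocumentHandler &handler)
{
  static const char buffer[] = "<doc><p>Hello</p></doc>";
  EmptyAttributeList atts;
  handler.startDocument();
  handler.startElement("doc", atts);
  handler.startElement("p", atts);
  handler.characters(buffer + 8, 5);  // pointer into the buffer, no 'start' index
  handler.endElement("p");
  handler.endElement("doc");
  handler.endDocument();
}
```

Because the callbacks have empty default bodies, the application overrides only the two it cares about, which is the role HandlerBase plays in the Java version.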
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From LWatanab at JetForm.com Thu Dec 2 21:56:03 1999
From: LWatanab at JetForm.com (Larry Watanabe)
Date: Mon Jun 7 17:18:15 2004
Subject: SAX/C++: First interface draft
Message-ID: <111CF63B7D2ED211830000805F65A2FF0180496C@OTTMAIL2>

I would suggest making the String class external, with a well-defined
minimal interface. Then the user could implement the interface in their own
string classes, typedef the String (or DOMString or whatever) as their
class, and compile it together. This would allow an application to use its
own String implementation, which would save the trouble of a lot of
conversions.

> -----Original Message-----
> From: David Megginson [SMTP:david@megginson.com]
> Sent: Thursday, December 02, 1999 4:40 PM
> To: XMLDev list
> Subject: SAX/C++: First interface draft
>
> I have just drafted this interface, and haven't even run it through a
> C++ compiler yet. For clarity, I've omitted constructors and
> destructors, as well as most of what will be inline implementations.
>
> Notes: I haven't looked at other C++ efforts yet, but I will try to do
> so now. Eventually, this should be in a special C++ namespace.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wunder at infoseek.com Thu Dec 2 21:59:40 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:18:15 2004 Subject: A processing instruction for robots Message-ID: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> HTML has a robots meta tag. XML has no standard way to declare the same information. Here is a proposal, with an implementation: http://homepages.go.com/~wunder0/robots-pi.html Comments are welcome. This is also posted to the robots list. wunder -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/ http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Dec 2 22:11:18 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:18:15 2004 Subject: Content or Metadata? Message-ID: <3.0.32.19991202140921.01490100@pop.intergate.ca> At 11:41 AM 12/2/99 PST, Mike Spreitzer wrote: >What about the list of authors of a scholarly paper? Isn't that metadata for which order >matters? Yep, in fact that's the one use-case that kept coming up during the early stage of RDF design. Here's another one for free: content models. 
But the notion that there is some ordering on a document's author, title, and date-of-publication is surprising and unnatural. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Dec 2 22:11:15 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:18:15 2004 Subject: Request for Discussion: SAX 1.0 in C++ Message-ID: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> At 04:27 PM 12/2/99 -0500, David Megginson wrote: >I'll be posting three follow-up messages on SAX/C++ to stimulate >discussion: Good idea, one question. Any way to do C at the same time? -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DuCharmR at moodys.com Thu Dec 2 22:12:29 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:18:15 2004 Subject: Any XML Schemas validators out yet ? Message-ID: <01BA10F0CD20D3119B2400805FD40F9F2781B7@MDYNYCMSX1> >So Im wondering if there are any tools that can validate XML Schemas >themselfs >and maybe also validate XML documents using XML Schemas ? At least for W3C Schemas: Being XML documents themselves, you can take the DTD in Appendix B of the schema proposal and validate your schema against that using any validating parser. 
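A rough sketch of that DTD check in Java, assuming a validating parser reachable through JAXP. The small inline DTD is only a stand-in for the Appendix B DTD, and DtdCheck is a hypothetical name, not part of any parser distribution:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper: validates a document against the DTD named in its DOCTYPE.
public class DtdCheck {

    // Returns the validity errors reported by the parser (empty list = valid).
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) { problems.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in DTD (an assumption), declared in the internal subset.
        String dtd = "<!DOCTYPE schema ["
                   + "<!ELEMENT schema (element*)>"
                   + "<!ELEMENT element EMPTY>"
                   + "<!ATTLIST element name CDATA #REQUIRED>"
                   + "]>";
        System.out.println(validate(dtd + "<schema><element name=\"title\"/></schema>"));
        System.out.println(validate(dtd + "<schema><element/></schema>"));
    }
}
```

For a real schema document, the DOCTYPE would point at the schema DTD by system identifier rather than carrying an internal subset.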
To validate XML documents against these schemas, the only thing I know of out there is the Xerces parser at xml.apache.org. Bob DuCharme www.snee.com/bob <bob@ snee.com> see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From robin at isogen.com Thu Dec 2 22:33:29 1999 From: robin at isogen.com (Robin Cover) Date: Mon Jun 7 17:18:15 2004 Subject: Content or Metadata? In-Reply-To: <3.0.32.19991202140921.01490100@pop.intergate.ca> Message-ID: <Pine.GSO.3.96.991202162508.20104E-100000@grind> > the notion that there is some ordering on a document's author, title, > and date-of-publication is surprising and unnatural. -T. Depends... He said "list of authors." As in, multiple authors, where the principal author is listed first (regardless of the spelling of surname and Western-style collation sequence), the "next-most- principal-author" is listed second in the order(-ed, -able) author list, reflecting the contract... blah blah. Of course, such notions reflect perspective, which may or may not be implicit/explicit in the style rules and underlying assumptions of the house. Hence: "views, perspectives, projections, purposes." No one of them is fixed. The poem escapes the intent of the author, and becomes the property of the collective consciousness of the community. -rcc ----------------------------------------------------------------- On Thu, 2 Dec 1999, Tim Bray wrote: > At 11:41 AM 12/2/99 PST, Mike Spreitzer wrote: > >What about the list of authors of a scholarly paper? Isn't that metadata for which order > >matters? 
> > Yep, in fact that's the one use-case that kept coming up during the early > stage of RDF design. Here's another one for free: content models. But > the notion that there is some ordering on a document's author, title, > and date-of-publication is surprising and unnatural. -T. > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To unsubscribe, mailto:majordomo@ic.ac.uk the following message; > unsubscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jwtodd at pacbell.net Thu Dec 2 23:01:30 1999 From: jwtodd at pacbell.net (James Todd) Date: Mon Jun 7 17:18:15 2004 Subject: i'd like to merge two docs ... References: <3845EAFA.341C3122@pacbell.net> <005001bf3c6e$ef8d57b0$5a672382@us.oracle.com> <38461554.61917EBD@pacbell.net> <000d01bf3c93$6febd9d0$3b652382@us.oracle.com> <3846F703.DF8CED3@pacbell.net> Message-ID: <3846FC10.D1B6E795@pacbell.net> quick recap: ahhh ... i just figured it out. with ProjectX there is a com.sun.xml.tree.DocumentEx.changeNodeOwner(Node) that does the trick. so, if i exchange the removeChild() call with a changeNodeOwner() call i can quite readily rehost a doc fragment. any ideas as how to do this, if possible, with a standard dom api? 
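On the standard-API question: DOM Level 2 Core defines Document.importNode(Node, boolean), which copies a node from another document into the calling one. Unlike changeNodeOwner it copies rather than moves, so the source tree is left intact. A minimal sketch, assuming a DOM Level 2 parser reachable through JAXP; the class MergeDocs and its method names are made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of ProjectX or any other parser.
public class MergeDocs {

    // Parse a string into a DOM Document with whatever JAXP parser is configured.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Copy the root element of `fragment` under the root element of `host`.
    // importNode returns a deep copy owned by `host`, so appendChild no longer
    // fails with a wrong-document error.
    static Element appendFragment(Document host, Document fragment) {
        Node copy = host.importNode(fragment.getDocumentElement(), true);
        return (Element) host.getDocumentElement().appendChild(copy);
    }

    public static void main(String[] args) throws Exception {
        Document host = parse("<host/>");
        Document fragment = parse("<item id=\"1\"><name>x</name></item>");

        // Amend the fragment first, as in the scenario described earlier.
        fragment.getDocumentElement().setAttribute("status", "new");

        Element added = appendFragment(host, fragment);
        System.out.println(added.getOwnerDocument() == host);
        System.out.println(added.getAttribute("status"));
    }
}
```

The deep copy costs extra memory for large fragments; that is the trade-off against an in-place owner change.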
thx, - james James Todd wrote: > > | > > | Steve Muench wrote: > > | > > | > Assuming you have XML DOM Documents "one" and "two" > > | > and that "oneElement" is the element in doc "one" > > | > to which you'd like to append the entire content > > | > of "two"... > > | > > > | > You should be able to do: > > | > > > | > Element twoDocElt = two.getDocumentElement(); > > | > two.removeChild(twoDocElt); > > | > oneElement.appendChild(twoDocElt); > > | > > > | > _________________________________________________________ > > | > Steve Muench, Consulting Product Manager & XML Evangelist > > | > Business Components for Java Development Team > > | > http://technet.oracle.com/tech/java > > | > http://technet.oracle.com/tech/xml > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From robin at isogen.com Thu Dec 2 23:02:14 1999 From: robin at isogen.com (Robin Cover) Date: Mon Jun 7 17:18:16 2004 Subject: Content or Metadata? In-Reply-To: <3.0.32.19991202140921.01490100@pop.intergate.ca> Message-ID: <Pine.GSO.3.96.991202164214.20104G-100000@grind> Postscriptum: In some OO theory, I think it's believed favorable to create distinct attributes for things that are ordered (since [Boyce-] Codd believed that attributes are intrinsically unordered): this, for Mike Spreitzer's example of "list of authors": firstAuthor, secondAuthor, thirdAuthor, etc. Well, suppose there are in fact three groups of authors, with different principles of sub-ordering, which are masked in the typical presentation... 
it may then be more economical (80:20 rule, which I detest) to say that we allow an attribute value which is an orderable list of (sub-)tokens.

I have seen -- indeed, documented -- some works which enumerated over 30 authors for the piece. Volumes/analytical works from the French academies. (And why not? Only the aesthetics of print books and the supposed cost of printer's ink have led to style rules that say "truncate with 'etc' after N authors...".)

In such cases: I suspect the order (-edness, -ability) has nothing to do with whether the factoids are (meta-)data or not. Nothing is simple, despite what could appear to be incontrovertible facts.

-r

--------------------------------------------------------------

On Thu, 2 Dec 1999, Tim Bray wrote:

> At 11:41 AM 12/2/99 PST, Mike Spreitzer wrote:
> >What about the list of authors of a scholarly paper? Isn't that metadata for which order
> >matters?
>
> Yep, in fact that's the one use-case that kept coming up during the early
> stage of RDF design. Here's another one for free: content models. But
> the notion that there is some ordering on a document's author, title,
> and date-of-publication is surprising and unnatural. -T.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Dec 2 23:50:17 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:16 2004 Subject: Request for Discussion: SAX 1.0 in C++ In-Reply-To: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> Message-ID: <14407.1389.659881.147338@localhost.localdomain> Tim Bray writes: > At 04:27 PM 12/2/99 -0500, David Megginson wrote: > >I'll be posting three follow-up messages on SAX/C++ to stimulate > >discussion: > > Good idea, one question. Any way to do C at the same time? -Tim Sure -- is there a strong need for a common C interface, though? We already have Expat's C interface, and I don't know of anyone else in that space yet. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From anderst at toolsmiths.se Fri Dec 3 00:31:29 1999 From: anderst at toolsmiths.se (Anders W. Tell) Date: Mon Jun 7 17:18:16 2004 Subject: Any XML Schemas validators out yet ? 
References: <01BA10F0CD20D3119B2400805FD40F9F2781B7@MDYNYCMSX1> Message-ID: <3846F6CB.86276410@toolsmiths.se> "DuCharme, Robert" wrote: > >So Im wondering if there are any tools that can validate XML Schemas > >themselfs > >and maybe also validate XML documents using XML Schemas ? > > At least for W3C Schemas: > > Being XML documents themselves, you can take the DTD in Appendix B of the > schema proposal and validate your schema against that using any validating > parser. I tried this but MS Explorer 5.0.2919 reports this error in the XML Schema proposal: Attribute 'xmlns:' must be a #FIXED attribute. Line 17, Position 18 model (open|refinable|closed) 'closed' > -----------------^ Maybe Im using the wrong Schema , "http://www.w3.org/TR/1999/WD-xmlschema-1-19991105/structures.dtd" ? > > To validate XML documents against these schemas, the only thing I know of > out there is the Xerces parser at xml.apache.org. Thanks, Ill have a look. Best /Anders -- /_/_/_/_/_/_/_/_/_/_/_/_/_/_/ / Financial Toolsmiths AB / / Anders W. Tell / /_/_/_/_/_/_/_/_/_/_/_/_/_/_/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Fri Dec 3 01:19:56 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:18:16 2004 Subject: Content or Metadata? References: <3.0.32.19991202111153.0150e870@pop.intergate.ca> Message-ID: <011f01bf3d15$bbac9300$eb020a0a@bowstreet.com> => 2. Not all data is metadata. Examples: this email message; Chopin's > Nocturnes; Tuxedo.gif. While, I'm not arguing against your point, it is interesting to note that each of these examples have headers which could be thought of as metadata. 
James Tauber

"Metadata is data you forgot to put in in the first place" - Ted Nelson

From ricko at allette.com.au Fri Dec 3 02:51:38 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:16 2004
Subject: Content or Metadata?
Message-ID: <006e01bf3d3c$d46a6da0$5ef96d8c@NT.JELLIFFE.COM.AU>

From: Robin Cover <robin@isogen.com>

>Of course, such notions reflect perspective, which may or may
>not be implicit/explicit in the style rules and underlying
>assumptions of the house.

For all its sins, RDF showed up a major area that is currently missing in Schemas: the need to make the generic relationships between elements explicit. In particular RDF used "bag", "seq" and "alt". But there are many more such relationships:

 * is one element an annotation of another?
 * is that annotation superior (e.g. a title, a summary) or subsidiary (e.g., an explanation, a digression, an alternative, a role)?
 * does one element/attribute have any meaning without some other element/attribute (e.g., does a particular number also require a units element/attribute/default)?
 * which roles do elements and attributes play in the particular taxonomic/ontological methodology of their creator (e.g., what is data, what is metadata)?

Some of these things, RDF Schemas could make possible, and XLink could have made possible.

I think RDF is a continual reminder that GIs and containment may make relationships obvious to humans, but in the absence of other conventions, they may hide these relationships from the computer.

B.t.w, the sins of RDF were all commented on at the time:

 * the spec is clearly two or three different documents cobbled together with little cohesion between them;
 * having a syntax like the _n attribute names which made validation impossible except by special-purpose validators;
 * not having the discipline of a DTD fragment, so that some elements mentioned are never explicitly given in the EBNF productions;
 * RDF is a framework but it should have been an architecture which is framework-neutral. The test of whether it is useful as a framework is whether generic tools are useful for RDF data; if, in fact, it is being mainly used for specific applications, then RDF markup would be better formulated as conventions that sit on top of DTDs/schemas that allow as natural modeling of the data as possible.

I think RDF should have concentrated on how to fit on top of regular markup, including markup of inline elements interspersed through paragraphs. Atomic data is just the simplest case of that.

Rick Jelliffe

From donpark at docuverse.com Fri Dec 3 03:08:11 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:16 2004
Subject: Content or Metadata?
In-Reply-To: <NCBBJANJAENGCPMNOIOCKEFHFLAA.spreitze@parc.xerox.com>
Message-ID: <001201bf3d3b$9c469760$099918d1@docuverse1>

'meta-' just means 'beyond' or 'transcending' and requires a context. Engineers typically apply the 'instance' role to the context and 'definition' role to the 'meta-whatever' because 'Category' is a powerful meme. There are other memes that retain the 'meta' relationship between roles.

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From tpassin at idsonline.com Fri Dec 3 03:25:56 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun 7 17:18:16 2004
Subject: Some questions
References: <3.0.5.32.19991202092403.00cc56f0@corp.infoseek.com>
Message-ID: <003901bf3d3e$b4fe61e0$a82a08d1@tomshp>

Walter Underwood wrote:

>...
> Here are stages of having the info (content) that you want,
> ordered in increasing amounts of wasted time.
>
> 1. I have the information.
> 2. I know where the information is.
> 3. I know it exists, but I don't know where it is.
> 4. I don't know if it exists.
>
> Only the last two need some sort of metacontent or finding aid.
>

I'd add one more:

5. I'm not sure exactly what I'm looking for, but I'll probably know it when I find it.

This could be analogous to browsing in a store looking for a gift, which you vaguely thought might be a toaster, and discovering a bread machine. With the size and complexity of the web, making (5) work better would be a great boon.

Tom Passin

> Organizing and indexing content is a time-saver, and sometimes that
> is essential. Sometimes, the metacontent has the whole answer (which
> companies sell rhinestone tiaras), but most people really want to
> buy the tiara.
>

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Rajiv.Mordani at eng.sun.com Fri Dec 3 03:25:25 1999 From: Rajiv.Mordani at eng.sun.com (Rajiv Mordani) Date: Mon Jun 7 17:18:16 2004 Subject: q: i'd like to merge two docs ... In-Reply-To: <3845EAFA.341C3122@pacbell.net> Message-ID: <Pine.SOL.3.96.991202192213.6718E-100000@milhouse> You have the changeNodeOwner API in XmlDocument to actually do the necessary so you don't get the error shown below. So before appending use the changeNodeOwner and then you are all set. - Rajiv XML is to the 90s what ASCII was to the 70s On Wed, 1 Dec 1999, James Todd wrote: > > hi - > > i could use a pointer or two, a recipe if you will, on how best to > "modify and merge" two xml docs. the scenario: > > an inbound xml "fragment", a complete xml doc in it's own > right, is amended (eg. one new attribute is added) > > the results of which is appended, as a child node, to a > "hosting" xml tree > > i've got most of this working using the ProjectX [? Mr. Brownell ?] > parser yet it fails during the appendChild() stating that the child > node > > "That node doesn't belong in this document" > > due to the fact, i believe, that it has a distinct OwnerDocument. > > my methodology to date is to create dom's for both the inbound > "fragment" and the destination xml docs afterwhich i'd like to > modify > the fragment (hence going the dom route) and finally add the results > > to the destination doc via appendChild(). > > i had hoped to bypass walking the tree in order to create an > "document ownerless" copy with which to work with. is there > a better/preferred means by which to accomplish this task? 
> > any/all comments and suggestions welcomed. > > thx much, > > - james > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To unsubscribe, mailto:majordomo@ic.ac.uk the following message; > unsubscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tpassin at idsonline.com Fri Dec 3 03:48:36 1999 From: tpassin at idsonline.com (Thomas B. Passin) Date: Mon Jun 7 17:18:16 2004 Subject: Request for Discussion: SAX 1.0 in C++ References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain> Message-ID: <008301bf3d41$de0c9b80$a82a08d1@tomshp> David Megginson wrote > Tim Bray writes: > > > At 04:27 PM 12/2/99 -0500, David Megginson wrote: > > >I'll be posting three follow-up messages on SAX/C++ to stimulate > > >discussion: > > > > Good idea, one question. Any way to do C at the same time? -Tim > > Sure -- is there a strong need for a common C interface, though? We > already have Expat's C interface, and I don't know of anyone else in > that space yet. > But C is available on most _any_ platform - often for free. So almost anyone could compile in C but not necessarily in C++. Isn't rxp done in C? Tom Passin xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From vlashua at RSGsystems.com Fri Dec 3 04:45:49 1999 From: vlashua at RSGsystems.com (Vane Lashua) Date: Mon Jun 7 17:18:16 2004 Subject: RDF, again Message-ID: <A51F7543E295D2118D6600A024CDB2F71B9D72@MAILPROD> Asking out of ignorance: Is there thought being devoted to a universally accessible catalog of id's, names (lists), classes, datatypes -- maybe even using the MARC system and the LC index -- existing as a universal repository of components describing data structures? It would be a "soft" resource, like a library catalog, but with "hard" data points: the LC system is not a standard; it is a registry and a reference maintained by an authority. A publisher may suggest the cataloguing classification of an individual object, but any given library may catalog its instance-object differently. Meanwhile, because publishers and libraries are interested in keeping in touch with information, a library patron from virtually anywhere can find most objects in a given class and select from them. The difficulty with the definitions below, for instance, is that "name" is a collection of characters whose context is not clear without a reference. Namespaces, it seems to me, are absolutely necessary, but they tend to encourage diversity where convergence would be a more enlightened tendency. 
Vane -----Original Message----- From: Mark Birbeck [mailto:Mark.Birbeck@iedigital.net] Sent: Tuesday, November 23, 1999 6:46 PM To: 'Paul Prescod'; 'xml-dev@ic.ac.uk' Subject: RE: RDF, again Paul Prescod wrote: > The thing I find confusing about the RDF syntax is that the > element type > name can be either an RDF type name or an RDF property. XML makes no > distinction and that's why I think that it is difficult to use for > object oriented interchange. I got the impression from the spec that this is intentional, so that a straightforward XML document - that might not contain *any* RDF - can still be interpreted as a set of RDF statements. In other words, different XML layouts (elements for attributes, e.g.) of the same data would result in the same RDF statements. The XML would still need to be well thought out though. For example: <person name="Paul"> <food>trifle</food> </person> might mean trifle is your favourite food, the main food you're allergic to, or your pudding preference for the office Xmas party. 
All of these are acceptable in XML, but the RDF interpretation of this may well be incorrect - or at least not as rich in meaning as we would like:

    Person has a name "Paul" and a food "trifle"

So, to make the first statement - trifle is Paul's favourite food - we could use the following RDF:

    <rdf:RDF>
      <rdf:Description ID="1">
        <rdf:Type rdf:resource="person" />
        <x:name>Paul</x:name>
      </rdf:Description>
      <rdf:Description ID="2">
        <rdf:Type rdf:resource="food" />
        <x:name>trifle</x:name>
      </rdf:Description>
      <rdf:Description about="#1">
        <x:favourite rdf:resource="#2" />
      </rdf:Description>
    </rdf:RDF>

Using the abbreviated forms allowed to us, this is the same 'RDF':

    <x:person x:name="Paul">
      <x:favourite>
        <x:food>trifle</x:food>
      </x:favourite>
    </x:person>

or:

    <x:person>
      <x:name>Paul</x:name>
      <x:favourite>
        <x:food>trifle</x:food>
      </x:favourite>
    </x:person>

or:

    <x:person>
      <x:name>Paul</x:name>
      <x:favourite x:food="trifle" />
    </x:person>

So - to turn this round - any of the previous three XML documents can be interpreted as the same set of RDF statements - a person with the name "Paul" has a favourite food, and that food is called trifle - even without any explicit RDF present.

As to whether it is any good for object interchange, I think it is. Of course, if the relationships between elements contained within other elements can be inferred then straight XML is fine. But as soon as you need something more complex then RDF is very good (not to mention when the objects being referred to are outside of the XML document you're in, and so you can't use ID/IDREF.)

Best regards,

Mark

From vlashua at RSGsystems.com Fri Dec 3 04:46:33 1999
From: vlashua at RSGsystems.com (Vane Lashua)
Date: Mon Jun 7 17:18:16 2004
Subject: INTERFACE {was SGML, XML and SML, ugh!}
Message-ID: <A51F7543E295D2118D6600A024CDB2F71B9D71@MAILPROD>

There is no rationale for interface as a topic in an XML discussion group, but while it's passing:

The newest of the eyeglasses interfaces, with the small addition of thumb-ball, earphones, and mic, is getting near to "better-than-TRS-80". The most significant impediment to a good interface is the qwerty keyboard and our collective investment in having learned to use it (combined with the need for relative silence while we're using it). Around the same era that the mouse emerged, there was on the market a single-handed(?) encoding device whose speed was about the same as qwerty. I think I saw it in Byte. Anybody seen one lately?
Vane -----Original Message----- From: Tyler Baker [mailto:tyler@infinet.com] Sent: Monday, November 22, 1999 11:13 PM To: rev-bob@gotc.com Cc: xml-dev@ic.ac.uk Subject: Re: SGML, XML and SML rev-bob@gotc.com wrote: > > ** Original Sender: David Megginson <david@megginson.com> > > <snip!> > really seems to be looking at - text-to-speech conversion for small devices. That is, > instead of working on making the tiny screens a little bit bigger or a little bit clearer, <snip!> no one has created a display device or user interface that is significantly better than the old TRS-80. <snip!> Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Fri Dec 3 04:49:44 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:18:16 2004 Subject: SAX/C++: UTF-8 v UTF-16 References: <14406.58740.871829.541816@localhost.localdomain> Message-ID: <38472FE3.D3BB22BC@jclark.com> David Megginson wrote: > 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility > with most existing C++ code. I would say there was at least as much C++ code using UTF-16 as using UTF-8. On Windows at least, UTF-16 is much more common. The DOM mandates UTF-16, so if SAX mandated UTF-8 there would be an unfortunate mismatch. This is a tough one, because there's a lot more diversity in the C++ world. My preference would be not to mandate either UTF-8 or UTF-16 exclusively. There are lots of apps using UTF-8 and there are lots of apps using UTF-16; if you exclude either, then a lot of apps will take a mojor performance/convenience hit. 
Expat allows a choice at compile-time between UTF-8 and UTF-16, and there are big projects using both (eg Perl uses UTF-8 and Mozilla uses UTF-16). There are a couple of possible solutions: 1. A lo-tech solution. Provide a SAXChar typedef, and define everything in terms of SAXChar. SAXChar gets typedefed to either char or unsigned short depending on whether SAX_UNICODE is defined or not. It's up to implementations to decide whether to support both or just one, and up to clients to decide whether to work with both or to require one. A variation on this is to allow both UTF-8 and UTF-16 variants to exist in a single library. To do this, you can do something along the lines of class AttributeList16 { public: virtual const unsigned short *getName(int pos) = 0; }; class AttributeList8 { public: virtual const char *getName(int pos) = 0; }; #ifdef SAX_UNICODE typedef AttributeList16 AttributeList; #else typedef AttributeList8 AttributeList; #endif 2. A hi-tech solution. Do what the Standard C++ library does and make the interface a template in the character type. This is the cleanest solution, but lots of C++ projects eschew templates on portability grounds. If you feel that one needs to be mandated, I would pick UTF-16. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Fri Dec 3 04:49:46 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:18:16 2004 Subject: SAX/C++: C++-specific design principles References: <14406.58740.871829.541816@localhost.localdomain> Message-ID: <384741C0.50ABA536@jclark.com> David Megginson wrote: > 2. 
Pointers never change ownership -- if a Parser (for example) wants > to own an InputSource, it needs to make its own copy. The app has > to free everything that it allocates, and the SAX driver, likewise. That's problematic for EntityResolve::resolveEntity; that requires that ownership of an InputSource be transferred from to the caller from the callee. This could be avoided by doing: virtual const InputSource * resolveEntity(const char *publicId, const char *systemId); instead of: virtual void resolveEntity(const char *publicId, const char *systemId, InputSource &inputSource); James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Fri Dec 3 04:49:49 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:18:16 2004 Subject: SAX/C++: First interface draft References: <14406.59198.949047.2487@localhost.localdomain> Message-ID: <38474BAF.AF4CFF2D@jclark.com> In Java, everything in SAX is an interface. The way to do an interface in C++ is to use a class where all members (except possibly a virtual destructor) are abstract (ie defined as = 0). This provides the maximum flexibility and insulation. The only good reason not to do an interface is if it were necessary and possible to inline some method calls for performance. I think this this applies here: certainly there's no performance need to inline method calls to something like InputSource. One interesting issue is whether to provide a virtual destructor. I think the safest solution is not to provide a virtual destructor but instead to declare but not define a private operator delete. 
This makes it a compile time error to do:

  DTDHandler *p;
  // ...
  delete p;

Given the policy on object ownership there's never any need to do that: only the creator of an object can delete it and the creator always has a pointer to the concrete subclass which will provide a way to release the object. It also has the nice property that there is no .cpp file associated with the SAX interface and no SAX library that has to be compiled or linked with. It would be a completely pure interface.

Here's another draft, with this change and a few other minor changes:

- use int not size_t (Lakos has a whole section on why unsigned in interfaces is usually a bad idea)
- use a SAXString typedef for zero-terminated arrays
- don't use (void) for empty argument lists
- use iosfwd not istream as the header file
- use characters not SAXCharacters as the method name on DocumentHandler
- use a const char * arg for Parser::setLocale; I think that's the best you can do portably; Standard C++ allows locales to be identified by name
- add Locator
- change resolveEntity to avoid transfer of ownership as suggested in my previous message
- solve the UTF-8/UTF-16 problem by having two namespaces: a SAX_UTF8 and a SAX_UTF16 namespace (since you're using std::istream, you are assuming compiler support for namespaces); this will work nicely with namespace aliases (eg namespace SAX = SAX_UTF8).

Discussion points:

- Would it be better to typedef SAXString to the Standard C++ string class (ie std::basic_string<SAXChar>)?

James

Here's SAX.h:

#ifndef __SAX_HXX
#define __SAX_HXX

// Forward declarations of std::istream
#include <iosfwd>

namespace SAX_UTF8 {
  typedef char SAXChar;
  // A 0 terminated array of SAXChars.
  typedef const char *SAXString;
#include "SAXDecl.h"
}

namespace SAX_UTF16 {
  typedef unsigned short SAXChar;
  // A 0 terminated array of SAXChars.
  typedef const unsigned short *SAXString;
#include "SAXDecl.h"
}

#endif

And here's SAXDecl.h:

class InputSource {
public:
  virtual SAXString getPublicId () const = 0;
  virtual void setPublicId (SAXString publicId) = 0;
  virtual SAXString getSystemId () const = 0;
  virtual void setSystemId (SAXString systemId) = 0;
  virtual std::istream * getInputStream () const = 0;
  virtual void setInputStream (std::istream * in) = 0;
private:
  void operator delete (void *);
};

class AttributeList {
public:
  virtual int getLength () const = 0;
  virtual SAXString getName (int pos) const = 0;
  virtual SAXString getType (int pos) const = 0;
  virtual SAXString getValue (int pos) const = 0;
  virtual SAXString getType (SAXString name) const = 0;
  virtual SAXString getValue (SAXString name) const = 0;
private:
  void operator delete (void *);
};

class SAXException {
public:
  virtual SAXString getMessage () const = 0;
private:
  void operator delete (void *);
};

class SAXParseException : public SAXException {
public:
  virtual SAXString getPublicId () const = 0;
  virtual SAXString getSystemId () const = 0;
  virtual int getLineNumber () const = 0;
  virtual int getColumnNumber () const = 0;
private:
  void operator delete (void *);
};

class EntityResolver {
public:
  virtual void resolveEntity (SAXString publicId, SAXString systemId, InputSource &) = 0;
private:
  void operator delete (void *);
};

class DTDHandler {
public:
  virtual void notationDecl (SAXString name, SAXString publicId, SAXString systemId) = 0;
  virtual void unparsedEntityDecl (SAXString name, SAXString publicId, SAXString systemId, SAXString notationName) = 0;
private:
  void operator delete (void *);
};

class Locator {
public:
  virtual SAXString getPublicId () const = 0;
  virtual SAXString getSystemId () const = 0;
  virtual int getLineNumber() const = 0;
  virtual int getColumnNumber() const = 0;
private:
  void operator delete (void *);
};

class DocumentHandler {
public:
  virtual void setDocumentLocator (const Locator &locator) = 0;
  virtual void startDocument ()
    = 0;
  virtual void endDocument () = 0;
  virtual void startElement (SAXString name, const AttributeList &atts) = 0;
  virtual void endElement (SAXString name) = 0;
  virtual void characters (const SAXChar * ch, int length) = 0;
  virtual void ignorableWhitespace (const SAXChar * ch, int length) = 0;
  virtual void processingInstruction (SAXString target, SAXString data) = 0;
private:
  void operator delete (void *);
};

class ErrorHandler {
public:
  virtual void warning (const SAXParseException &e) = 0;
  virtual void error (const SAXParseException &e) = 0;
  virtual void fatalError (const SAXParseException &e) = 0;
private:
  void operator delete (void *);
};

class Parser {
public:
  virtual void setLocale (const char *) = 0;
  virtual void setEntityResolver (EntityResolver &resolver) = 0;
  virtual void setDTDHandler (DTDHandler &handler) = 0;
  virtual void setDocumentHandler (DocumentHandler &handler) = 0;
  virtual void setErrorHandler (ErrorHandler &handler) = 0;
  virtual void parse (SAXString systemId) = 0;
  virtual void parse (const InputSource &input) = 0;
private:
  void operator delete (void *);
};

This also extends easily to doing a templated version:

template<class SAXChar, class SAXString>
class BASIC_SAX {
#include "SAXDecl.h"
};

James

From paul at qub.com Fri Dec 3 05:08:13 1999
From: paul at qub.com (Paul Tchistopolskii)
Date: Mon Jun 7 17:18:17 2004
Subject: XML. SAX. Streaming processing with Groves.
Message-ID: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii>

The advantage of SAX ( and attributes in XML) is that we have attributes in place when startElement is invoked.
We know some of the properties of this element right when the element begins.

At that point, could we get information about the other properties this element has ( the child elements ) ?

No. We can not. If we decide to read the entire element before invoking startElement(), we'll have to read the entire ( root ) document to know. DOM does it.

How can we work around this limitation?

Right now I'm writing yet another wrapper around SAX, accumulating the element contents in a 'microDOM' and then making a decision in endElement() about what to do with the element itself, depending on the values of its children.

The 'correct' approach to avoid such a hell is to use DOM - but I can't. Documents could be big.

I also can not require the client to turn all their elements into attributes.

It's actually very interesting. If one wants his XML documents to be easy to process without DOM, he should use as many attributes as possible !

<aside>
Isn't it the end of the long discussion of Elements vs Attributes? Now when I see the question: "Should I use attributes or elements?" - I know the answer: "If you want it to be processed by current APIs without keeping the entire document in memory - use attributes everywhere you can."
</aside>

What if for *some* elements the parser would invoke startElement ( or endElement) *after* reading the entire element and placing all the children into ... Grove) ?

I think I could specify those 'return-as-grove' elements at runtime, or when I'm initializing the parser.

How easy is it to do with the current SAX design ?

Rgds.Paul.
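The 'microDOM' wrapper described above can be sketched roughly as follows. This is only an illustration under stated assumptions: Node and SubtreeBuffer are hypothetical names that appear in no SAX draft, std::string stands in for SAXString, and the input is assumed to be well-formed. Only elements whose names were registered up front are buffered; everything else streams past without being stored.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature tree node (the "microDOM"); not part of SAX.
struct Node {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::string text;                            // concatenated character data
    std::vector<std::unique_ptr<Node> > children;
};

// Buffers only the elements named in `buffered`; other elements are ignored.
class SubtreeBuffer {
public:
    typedef std::function<void(const Node&)> Callback;

    SubtreeBuffer(std::set<std::string> buffered, Callback onComplete)
        : buffered_(std::move(buffered)), onComplete_(std::move(onComplete)) {}

    void startElement(const std::string& name,
                      const std::map<std::string, std::string>& attrs) {
        // Outside a buffered subtree, only registered names start buffering.
        if (stack_.empty() && buffered_.count(name) == 0) return;
        std::unique_ptr<Node> node(new Node);
        node->name = name;
        node->attrs = attrs;
        Node* raw = node.get();
        if (stack_.empty()) root_ = std::move(node);
        else stack_.back()->children.push_back(std::move(node));
        stack_.push_back(raw);
    }

    void characters(const std::string& chunk) {
        if (!stack_.empty()) stack_.back()->text += chunk;
    }

    void endElement(const std::string& /*name*/) {
        if (stack_.empty()) return;              // element was not being buffered
        stack_.pop_back();
        if (stack_.empty()) {                    // buffered subtree is complete
            onComplete_(*root_);
            root_.reset();                       // release it before reading on
        }
    }

private:
    std::set<std::string> buffered_;
    Callback onComplete_;
    std::unique_ptr<Node> root_;
    std::vector<Node*> stack_;                   // open elements inside the subtree
};
```

The callback sees the whole element, children included, at the moment its end tag arrives, and memory use is bounded by the largest buffered element rather than by the whole document.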
From Mike.Champion at softwareag-usa.com Fri Dec 3 06:03:30 1999
From: Mike.Champion at softwareag-usa.com (Michael Champion)
Date: Mon Jun 7 17:18:17 2004
Subject: XML. SAX. Streaming processing with Groves.
References: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii>
Message-ID: <011101bf3d53$60df73a0$e5d88dce@WORKGROUP>

----- Original Message -----
From: Paul Tchistopolskii <paul@qub.com>
To: <xml-dev@ic.ac.uk>
Sent: Friday, December 03, 1999 12:05 AM
Subject: XML. SAX. Streaming processing with Groves.

> The 'correct' approach to avoid such a hell
> is to use DOM - but I can't. Documents could
> be big.
>

The DOM WG will be defining the requirements for Level 3 over the next 6 weeks or so. Standard APIs for loading, saving, parsing, and serializing XML text are "must have" items for Level 3, and this issue (that an application may want access to the elements of a document before it is fully parsed) has come up. For example, a programmer might choose not to continue parsing some huge document after the necessary data were found.

Concrete suggestions for actual APIs or pointers to APIs that allow this would be appreciated.
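One common way to stop an event-driven parse early is to throw from the handler and catch the exception around the parse call. The sketch below assumes that approach; StopParsing, FirstMatchHandler and parseUntilFound are hypothetical names, and the "parser" is only a loop replaying element names, not a real XML parser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct StopParsing {};   // hypothetical signal type thrown by the handler

class FirstMatchHandler {
public:
    explicit FirstMatchHandler(const std::string& wanted)
        : wanted_(wanted), seen_(0) {}

    void startElement(const std::string& name) {
        ++seen_;
        if (name == wanted_) throw StopParsing();  // necessary data found
    }

    int eventsSeen() const { return seen_; }

private:
    std::string wanted_;
    int seen_;
};

// Stand-in for a parser: replays element names as startElement events.
inline bool parseUntilFound(const std::vector<std::string>& events,
                            FirstMatchHandler& handler) {
    try {
        for (std::size_t i = 0; i < events.size(); ++i)
            handler.startElement(events[i]);
    } catch (const StopParsing&) {
        return true;     // stopped early; the remaining input is never read
    }
    return false;        // reached the end without finding the element
}
```

Whether a given parser can safely unwind through its own frames this way is something each implementation has to state; it is not guaranteed by the interfaces discussed in this thread.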
From donpark at docuverse.com Fri Dec 3 06:13:52 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:17 2004
Subject: XML. SAX. Streaming processing with Groves.
In-Reply-To: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii>
Message-ID: <001b01bf3d55$a451c500$099918d1@docuverse1>

>The advantage of SAX ( and attributes in XML)
>is that we have attributes in place when startElement
>is invoked. We know what are some properties of this
>element right when element begins.

While there are indeed practical benefits to having attributes readily available, event-based APIs like SAX unintentionally encourage novice XML programmers, who are not fully aware of attribute-vs-element issues, toward designing data formats that favor attributes over child elements.

>At that point could we get the information about
>the another properties this element has
>( the child elements ) ?

At the expense of requiring multithread support, most of this problem goes away if the parser runs in a separate thread so that by the time the attribute stored as a child element is requested, it is already available. If not, the requester's thread simply blocks until it is.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com
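The blocking behaviour described above might look something like this. It is a sketch only: it uses the std::promise and std::shared_future facilities of later C++ standards, PendingChild and readChildConcurrently are hypothetical names, and the "parser thread" merely delivers a string instead of parsing anything.

```cpp
#include <future>
#include <string>
#include <thread>

// Hypothetical handle handed to the application at startElement time:
// attributes are known immediately, a child element's text arrives later.
class PendingChild {
public:
    PendingChild() : value_(promise_.get_future().share()) {}

    // Called on the parser thread once the child element has been read.
    void deliver(const std::string& text) { promise_.set_value(text); }

    // Called on the application thread; blocks until deliver() has run.
    std::string get() const { return value_.get(); }

private:
    std::promise<std::string> promise_;        // must be declared before value_
    std::shared_future<std::string> value_;
};

// Stand-in for a parser running on its own thread (no real parsing here).
inline std::string readChildConcurrently(const std::string& childText) {
    PendingChild child;
    std::thread parser([&child, childText] { child.deliver(childText); });
    std::string result = child.get();          // waits only if the parser is behind
    parser.join();
    return result;
}
```

If the parser is already past the child element, get() returns at once; otherwise the caller waits, which is the trade-off between convenience and multithread support mentioned in the message.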
From howardk at fatdog.com Fri Dec 3 06:25:14 1999
From: howardk at fatdog.com (Howard Katz)
Date: Mon Jun 7 17:18:17 2004
Subject: ANNOUNCE: XML Query Engine
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <384761B5.5CBF5851@fatdog.com>

[from the website]

XML Query Engine is a full-text search-and-retrieval engine for XML documents. A JavaBean component, XML Query Engine can index single or multiple well-formed XML documents using any SAX-based parser.

The query engine builds an in-memory representation of the content and structure of the indexed documents. Users can then pose queries against the indexed data using XQL, a de facto standard for searching XML that is [very nearly] a proper subset of XPath, an official W3C recommendation.

The version of XQL used by XML Query Engine has been extended slightly to provide a facility for making full-text queries against the data set. This capability is similar to that found in most current web-based search engines.

[end]

The software is currently in alpha and I'm interested in getting feedback. I'll be demoing in the "New Technology Nursery" area on the exhibit floor at XML'99 next week. I don't know my schedule for the show yet. If you want to reach me in Philadelphia, call me at the cell number below or leave a message at the Marriott.

I'll be shipping copies of the software once I'm back in Vancouver the week of December 13th. I'd like to hear initially what people want to do with it.
If you want a copy, send me an email and tell me in one or two lines whether your intentions are honourable and what they are. :-) I'll be happy to send you a zipped copy in return. More information is available at www.fatdog.com. If you're emailing me, please copy me at howardckatz@yahoo.com since I'm experiencing some email difficulties due to a domain-name move. Regards, Howard Katz, Fatdog Software email: howardk@fatdog.com web: www.fatdog.com cell: (604) 725-3434 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mdash at techbooks.com Fri Dec 3 06:30:11 1999 From: mdash at techbooks.com (Manoranjan Dash) Date: Mon Jun 7 17:18:17 2004 Subject: unsubscribe In-Reply-To: <011101bf3d53$60df73a0$e5d88dce@WORKGROUP> References: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii> Message-ID: <3.0.6.32.19991203115728.008a0100@pinnacle.techbooks.com> unsubscribe xml-dev: A list for W3C XML Developers. 
From donpark at docuverse.com Fri Dec 3 06:30:32 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:17 2004
Subject: INTERFACE {was SGML, XML and SML, ugh!}
In-Reply-To: <A51F7543E295D2118D6600A024CDB2F71B9D71@MAILPROD>
Message-ID: <001d01bf3d57$f8bdca60$099918d1@docuverse1>

The other day, I had this idea about solving the display problem for mobile computing. While I do not think it is implementable right now, I do not think it is impossible.

The problem: large displays for mobile devices.

The solution: public display walls that show different views to different people simultaneously. Multithreaded display of sorts. <g>

Technology wise, I think the pixels will have to protrude like a small pyramid to show multiple views and multiplexed to coincide with the viewer's eyeglass which includes LCD shutters.

Interesting stuff to muse about. Not much privacy barrier but I think it might make sense in certain applications.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From liamquin at interlog.com Fri Dec 3 07:27:41 1999
From: liamquin at interlog.com (Liam R. E.
Quin) Date: Mon Jun 7 17:18:17 2004 Subject: ANNOUNCE: XML Query Engine In-Reply-To: <384761B5.5CBF5851@fatdog.com> Message-ID: <Pine.BSI.3.96r.991203021510.2564A-100000@shell1.interlog.com> On Thu, 2 Dec 1999, Howard Katz wrote: > XML Query Engine is a full-text search-and-retrieval engine for > XML documents. A JavaBean > component, XML Query Engine can index single or multiple well-formed > XML documents using any > SAX-based parser. I think this is interesting, but I wonder about the performance. There are two main reasons for using text retrieval, as I see it. (1) for searching a large body of text significantly more quickly than with grep (2) for kinds of search not otherwise possible, such as searches that span words, or that include stemming, synonyms or other morphological and linguistic analysis, or that include document structure or other "fielded" searches. > The query engine builds an in-memory representation of the content and > structure of the indexed documents. This sounds like an interesting proof of concept... but if I am searching, say, five gigabytes of text, what will happen? Indexing speed is also an issue. So I assume the main purpose of this tool is to experiment with XPath/XQL, is that fair?? Lee and yes, I'd like a copy! thanks :-) -- Liam Quin, Barefoot Computing, Toronto; The barefoot agitator l i a m at h o l o w e b dot n e t Ankh on irc.sorcery.net, http://www.valinor.sorcery.net/~liam/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Fri Dec 3 07:34:28 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:18:17 2004 Subject: Request for Discussion: SAX 1.0 in C++ References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain> <008301bf3d41$de0c9b80$a82a08d1@tomshp> Message-ID: <38475BB6.662858CF@jclark.com> "Thomas B. Passin" wrote: > > David Megginson wrote > > Tim Bray writes: > > > > > At 04:27 PM 12/2/99 -0500, David Megginson wrote: > > > >I'll be posting three follow-up messages on SAX/C++ to stimulate > > > >discussion: > > > > > > Good idea, one question. Any way to do C at the same time? -Tim > > > > Sure -- is there a strong need for a common C interface, though? We > > already have Expat's C interface, and I don't know of anyone else in > > that space yet. > > > But C is available on most _any_ platform - often for free. So is C++ these days. > So almost > anyone could compile in C but not necessarily in C++. The bigger problem is that the SAX style of interface goes over quite naturally into C++, but would be rather awkward in C. James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sb at metis.no Fri Dec 3 09:09:41 1999 From: sb at metis.no (Steinar Bang) Date: Mon Jun 7 17:18:17 2004 Subject: Some questions In-Reply-To: David Megginson's message of "30 Oct 1999 06:45:32 -0500" References: <000301bf3c4c$85bde920$0f36a8c0@quokka.com> <m33dutoyw3.fsf@localhost.localdomain> Message-ID: <whogc8tmpg.fsf@viffer.oslo.metis.no> >>>>> David Megginson <david@megginson.com>: > "Jeffrey E. Sussna" <jes@kuantech.com> writes: >> I wouldn't consider RDF at the same level as CORBA, but perhaps part >> of an overall solution. > Though I'm the one that brought CORBA into the discussion, I think > that a better comparison would probably be XMI, since CORBA is a > protocol rather than a format. <pedantic mode> CORBA is a standard (or set of standards), for creating and using distributed objects, and consisting of formats and protocols. IIOP/GIOP would be a protocol, and so would the IDL interfaces of the CORBA Services, if you stretch the concept. IDL is definitely a format, and I don't know how to categorize the different IDL language bindings as either. </pedantic mode> xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthew at praxis.cz Fri Dec 3 09:32:56 1999 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:18:17 2004 Subject: Object-oriented serialization (Was Re: Some questions) References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> Message-ID: <38478DB2.FACA4633@praxis.cz> David Megginson wrote: > How does the schema tell me that foo represents a container for a > collection of objects, bar represents an object, and hack and flurb > represent the object's properties? The point is not what the current schema draft allows, it is whether it would be feasible and appropriate to represent this information in XML schemas, as Paul rightly stated. My opinion is that it would be fairly trivial and extremely useful. > It can be. The DOM represents a domain-specific object layer that is > useful for a wide subset of XML operations (especially document- and > browser-oriented work). There need to be many layers on top of XML, > one for each domain -- it happens that many of those layers will share > the need to encode objects, so a standard object layer sandwiched > between XML and the domain-specific layers can save a lot of work. Sure, the DOM has value. My point is that maybe 95% of applications want a domain-specific rather than a generic interface. My other point is that a domain-specific interface can be implemented generically; i.e. 
programmatic interfaces for accessing XML data can be generated automatically from XML schemas. This isn't *that* far from what MDSAX is doing. IBM's XML BeanMaker (http://alphaworks.ibm.com/tech/xmlbeanmaker) is a good example of this concept. > > There are a variety of efforts to create > > domain-specific objects automatically from XML objects. I don't have a > > list at the tips of my fingers, but if anyone does it would be a great > > resource. They are out there because I keep bumping into them. > > One example is RDF. So we are talking about different things. RDF is a formalism but it doesn't provide you with any code (although I'm sure that tools for this could be written, and perhaps already have been). I am talking about something that will take my schema with Customer and Invoice element types and turn it into, say, Java classes called Customer and Invoice. > I disagree strongly with the last part of that statement. I'd argue > the opposite -- higher-level layers should be as independent of XML as > possible. That's the only way to build good, layered architectures. > XML does one thing (represent a tree structure in a character stream) > very well: it's an excellent layer to build other layers on top of, > but XML itself should stay as simple as possible so that it's > applicable widely to many different fields. I agree with the layering approach. But well-formed XML should be viewed as the lowest level (representing tree structures); when bound to an XML schema it then becomes a serialized object representation. > That would be another serious mistake. Object exchange, while > important, represents only one of many layers that can be build on top > of XML, and if XML Schemas start trying to solve high-level problems > for every specific domain, it will become an unimplementable mess. > RDF already made a similar mistake by mixing together a spec for > object encoding in XML with a spec for representing knowledge about > Web pages. 
Maybe this is the crux of our disagreement. I see object exchange as *the* application for valid XML. I'd be interested to hear some examples of applications that cannot be cast effectively in this light. In this view, RDF and XML Schemas are coming at the same problem from different angles. RDF is saying essentially "how do we build an XML application that represents object structures", while XML Schemas are saying "how do we enhance DTDs by adding some object-oriented facilities". My fear is that these two approaches are going to meet somewhere in the middle and turn out to be the same thing. If so, I vastly prefer the use of XML schemas. Why? Because this results in a vast simplification of the whole XML picture. Isn't it better to take a normal XML instance, using base XML syntax, and "turn" it into an object by adding the appropriate information in a separate schema, rather than having to recast the whole thing in a different syntax? (I wonder if I am expressing this idea clearly. I'll happily post an example of how this could be done if I'm not.)

Cheers,
Matthew

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthew at praxis.cz Fri Dec 3 09:38:10 1999 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:18:17 2004 Subject: Schemas and strongly typed links (Was Re: Object-oriented serialization) References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <38469260.E7D18FA3@jfinity.com> Message-ID: <38478EF9.ED93CA0A@praxis.cz> Gabe Beged-Dov wrote: <snip> > XLink doesn't even allow you to use a namespace qualified name for the "role" (this may have > been fixed but it will be done as a new attribute value type like qname). It certainly > doesn't touch being able to specify a type for the property value. The XML Schema group may > end up supporting strongly typed references but I wouldn't be surprised if this fell off the > plate. This is really exactly what I am trying to get across. A tremendous amount of effort is being invested in RDF, on various levels (specing, implementation, evangelism, etc.). This shouldn't cause XML schemas to be poorer! I may be standing alone here (am I?), but to me it would be a minor tragedy if XML schemas did not support strongly typed links at the schema level. Matthew xml-dev: A list for W3C XML Developers. 
From sb at metis.no Fri Dec 3 10:55:37 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:17 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: David Megginson's message of "Thu, 2 Dec 1999 16:32:36 -0500 (EST)"
References: <14406.58740.871829.541816@localhost.localdomain>
Message-ID: <whbt88tht4.fsf@viffer.oslo.metis.no>

>>>>> David Megginson <david@megginson.com>:

> 1. Use references when there can never be a null value, pointers
> otherwise.

Sounds reasonable.

> 2. Pointers never change ownership -- if a Parser (for example) wants
> to own an InputSource, it needs to make its own copy. The app has
> to free everything that it allocates, and the SAX driver, likewise.

A good basic practice.

> 3. Callbacks cannot be const, since they often change the state of the
> client app.

Agree.

> 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
> with most existing C++ code.

Disagree. This just defers the task of decoding from UTF-8 to UTF-16, which every forward-looking XML application eventually will have to do. For Asian languages this will also incur extra overhead, since I'm led to believe they will mostly store documents as UTF-16, so that we will have a UTF-16 to UTF-8 to UTF-16 transformation through the SAX interface.

(I currently have a SAX (or "SAXoid") C++ wrapper around expat, where I currently use plain std::string& to transfer text. But this is just a transitional stage until I manage to get full wide char support in the underlying system.
(What I send through SAX isn't UTF-8, but ISO8859-1 with all unknown characters changed into ".", since this is all the underlying system understands))

> 5. Use char * rather than string, to avoid forcing a lot of allocation
> overhead on the SAX driver.

Hm... when I wrote my expat wrapper, I didn't even stop to think about this, since strings are so easy to use, and it would become a string in the first map<> lookup anyways.

But I guess late evaluation is always a good thing (I'm using this heavily on the AttributeList, where no C++ objects will be created until someone asks for the first attribute).

But I would rather see "const wchar_t*" (which I believe at least the Xerces-C uses) than "const char*".

From sb at metis.no Fri Dec 3 10:58:37 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:17 2004
Subject: SAX/C++: Changes for C++
In-Reply-To: David Megginson's message of "Thu, 2 Dec 1999 16:38:11 -0500 (EST)"
References: <14406.59075.218048.437305@localhost.localdomain>
Message-ID: <wh7liwthnr.fsf@viffer.oslo.metis.no>

>>>>> David Megginson <david@megginson.com>:

> Here are some of the differences between the SAX/Java interfaces and the
> SAX/C++ interfaces:

> - InputSource doesn't have an equivalent of Java Reader (no getReader
> method)

I would like to be able to create a "push" stream, ie. something similar to a libwww stream, where data that arrives asynchronously will just be "pushed" to the parser as they arrive.

expat already supports this, and I use it.
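A "push" front end of this kind could be shaped roughly as below, modelled loosely on the way expat's XML_Parse takes a buffer, a length and a final-chunk flag. PushSink and TagCounter are hypothetical names, not part of any SAX draft, and TagCounter is a deliberately naive stand-in for a parser: it only counts start tags and ignores comments and CDATA sections. The point it illustrates is that state must survive between calls, because a chunk may end in the middle of a tag.

```cpp
#include <cstddef>

// Hypothetical push-style interface: the caller feeds data as it arrives.
class PushSink {
public:
    // Returns false if the chunk was rejected. `isFinal` marks the last chunk.
    virtual bool feed(const char* data, std::size_t length, bool isFinal) = 0;
protected:
    ~PushSink() {}   // not deleted through the interface
};

// Toy implementation that counts start tags across chunk boundaries.
class TagCounter : public PushSink {
public:
    TagCounter() : sawLt_(false), finished_(false), count_(0) {}

    bool feed(const char* data, std::size_t length, bool isFinal) override {
        if (finished_) return false;           // nothing accepted after the end
        for (std::size_t i = 0; i < length; ++i) {
            const char c = data[i];
            // '<' followed by anything but '/', '?' or '!' opens an element.
            if (sawLt_ && c != '/' && c != '?' && c != '!') ++count_;
            sawLt_ = (c == '<');
        }
        if (isFinal) finished_ = true;
        return true;
    }

    int startTags() const { return count_; }

private:
    bool sawLt_;     // last character seen was '<', possibly in an earlier chunk
    bool finished_;
    int count_;
};
```

A network layer would call feed() from its data-arrival callback with whatever bytes it has, and pass isFinal as true on the last call.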
From sb at metis.no Fri Dec 3 11:38:22 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <whso1ks19h.fsf@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> One interesting issue is whether to provide a virtual destructor. I
> think the safest solution is not to provide a virtual destructor but
> instead to declare but not define a private operator delete. This
> makes it a compile time error to do:

> DTDHandler *p;
> // ...
> delete p;

Hm... not defining a virtual destructor for a class with virtual functions gives me warnings in "gcc -Wall". Will a private operator delete do anything about these warnings, I wonder...?

From rja at arpsolutions.demon.co.uk Fri Dec 3 11:53:56 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: UTF-8 v UTF-16
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com>
Message-ID: <008b01bf3d84$b037d650$c5010180@p197>

> 2. A hi-tech solution.
> Do what the Standard C++ library does and make
> the interface a template in the character type.  This is the cleanest
> solution, but lots of C++ projects eschew templates on portability
> grounds.

The Vivid C/C++ toolkit uses templates internally and so far has been
compiled under Windows, Solaris and HPUX and a few others, so I think
the problem with templates is not so much of an issue these days
(although it took us time to find the LCD for template support), but
I'd still probably avoid them in the SAX C/C++ definitions just in case.

> If you feel that one needs to be mandated, I would pick UTF-16.

I second that.  The Vivid Creation SAX interfaces
( http://www.vivid-creations.com/free/sax.h ) have been UTF-16 from day 1
( around 16 months ago ) and to date they've had nothing but positive
feedback.  I'd therefore make everything wchar_t and not char.

I'd forget C as most platforms do have C/C++ and STL these days.

Regards,

Richard.

From Toby.Speight at streapadair.freeserve.co.uk Fri Dec 3 12:09:09 1999
From: Toby.Speight at streapadair.freeserve.co.uk (Toby Speight)
Date: Mon Jun 7 17:18:18 2004
Subject: A processing instruction for robots
In-Reply-To: Walter Underwood's message of "Thu, 02 Dec 1999 13:58:58 -0800"
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
Message-ID: <u1z941b1f.fsf@lanber.cam.citrix.com>

Walter> Walter Underwood <URL:mailto:wunder@infoseek.com>

0> In article <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>,
0> Walter wrote:

Walter> HTML has a robots meta tag.  XML has no standard way to
Walter> declare the same information.
Walter> Here is a proposal, with an
Walter> implementation:
Walter>
Walter> <URL:http://homepages.go.com/%7Ewunder0/robots-pi.html>
Walter>
Walter> Comments are welcome.

It may be an idea to provide a NOTATION identifier for the processing
instruction, rather than binding it to the specific word "robots".  It
depends on the trade-off you want to make between implementor
convenience and author generality.

If you've thought about it and decided against it, it's probably worth
a comment in your proposal explaining your rationale.

--

From jjc at jclark.com Fri Dec 3 12:36:08 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: UTF-8 v UTF-16
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> <008b01bf3d84$b037d650$c5010180@p197>
Message-ID: <3847B8F3.D81B8286@jclark.com>

Richard Anderson wrote:

> > If you feel that one needs to be mandated, I would pick UTF-16.
>
> I second that. The Vivid Creation SAX interfaces
> ( http://www.vivid-creations.com/free/sax.h ) have been UTF-16 from day 1
> ( around 16 months ago ) and to date they've had nothing but positive
> feedback. I'd therefore make everything wchar_t and not char.

Unfortunately wchar_t isn't guaranteed to be UTF-16.  Some platforms
make it 32-bits.  However, I agree it's a good idea for SAXChar to be
typedefed to wchar_t on platforms where wchar_t is UTF-16.

James

From jjc at jclark.com Fri Dec 3 12:37:13 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: First interface draft
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whso1ks19h.fsf@viffer.oslo.metis.no>
Message-ID: <3847B801.588A8797@jclark.com>

Steinar Bang wrote:
>
> >>>>> James Clark <jjc@jclark.com>:
>
> > One interesting issue is whether to provide a virtual destructor.  I
> > think the safest solution is not to provide a virtual destructor but
> > instead to declare but not define a private operator delete.  This
> > makes it a compile time error to do:
>
> >   DTDHandler *p;
> >   // ...
> >   delete p;
>
> Hm... not defining a virtual destructor for a class with virtual
> functions gives me warnings in "gcc -Wall".  Will a private operator
> delete do anything about these warnings, I wonder...?

If not, gcc should be fixed, because there's no legitimate reason to
give a warning.  I got this technique from

  http://www.develop.com/dbox/cxx/SmartPtr.htm#Obvious

I've also verified that this complies with the C++ standard (the
relevant clause is 12.5p4).  gcc 2.95.1 correctly gives a compile-error
if you try to delete such a class.  Visual C++ 6 doesn't catch this at
compile time, but you'll still get a link-time error.

The other possible technique is a protected virtual destructor with an
empty implementation.

James
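
A minimal sketch of the second technique mentioned in the message above
(a protected virtual destructor with an empty implementation), assuming
a C++11 compiler for static_assert and <type_traits>.  The method
notationDecl, the class RecordingHandler and the helpers fireNotation
and demoProtectedDestructor are hypothetical names used only for
illustration; they are not part of any SAX/C++ draft.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical handler interface, modelled loosely on the DTDHandler
// discussed in the thread.
class DTDHandler {
public:
    virtual void notationDecl(const char* name) = 0;
protected:
    // Protected, empty virtual destructor: "delete p" through a
    // DTDHandler* is rejected at compile time, while derived classes
    // can still be destroyed normally by whoever created them.
    virtual ~DTDHandler() {}
};

// Concrete handler owned by the application (illustrative only).
class RecordingHandler : public DTDHandler {
public:
    virtual void notationDecl(const char* name) { last_ = name; }
    const std::string& last() const { return last_; }
private:
    std::string last_;
};

// Stands in for a parser: it borrows the handler and never owns it.
inline void fireNotation(DTDHandler& handler) {
    handler.notationDecl("gif");
    // DTDHandler* p = &handler;
    // delete p;              // would not compile: destructor is protected
}

// Compile-time checks of the property the technique is meant to give.
static_assert(!std::is_destructible<DTDHandler>::value,
              "interface must not be destructible from outside");
static_assert(std::is_destructible<RecordingHandler>::value,
              "concrete class must be destructible by its creator");

inline std::string demoProtectedDestructor() {
    RecordingHandler handler;   // the creator owns and destroys it
    fireNotation(handler);
    return handler.last();
}
```

Calling demoProtectedDestructor() drives the handler through the base
interface and lets the creator destroy the concrete object.  The first
technique (a declared but undefined private operator delete) is not
shown here.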

From sb at metis.no Fri Dec 3 13:07:48 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: UTF-8 v UTF-16
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 19:34:59 +0700"
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> <008b01bf3d84$b037d650$c5010180@p197> <3847B8F3.D81B8286@jclark.com>
Message-ID: <whyabcqijv.fsf@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> Richard Anderson wrote:
>> > If you feel that one needs to be mandated, I would pick UTF-16.
>>
>> I second that. The Vivid Creation SAX interfaces
>> ( http://www.vivid-creations.com/free/sax.h ) have been UTF-16 from day 1
>> ( around 16 months ago ) and to date they've had nothing but positive
>> feedback. I'd therefore make everything wchar_t and not char.

> Unfortunately wchar_t isn't guaranteed to be UTF-16.  Some platforms
> make it 32-bits.

Yep!  So I've heard.  Do you have a list of the ones that do this?

> However, I agree it's a good idea for SAXChar to be typedefed to
> wchar_t on platforms where wchar_t is UTF-16.

Hm... should we also do a
	typedef basic_string<SAXChar> SAXstring;
(needs a better name, I lowercased the "s" in "string" to distinguish
it from SAXString)?

(of course, then we would probably need SAXChar char_traits<> of some
sort as well...)

From sb at metis.no Fri Dec 3 13:15:00 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <whu2m0qi7l.fsf@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> - Would it be better to typedef SAXString to the Standard C++ string
> class (ie std::basic_string<SAXChar>)?

An argument for using
	typedef const SAXChar* SAXString;
is that you get late construction of the basic_string<>, ie. you don't
create it until you have to (eg. when using it to do a lookup in an
STL map<>).

But even if it's not used directly on the SAX interface, such a typedef
would be useful when handling the data (eg. for creating the above
mentioned map<>).
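
The late-construction argument above could look roughly like the
following sketch.  It assumes an 8-bit SAXChar; SAXStdString,
ElementTally and demoTally are hypothetical names chosen for this
example, and startElement is simplified to take only the element name.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef char SAXChar;                             // assumption: 8-bit build
typedef const SAXChar* SAXString;                 // what crosses the interface
typedef std::basic_string<SAXChar> SAXStdString;  // hypothetical typedef name

// Illustrative handler: counts how often each element name is reported.
class ElementTally {
public:
    // The driver passes a plain pointer; no string object exists yet.
    void startElement(SAXString name) {
        // Late construction: the basic_string is built only here,
        // when it is actually needed as a map key.
        ++counts_[SAXStdString(name)];
    }

    // Number of times the given element name has been seen so far.
    int count(SAXString name) const {
        std::map<SAXStdString, int>::const_iterator it =
            counts_.find(SAXStdString(name));
        return it == counts_.end() ? 0 : it->second;
    }

private:
    std::map<SAXStdString, int> counts_;
};

inline int demoTally() {
    ElementTally tally;
    tally.startElement("para");
    tally.startElement("list");
    tally.startElement("para");
    return tally.count("para");
}
```

A handler that never looks at the name pays nothing for string
construction, which is the point made in the message above.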

From Daniel.Veillard at w3.org Fri Dec 3 13:15:28 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: <14406.58740.871829.541816@localhost.localdomain>
References: <14406.58740.871829.541816@localhost.localdomain>
Message-ID: <19991203081521.O2478@w3.org>

On Thu, Dec 02, 1999 at 04:32:36PM -0500, David Megginson wrote:
> Here are the principles that I applied to creating my first draft
> SAX/C++ interface:

I'm afraid I won't be able to provide this interface in libxml
(the Gnome XML library http://xmlsoft.org/) due to the focus on C++,
though a C++ wrapper on top should be able to provide it.

> 2. Pointers never change ownership -- if a Parser (for example) wants
>    to own an InputSource, it needs to make its own copy.  The app has
>    to free everything that it allocates, and the SAX driver, likewise.

Very good idea,

> 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
>    with most existing C++ code.

Like James pointed out it's hard to segregate a class of users.
UTF-8 compactness will be appreciated by people wanting low memory
overhead when building transaction processing.  UTF-16 will simplify
interfacing to DOM or using XML in UI oriented apps.

> 5. Use char * rather than string, to avoid forcing a lot of allocation
>    overhead on the SAX driver.

I did opt for the simple approach having an xmlChar type used
everywhere except non XML content (filenames, error messages ...).
Having it 8 or 16 bits should be a compile-time (or run-time but that's
more risky) option.
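
One way a compile-time choice of character width might be expressed is
sketched below.  This is only an illustration under stated assumptions:
the build flag SAX_WIDE_CHARS, the macro SAX_TEXT and the helpers
saxLength and demoLength are all hypothetical, char16_t requires C++11,
and nothing here describes how libxml itself defines xmlChar.

```cpp
#include <cassert>
#include <cstddef>

// SAX_WIDE_CHARS is a hypothetical build flag (e.g. -DSAX_WIDE_CHARS).
#if defined(SAX_WIDE_CHARS)
typedef char16_t SAXChar;      // 16-bit code units (C++11)
#define SAX_TEXT(s) u##s       // turns "abc" into u"abc"
#else
typedef char SAXChar;          // 8-bit code units
#define SAX_TEXT(s) s
#endif

typedef const SAXChar* SAXString;

// Length in code units of a zero-terminated SAXString, for either width.
inline std::size_t saxLength(SAXString s) {
    std::size_t n = 0;
    while (s[n] != 0) {
        ++n;
    }
    return n;
}

inline std::size_t demoLength() {
    // The same source line compiles in both configurations.
    return saxLength(SAX_TEXT("element"));
}
```

Code written against SAXChar and SAX_TEXT stays the same whichever
width is selected when the library is built.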
Daniel

--
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe

From paul at prescod.net Fri Dec 3 13:15:13 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:18:18 2004
Subject: Content or Metadata?
References: <NCBBJANJAENGCPMNOIOCKEFHFLAA.spreitze@parc.xerox.com>
Message-ID: <3847C259.7E4B4653@prescod.net>

Mike Spreitzer wrote:
>
> What about the list of authors of a scholarly paper?  Isn't that
> metadata for which order matters?

Think of it from a programming language perspective:

class doc:
    title: string
    published: date
    authors: list of string
    text: list of (para|list|img)

The authors property is unordered with respect to the other properties
but its domain is ordered.  The *list of authors* is metadata for the
doc object.

In grove land we allow a single, particular property to be labeled as
the content property.  In this case it would be the "text:" property.
In a language like Python, you would navigate "regular (metadata)"
properties like this:

doc.publisher.address.street

and content properties like this:

doc[5][3][2][4]

In the former, the name is significant.  In the latter the position is
significant.  All of this is explained at:

http://www.prescod.net/groves/shorttut

--
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"I always wanted to be somebody, but I should have been more specific."
	--Lily Tomlin

From tpassin at idsonline.com Fri Dec 3 13:20:58 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun 7 17:18:18 2004
Subject: Request for Discussion: SAX 1.0 in C++
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain> <008301bf3d41$de0c9b80$a82a08d1@tomshp> <38475BB6.662858CF@jclark.com>
Message-ID: <002f01bf3d91$d254fa80$0ffbb1cd@tomshp>

From: James Clark <jjc@jclark.com>
>
> The bigger problem is that the SAX style of interface goes over quite
> naturally into C++, but would be rather awkward in C.
>

Well, that's for sure.  Maybe it's not worth the effort.  I was mainly
thinking about portability to smaller or rarer platforms (Amiga, Palm,
etc).

Tom Passin

From sb at metis.no Fri Dec 3 13:20:44 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:18 2004
Subject: parser asynch input (Was: SAX/C++: First interface draft)
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <whpuwoqhy4.fsf_-_@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> class Parser
> {
> public:
>   virtual void setLocale (const char *) = 0;
>   virtual void setEntityResolver (EntityResolver &resolver) = 0;
>   virtual void setDTDHandler (DTDHandler &handler) = 0;
>   virtual void setDocumentHandler (DocumentHandler &handler) = 0;
>   virtual void setErrorHandler (ErrorHandler &handler) = 0;
>   virtual void parse (SAXString systemId) = 0;
>   virtual void parse (const InputSource &input) = 0;
> private:
>   void operator delete (void *);
> };

I would like to add operations that can be used to "push" data to the
parser asynchronously:

class Parser
{
public:
  virtual void setLocale (const char *) = 0;
  virtual void setEntityResolver (EntityResolver &resolver) = 0;
  virtual void setDTDHandler (DTDHandler &handler) = 0;
  virtual void setDocumentHandler (DocumentHandler &handler) = 0;
  virtual void setErrorHandler (ErrorHandler &handler) = 0;
  virtual void parse (SAXString systemId) = 0;
  virtual void parse (const InputSource &input) = 0;
  virtual void asynchId(SAXString systemId) = 0; //for error reporting
  virtual void asynchPutBlock(const char* buf, int len) = 0;
private:
  void operator delete (void *);
};
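
To illustrate what a "push" entry point such as the proposed
asynchPutBlock implies for an implementation, here is a deliberately
tiny sketch.  It is not an XML parser: the hypothetical class
PushTagCounter only counts start tags, and it ignores comments, CDATA
sections and everything else a real parser must handle.  What it shows
is that state has to survive between calls, because a block boundary
can fall anywhere, including between "<" and the element name.

```cpp
#include <cassert>
#include <cstring>

// Toy consumer of pushed data; PushTagCounter is a hypothetical name.
class PushTagCounter {
public:
    PushTagCounter() : afterLessThan_(false), startTags_(0) {}

    // Same shape as the asynchPutBlock proposed above: the caller hands
    // over whatever bytes have arrived, in blocks of arbitrary size.
    void asynchPutBlock(const char* buf, int len) {
        for (int i = 0; i < len; ++i) {
            const char c = buf[i];
            if (afterLessThan_) {
                // First character after '<' tells us the markup kind:
                // '/' is an end tag, '?' a PI, '!' a declaration/comment.
                if (c != '/' && c != '?' && c != '!') {
                    ++startTags_;
                }
                afterLessThan_ = false;
            } else if (c == '<') {
                // Remembered across calls in case the block ends here.
                afterLessThan_ = true;
            }
        }
    }

    int startTags() const { return startTags_; }

private:
    bool afterLessThan_;  // true if the previous character was '<'
    int startTags_;       // start tags (and empty-element tags) seen
};

inline int demoPush() {
    PushTagCounter counter;
    const char* first = "<doc><";          // block ends right after '<'
    const char* second = "item/></doc>";   // continues the same tag
    counter.asynchPutBlock(first, static_cast<int>(std::strlen(first)));
    counter.asynchPutBlock(second, static_cast<int>(std::strlen(second)));
    return counter.startTags();
}
```

Feeding the same text in one block or split at any position gives the
same count, which is the property a push interface has to guarantee.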

From Wdehora at cromwellmedia.co.uk Fri Dec 3 13:23:55 1999
From: Wdehora at cromwellmedia.co.uk (Bill dehOra)
Date: Mon Jun 7 17:18:18 2004
Subject: INTERFACE {was SGML, XML and SML, ugh!}
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A5CA3CD@odin.cromwellmedia.co.uk>

: The other day, I had this idea about solving the
: display problem for mobile computing.  While I
: do not think it is implementable right now, I do
: not think it is impossible.
:
: The problem: large displays for mobile device.
: The solution: public display walls that show
: different views to different people
: simultaneously.  Multithreaded display of sort. <g>
:
: Technology wise, I think the pixels will have to
: protrude like a small pyramid to show multiple
: views and multiplexed to coincide with the viewer's
: eyeglass which includes LCD shutters.  Interesting
: stuff to muse about.  Not much privacy barrier but
: I think it might make sense in certain applications.
:

That's a cool idea, but unnecessary, given that we will have HUDs
*within* spectacles in a decade or so, making the wall displays
redundant.  Combine spectacles with headphones and voice recognition,
maybe with a small keypad, and you have a complete mobile computing
environment.  You can get context driven information about the
environment (a la GSM), giving you an augmented reality, which IMHO
will dwarf what is happening with the internet now.  All we really need
are some advances in battery technology, and reality is in for a
comeback.

Anyone for RML?

regards,
Bill de hOra

From sb at metis.no Fri Dec 3 13:52:34 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:18 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: "Thomas B. Passin"'s message of "Fri, 3 Dec 1999 08:25:02 -0500"
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain> <008301bf3d41$de0c9b80$a82a08d1@tomshp> <38475BB6.662858CF@jclark.com> <002f01bf3d91$d254fa80$0ffbb1cd@tomshp>
Message-ID: <wh903cqgha.fsf@viffer.oslo.metis.no>

>>>>> "Thomas B. Passin" <tpassin@idsonline.com>:

> From: James Clark <jjc@jclark.com>
>> The bigger problem is that the SAX style of interface goes over
>> quite naturally into C++, but would be rather awkward in C.

> Well, that's for sure.  Maybe it's not worth the effort.  I was
> mainly thinking about portability to smaller or rarer platforms
> (Amiga, Palm, etc).

The W3C libwww structured stream is pretty close to the
DocumentHandler, at least:
	http://www.w3.org/Library/src/HTStruct.html

From steven.livingstone at scotent.co.uk Fri Dec 3 13:52:24 1999
From: steven.livingstone at scotent.co.uk (Steven Livingstone, ITS, SENM)
Date: Mon Jun 7 17:18:18 2004
Subject: XML processing instruction survey
Message-ID: <8DCB90532FF7D211B34400805FD48853B56DFB@SENMAIL3>

Microsoft make an interesting use of it in their technology preview for
SQL Server XML.

The output of a transform is cached and the PI "servercache" is used to
determine how long the output should be cached for.

Interesting.

Cheers,
Steven

Steven Livingstone - http://www.deltabiz.com
07771 957 280 or +447771957280

Professional Site Server 3, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696

Professional Site Server 3.0 Commerce Edition, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505

> -----Original Message-----
> From:	Jeffrey E. Sussna [SMTP:jes@kuantech.com]
> Sent:	30 November 1999 23:08
> To:	'XML Dev'
> Subject:	XML processing instruction survey
>
> I'm interested in the extent to which people are actually using the XML
> processing instruction ( <?xml ) in their XML files, and the extent to
> which they find it useful.
>
> Jeff Sussna

From david at megginson.com Fri Dec 3 14:00:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:06:24 +0700"
References: <14406.58740.871829.541816@localhost.localdomain> <384741C0.50ABA536@jclark.com>
Message-ID: <m3zovsi0rr.fsf@localhost.localdomain>

James Clark <jjc@jclark.com> writes:

> That's problematic for EntityResolver::resolveEntity; that requires
> that ownership of an InputSource be transferred to the caller from
> the callee.
>
> This could be avoided by doing:
>
>   virtual const InputSource *
>   resolveEntity(const char *publicId,
>                 const char *systemId);
>
> instead of:
>
>   virtual void
>   resolveEntity(const char *publicId,
>                 const char *systemId,
>                 InputSource &inputSource);

(I'll assume that James accidentally reversed the two).
The second one is a very good idea -- the only modification I'd make is
to add a bool return value, so that the parser knows whether the
resolver actually wants to override:

  virtual bool
  resolveEntity(const char *publicId,
                const char *systemId,
                InputSource &inputSource);


All the best,


David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

From david at megginson.com Fri Dec 3 14:05:11 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: Daniel Veillard's message of "Fri, 3 Dec 1999 08:15:21 -0500"
References: <14406.58740.871829.541816@localhost.localdomain> <19991203081521.O2478@w3.org>
Message-ID: <m3wvqwi0j7.fsf@localhost.localdomain>

Daniel Veillard <Daniel.Veillard@w3.org> writes:

> On Thu, Dec 02, 1999 at 04:32:36PM -0500, David Megginson wrote:
> > Here are the principles that I applied to creating my first draft
> > SAX/C++ interface:
>
>   I'm afraid I won't be able to provide this interface in libxml
> (the Gnome XML library http://xmlsoft.org/) due to the focus on C++,
> though a C++ wrapper on top should be able to provide it.

Yes, that would be the best approach -- both libXML and Expat should
be easy to wrap in SAX/C++ adapters (XML4C++ will probably have its
own SAX/C++ support).  It would be interesting for apps to be able to
switch between libXML and Expat and compare performance, features,
etc.


All the best,


David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

From david at megginson.com Fri Dec 3 14:09:12 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:18 2004
Subject: RDF, again
In-Reply-To: Vane Lashua's message of "Thu, 2 Dec 1999 17:50:33 -0500"
References: <A51F7543E295D2118D6600A024CDB2F71B9D72@MAILPROD>
Message-ID: <m3u2m0i0cj.fsf@localhost.localdomain>

Vane Lashua <vlashua@RSGsystems.com> writes:

> The difficulty with the definitions below, for instance, is that
> "name" is a collection of characters whose context is not clear
> without a reference.  Namespaces, it seems to me, are absolutely
> necessary, but they tend to encourage diversity where convergence
> would be a more enlightened tendency.

Namespaces encourage innovation.  Innovation is the first stage in
development, and it needs to be followed by standardization where
demand warrants.

In ordinary language, Namespaces let people invent stuff, but it's our
responsibility to look at what's being invented and standardize the
things that are being done over and over again.  It's a good idea to
let the market have a say first; if you skip the innovation stage and
try to standardize in advance, your standards will often miss the
mark.


All the best,


David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

From david at megginson.com Fri Dec 3 14:19:21 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 19:30:57 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whso1ks19h.fsf@viffer.oslo.metis.no> <3847B801.588A8797@jclark.com>
Message-ID: <m3ogc8hzvm.fsf@localhost.localdomain>

James Clark <jjc@jclark.com> writes:

> The other possible technique is a protected virtual destructor with an
> empty implementation.

That might be a little clearer to less experienced C++ programmers
(like me).


All the best,


David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

From david at megginson.com Fri Dec 3 14:17:45 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:18 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <m3r9h4hzyb.fsf@localhost.localdomain>

James Clark <jjc@jclark.com> writes:

> In Java, everything in SAX is an interface.
> The way to do an interface
> in C++ is to use a class where all members (except possibly a virtual
> destructor) are abstract (ie defined as = 0).  This provides the maximum
> flexibility and insulation.  The only good reason not to do an interface
> is if it were necessary and possible to inline some method calls for
> performance.  I don't think this applies here: certainly there's no
> performance need to inline method calls to something like InputSource.

Actually, I don't see any strong argument not to provide empty inline
implementations for the handler callbacks:

  virtual void startDocument (void) {}
  virtual void endDocument (void) {}
  virtual void startElement (const char * name,
                             const AttributeList &atts) {}
  virtual void endElement (const char * name) {}

(etc.)  That way, subclasses can implement only the callbacks they
need, and there's no need for a HandlerBase class.

> One interesting issue is whether to provide a virtual destructor.  I
> think the safest solution is not to provide a virtual destructor but
> instead to declare but not define a private operator delete.  This makes
> it a compile time error to do:
>
>   DTDHandler *p;
>   // ...
>   delete p;
>
> Given the policy on object ownership there's never any need to do that:
> only the creator of an object can delete it and the creator always has a
> pointer to the concrete subclass which will provide a way to release the
> object.

I appreciate James sharing some of his C++ experience here.  This
sounds like a good idea to me, but I'm at best a C++ journeyman, so
I'd be happy to hear from other masters on the list.

> It also has the nice property that there is no .cpp file associated with
> the SAX interface and no SAX library that has to be compiled or linked
> with.  It would be a completely pure interface.

Yes, I'd like this as well.
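
The empty inline callbacks suggested earlier in this message can be
sketched as follows.  The method names come from the thread; the empty
AttributeList placeholder, the ElementCounter subclass and the
demoCallbacks helper are assumptions made only so that the example is
self-contained, and the protected destructor is one of the two options
discussed, not a settled choice.

```cpp
#include <cassert>

// Placeholder only: the real AttributeList would expose accessors.
class AttributeList {};

// Handler with empty inline defaults, so no separate HandlerBase
// class is needed.
class DocumentHandler {
public:
    virtual void startDocument() {}
    virtual void endDocument() {}
    virtual void startElement(const char* /*name*/,
                              const AttributeList& /*atts*/) {}
    virtual void endElement(const char* /*name*/) {}
protected:
    // Protected so the handler cannot be deleted through this interface.
    virtual ~DocumentHandler() {}
};

// A subclass overrides only the callback it cares about.
class ElementCounter : public DocumentHandler {
public:
    ElementCounter() : count_(0) {}
    virtual void startElement(const char*, const AttributeList&) {
        ++count_;
    }
    int count() const { return count_; }
private:
    int count_;
};

inline int demoCallbacks() {
    ElementCounter counter;
    DocumentHandler& handler = counter;   // what a parser would see
    AttributeList noAtts;

    // Events a parser might report for <doc><para/></doc>.
    handler.startDocument();
    handler.startElement("doc", noAtts);
    handler.startElement("para", noAtts);
    handler.endElement("para");
    handler.endElement("doc");
    handler.endDocument();
    return counter.count();
}
```

The header stays a pure declaration file with nothing to compile or
link separately, since every default body is inline.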
> Here's another draft, with this change and a few other minor changes;
>
> - use int not size_t (Lakos has a whole section on why unsigned in
> interfaces is usually a bad idea)

OK -- I thought that I was being well-behaved using size_t.  Oh, well.

> - use a SAXString typedef for zero-terminated arrays

Sounds good, if slightly obfuscatory.

> - don't use (void) for empty argument lists

What are the arguments for and against?

> - use iosfwd not istream as the header file

I have no idea why, but I'll take James's word on this.

> - use characters not SAXCharacters as the method name on DocumentHandler
> - use a const char * arg for Parser::setLocale; I think that's the best
> you can do portably; Standard C++ allows locales to be identified by
> name

Thanks.

> - add Locator

Yes, I forgot it.

> - change resolveEntity to avoid transfer of ownership as suggested in my
> previous message

Perfect, except that it might be nice to use bool as the return value,
as I suggested, so that the parser isn't forced to examine the
InputSource if the app hasn't made any changes.

> - solve the UTF-8/UTF-16 problem by having two namespaces: a SAX_UTF8
> and a SAX_UTF16 namespace (since you're using std::istream, you are
> assuming compiler support for namespaces); this will work nicely with
> namespace aliases (eg namespace SAX = SAX_UTF8).

Ouch!  We might be getting a little hairy here.  How is Namespace
support out there, by the way?  I know that EGCS is pretty good on
Linux, though the std:: Namespace still isn't properly supported.

> Discussion points:
>
> - Would it be better to typedef SAXString to the Standard C++ string
> class (ie std::basic_string<SAXChar>)?

Do we want to force that overhead on an app?  I need to understand
better if there will be a high cost to calling c_str over and over
again, if the app needs a regular zero-terminated character array.


All the best,


David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

From david at megginson.com Fri Dec 3 14:22:02 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:18 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: Matthew Gertner's message of "Fri, 03 Dec 1999 10:30:27 +0100"
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz>
Message-ID: <m3ln7chzrd.fsf@localhost.localdomain>

Matthew Gertner <matthew@praxis.cz> writes:

> David Megginson wrote:
> > How does the schema tell me that foo represents a container for a
> > collection of objects, bar represents an object, and hack and flurb
> > represent the object's properties?
>
> The point is not what the current schema draft allows, it is whether it
> would be feasible and appropriate to represent this information in XML
> schemas, as Paul rightly stated.  My opinion is that it would be fairly
> trivial and extremely useful.

Would you require schema processing, then, for object exchange (or in
other words, would there be no equivalent of the DTD-less XML
document)?


All the best,


David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Colas.Nahaboo at sophia.inria.fr Fri Dec 3 14:22:41 1999 From: Colas.Nahaboo at sophia.inria.fr (Colas Nahaboo) Date: Mon Jun 7 17:18:18 2004 Subject: Object-oriented serialization (Was Re: Some questions) In-Reply-To: Your message of "Thu, 02 Dec 1999 10:24:04 +0100." <38463AB4.36C5292B@praxis.cz> Message-ID: <199912031422.PAA22478@aye.inria.fr> Matthew Gertner writes: > The aspects of > object-oriented design that are missing are then inheritance and > polymorphism. In my opinion, things may be more simple. If SGML/XML had not been designed by people living in a typeless world (text documents), XML could have provided a much better medium to express object instances, with such a simple design as getting rid of element contents, and allowing attributes contents to be XML, e.g: <Point x=<Length unit="inches" value="12"/> y=<Length unit="cm" value="2"/> color=<RGB R=<Number base="16" value="FF"/> G="0" B="0"/> /> matching a C/C++/Java... declaration of Point as: Point { Length x; Length y; Color color; }; As you can see, this would be a very elegant and natural way to express object instances (aka serialization). One would of course need a schema language on top of that (to express what I wrote in a C-like declaration), but having to tweak the "low-level" serialisation to fit in the current XML1.0 recomendation is I think the original sin of XML, which pollutes a lot of the discussions I see here. For instance the current drive for removing attributes results from this. 
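
As a rough sketch of the C-like declaration above, one possible C++ rendering follows. The field types are assumptions read off the example instance (a Length carrying a unit string and a numeric value, an RGB carrying three integer channels), the Color field is taken to be the RGB element, and examplePoint is a hypothetical helper; none of this comes from a specification.

```cpp
#include <string>

// Hypothetical types mirroring the element names in the example instance.
// The member types are assumptions, not part of any XML or schema spec.
struct Length {
    std::string unit;   // e.g. "inches" or "cm"
    double value;
};

struct RGB {
    int r, g, b;        // channel values, 0..255
};

struct Point {
    Length x;
    Length y;
    RGB color;          // "Color" in the declaration, RGB in the instance
};

// Builds the object the example instance describes:
// x = 12 inches, y = 2 cm, color = (0xFF, 0, 0).
inline Point examplePoint() {
    return Point{Length{"inches", 12.0}, Length{"cm", 2.0}, RGB{0xFF, 0, 0}};
}
```

Each nested element in the instance maps onto one member of the aggregate, which is the correspondence the message is arguing for.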
But, just like RDF, we could standardise on this non-XML-1.0-compatible (let's call it GXML for Generalized XML :-) representation, and devise a canonical way to express it in XML. For instance:

<Point>
  <_x><Length unit="inches" value="12"/></_x>
  <_y><Length unit="cm" value="2"/></_y>
  <_color><RGB G="0" B="0">
    <_R><Number base="16" value="FF"/></_R>
  </RGB></_color>
</Point>

but we could devise others...

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas

From jmcdonou at library.berkeley.edu Fri Dec 3 14:23:18 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun 7 17:18:19 2004
Subject: INTERFACE {was SGML, XML and SML, ugh!}
In-Reply-To: <A51F7543E295D2118D6600A024CDB2F71B9D71@MAILPROD>
Message-ID: <199912031422.JAA01701@westnet.com>

At 02:43 PM 12/2/99 -0500, Vane Lashua wrote:
>Around the same era that the mouse emerged, there was on the market a
>single-handed(?) encoding device whose speed was about the same as qwerty. I
>think I saw it in Byte. Anybody seen one lately?
>

Yes, they're alive and well. The Twiddler (gotta love the name) is probably the best known, as they've been written up a few times as the keyboard of choice for the people at MIT developing wearable computing gear. See www.handykey.com for more info.

Other one-handed chording keyboards available include CyKey from Bellaire Electronics (www.bellaire.demon.co.uk), the MonoManus from ElmEntry Enterprises (www.hankes.com/eee/index.htm) and the Bat Personal Keyboard (www.infogrip.com). Prices for these are typically in the $150 - $200 U.S. range.
With practice, people using one-handed chording keyboards can usually get typing speeds approaching 60 wpm, which is about the low end of professional touch-typing speeds on QWERTY boards.

Jerome McDonough -- jmcdonou@library.Berkeley.EDU  | (......)
Library Systems Office, 386 Doe, U.C. Berkeley     | \ * * /
Berkeley, CA 94720-6000 (510) 642-5168             | \ <> /
"Well, it looks easy enough...."                   | \ -- /
SGNORMPF!!! -- From the Famous Last Words file     |  ||||

From matthew at praxis.cz Fri Dec 3 14:49:20 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:18:19 2004
Subject: Web Vision (Was Re: Object-oriented serialization)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <3.0.6.32.19991202133035.04315e60@mail.dt1.sdca.home.com>
Message-ID: <3847D7E2.A4AE2B48@praxis.cz>

Robert La Quey wrote:
> The basic problem remains a lack of a clearly articulated vision of what
> the web of the future could/should be.

Okay, Bob, let me take a crack at this.

As will no doubt be obvious to anyone who has read my last few posts, I am quite sceptical as to the real value of RDF. My hope is that XML schemas will come to be viewed as the right mechanism for specifying object-oriented structures, turning XML instances into object serializations with all this implies.
I said some nasty things about RDF in my "Pleas for Schemas" (www.praxisxml.com/praxis_xml.html) and was frankly quite disappointed that no RDF proponents stepped up to defend their case. The current discussion in this context is therefore very interesting and edifying for me.

Anyway, my personal vision for the "new" (i.e. XML-enhanced) Web is quite simple. Let's look at the pieces one by one:

* Namespaces -- are there to specify unique names. No more, no less. They should be orthogonal to schemas, so a single schema can use several namespaces and vice versa. The choice of URIs to uniquely scope names is clever and elegant, but there needs to be a specification of what is at the end of the URI. Apparently people get really confused otherwise. Something like Simon St. Laurent's XPDL would be a great choice, but oriented towards namespaces and not schemas and pointing to further resources that might be of interest (such as schemas that use the namespace and human-readable documentation).

* XML -- is for representing tree structures in text format.

* XML schemas -- are for turning XML instances into objects, adding information about the semantic relationships between element types. They are also repositories for business logic related to these element types. I strongly contest the view that business logic can only be represented in a messy combination of human-readable documentation and running code. Schemas can provide a huge amount of semantic information in various ways:

  - Semantics of element type relationships (e.g. one element type is an attribute of another, and not just contained in it)
  - Plausibility constraints (e.g. allowed data ranges or regexes for strings; this is already in the spec)
  - XPath constraints. I think Rick Jelliffe's Schematron is a brilliant idea that would bring heaps of benefits when embedded inside an XML schema. Many types of nontrivial semantics can be expressed using XPath and linked directly to a given element type.
  - Opt-outs, for example links to Java classes that do further processing that cannot be expressed descriptively. At this point we have "lost" in terms of representing full application semantics inside a schema, but at least we have a central location for binding logic in a descriptive way to the application's data structures.

* Stylesheets -- transform XML documents or render them in various human-readable formats.

In this view, schemas provide the application logic to let XML actually do something. Websites become aggregators of XML documents (many of which will be generated dynamically from database content); the schemas tell the processing application (which might be on the server or on the client) what to do with the document. For example, I could write a servlet that, based on an arbitrary XML schema, turns an XML document into an abstract form definition (also XML) that can be processed by a generic "form" stylesheet and turned into an HTML form, a WAP form or whatever. This means I can turn any schema into an input form instantly with no programming whatsoever (and this works just as well for reports, query forms and what-have-you).

So what we have are a bunch of objects in the form of XML documents zipping around the Web. Some are rendered for human viewing, some are consumed by other applications. Schemas are the glue that ties this all together, doing two things: 1) Providing object and application semantics to the XML instances and 2) Serving as the unit for distributing, reusing and extending these semantics.

I'd be interested to hear how the RDFers see things in relation to this vision. I know it's a little half-baked, but I don't want to write a 20-page mail and I'm sure no one wants to read it. I am working on something more detailed to be posted on our website; we have tons of ideas and plans in this direction.

Cheers,
Matthew

From tshw at capitalmarketscompany.com Fri Dec 3 14:50:30 1999
From: tshw at capitalmarketscompany.com (Shaw Tim)
Date: Mon Jun 7 17:18:19 2004
Subject: RDF, again
Message-ID: <FDFFD5C2748BD211BC500008C71E933CBDF938@uklonts01.uklo.capitalmarketscompany.com>

Given a piece of data, described by a name within a namespace, I would like to be able to determine the 'equivalent' data (and its name) within another namespace. I don't mind 'finding' the data, but I need to be able to determine its 'new' name within the new namespace.

I think that, the way things are going (and have always been), it's necessary to have such a mechanism. From my (limited) understanding of RDF, it's not able to give me the 'hook' to do this.

This, to my mind, is the equivalent of automated natural language translation, where XML is the alphabet, the Namespace is the Dictionary and DTDs or Schemata are the Grammar (and the data is the ...? - analogy breaks down here in my overloaded braincell!).

We (humans) have always had a problem communicating across language boundaries - can we not define a mechanism such that we do not propagate this problem to XML? People will continue to define their own grammars for specific purposes - what about asking them to 'translate' their Dictionary into Esperanto so that others can use their ... whatever words would map to (meaning?).

I'm all for encouraging diversity, but if people want to share something they need to be able to define it in such a way that it can be understood by others - without having to worry exactly _which_ others.
We are in danger of splitting it all up into dialects understandable only to small groups - and that, it seems to me, is exactly why we went for XML in the first place!

tim

> -----Original Message-----
> From: David Megginson [mailto:david@megginson.com]
> Sent: 03 December 1999 15:08
> To: 'xml-dev@ic.ac.uk'
> Subject: Re: RDF, again
>
> Vane Lashua <vlashua@RSGsystems.com> writes:
>
> > The difficulty with the definitions below, for instance, is that "name" is a
> > collection of characters whose context is not clear without a reference.
> > Namespaces, it seems to me, are absolutely necessary, but they tend to
> > encourage diversity where convergence would be a more enlightened tendency.
>
> Namespaces encourage innovation. Innovation is the first stage in
> development, and it needs to be followed by standardization where
> demand warrants.
>
> In ordinary language, Namespaces let people invent stuff, but it's our
> responsibility to look at what's being invented and standardize the
> things that are being done over and over again. It's a good idea to
> let the market have a say first; if you skip the innovation stage and
> try to standardize in advance, your standards will often miss the mark.
>
> All the best,
>
> David
>
> --
> David Megginson david@megginson.com
> http://www.megginson.com/

*********************************************************************
The information in this email is confidential and is intended solely for the addressee(s). Access to this email by anyone else is unauthorised.
If you are not an intended recipient, you must not read, use or disseminate the information contained in the email. Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of The Capital Markets Company. http://www.capitalmarketscompany.com
***********************************************************************

From david at megginson.com Fri Dec 3 14:54:17 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:19 2004
Subject: RDF, again
In-Reply-To: <FDFFD5C2748BD211BC500008C71E933CBDF938@uklonts01.uklo.capitalmarketscompany.com>
References: <FDFFD5C2748BD211BC500008C71E933CBDF938@uklonts01.uklo.capitalmarketscompany.com>
Message-ID: <14407.55626.211719.951880@localhost.localdomain>

Shaw Tim writes:

> Given a piece of data, described by a name within a namespace, I would like
> to be able to determine the 'equivalent' data (and its name) within another
> namespace. I don't mind 'finding' the data, but I need to be able to
> determine its 'new' name within the new namespace.
> I think that, the way things are going (and have always been), it's
> necessary to have such a mechanism. From my (limited) understanding of RDF,
> it's not able to give me the 'hook' to do this.

See

http://www.w3.org/TR/PR-rdf-schema

This has long been delayed from going to REC for political rather than technical reasons, but I think that things are looking a little sunnier now.
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From rja at arpsolutions.demon.co.uk Fri Dec 3 15:19:09 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun 7 17:18:19 2004
Subject: SAX/C++: First interface draft
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <m3r9h4hzyb.fsf@localhost.localdomain>
Message-ID: <008601bf3da1$5b6e5690$c5010180@p197>

> Actually, I don't see any strong argument not to provide empty inline
> implementations for the handler callbacks:
>
> virtual void startDocument (void) {}
> virtual void endDocument (void) {}
> virtual void startElement (const char * name, const AttributeList &atts) {}
> virtual void endElement (const char * name) {}

I think "= 0" is far more usual in C/C++. I personally prefer that approach because if I derive a class from the parser class and make a typo in a method implementation in the derived class, I get a compile error, rather than spending time figuring out that the empty base class implementation is being called. Having a handler base is fine by me.

If we do go for the "= 0" approach I suggest the class be prefixed with an "I" (for interface). Those who work with COM will understand the rationale behind that ;)

> I appreciate James sharing some of his C++ experience here. This
> sounds like a good idea to me, but I'm at best a C++ journeyman, so
> I'd be happy to hear from other masters on the list.
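
The compile-time argument above can be sketched as follows. This is only an illustration under stated assumptions: the class names (IDocumentHandler, DocumentHandlerBase, QuietHandler, StrictHandler) and the helper driveQuietHandler are hypothetical and not taken from the SAX/C++ draft, and the code needs a C++11 compiler for std::is_abstract and the in-class initializer.

```cpp
#include <type_traits>

// Hypothetical handler interface using pure virtuals ("= 0").
class IDocumentHandler {
public:
    virtual ~IDocumentHandler() {}
    virtual void startDocument() = 0;
    virtual void endDocument() = 0;
};

// Hypothetical alternative with empty inline default implementations.
class DocumentHandlerBase {
public:
    virtual ~DocumentHandlerBase() {}
    virtual void startDocument() {}
    virtual void endDocument() {}
};

// Typo: "startDocumnet" declares a new method instead of overriding
// startDocument(). With empty defaults this compiles without complaint.
class QuietHandler : public DocumentHandlerBase {
public:
    int calls = 0;
    void startDocumnet() { ++calls; }
};

// The same typo against the pure-virtual interface leaves the class
// abstract, so a declaration such as "StrictHandler h;" would not compile.
class StrictHandler : public IDocumentHandler {
public:
    void startDocumnet() {}
    void endDocument() {}
};

static_assert(std::is_abstract<StrictHandler>::value,
              "the misspelt override leaves StrictHandler abstract");

// Drives QuietHandler through its base class and reports how often the
// handler's own code ran; the empty base implementation is called instead.
inline int driveQuietHandler() {
    QuietHandler h;
    DocumentHandlerBase& base = h;
    base.startDocument();
    return h.calls;
}
```

In C++11 and later, marking the derived method with the override specifier makes the compiler reject the misspelling under either design.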
The other option is:

protected:
  virtual ~CSAXParser() {}

Of course, it is nice to be able to delete things, so the class should probably include a release method:

virtual void release() = 0;

That can then be implemented using reference counting if so desired or a straight delete.

> > - use iosfwd not istream as the header file

I've not used iosfwd, but why not just define a SAX input stream class:

class CInputStream
{
public:
  virtual int ReadChar( unsigned char& ch ) = 0;
  ...
};

I for one have implemented my SAX support this way so people using the toolkit can implement streams however they see fit, esp. if STL support on their platform is a bit shaky.

> > - solve the UTF-8/UTF-16 problem by having two namespaces: a SAX_UTF8
> > and a SAX_UTF16 namespace (since you're using std::istream, you are
> > assuming compiler support for namespaces); this will work nicely with
> > namespace aliases (eg namespace SAX = SAX_UTF8).

We've had problems with namespaces not being supported with some compilers, so it is best to avoid them. That is the reason why I suggest all interface/class names are prefixed with SAX, CSAX or better still ISAX.

Regards,

Richard.

From Colas.Nahaboo at sophia.inria.fr Fri Dec 3 15:32:32 1999
From: Colas.Nahaboo at sophia.inria.fr (Colas Nahaboo)
Date: Mon Jun 7 17:18:19 2004
Subject: [SML] Whether to support Attribute or not?
In-Reply-To: Your message of "Mon, 29 Nov 1999 02:59:23 EST." <Pine.LNX.4.10.9911290225570.2129-100000@cauchy.clarkevans.com>
Message-ID: <199912031531.QAA22952@aye.inria.fr>

"Clark C.
Evans" writes:
> Take the following HTML fragment:
> <table border="2" cellpadding="50">
> <tr><td>One</td><td>Two</td></tr>
> <tr><td colspan="2">Three</td></tr>
> </table>
> I clearly see the different role that content plays
> as opposed to attributes. The border and cellpadding
> attributes *modify* the state of the table; where
> the tr element content is *part-of* the table.

mmm, if you really look at it, things are muddied because you forget that <table> is an object having a field "rows", which is a list of elements of "type" <tr>, and that <tr> is an object having a field named "cells" having a list of <td>s as values.

Now, replace the word "field" by "attribute" and you see that contents is actually a kind of attribute, with its real name omitted and implicit (actually, made somewhat explicit in the DTD). Everything is confused in XML because everything is of type "text", that you mix n match everywhere.

For me, your example *should* be written in an ideal XML 2:

<table border="2" cellpadding="50"
  rows= <tr cells=<td contents="One"/><td contents="Two"/>/>
        <tr cells=<td colspan="2" contents="Three"/>/>
/>

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas

From matthew at praxis.cz Fri Dec 3 15:54:21 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain>
Message-ID: <3847E71D.3F32FFA9@praxis.cz>

David Megginson wrote:
> Would you require schema processing, then, for object exchange (or in
> other words, would there be no equivalent of the DTD-less XML
> document)?

Yes and no. A schema-less XML document is absolutely fine for most things. If you need object semantics then you need a schema. The good news is that, for things like metadata attached to documents, the number of schemas will most likely be small and well-known. If I have the schema for the Dublin Core on my machine already, for example, then I can interpret any instance based on the Dublin Core.

Obviously the knowledge about the schema can be compiled into specific classes for processing instances based on this schema, as I already mentioned. This will no doubt be necessary for efficiency; it would be unrealistic to reparse the schema every time an instance needs to be processed.

Cheers,
Matthew

From vlashua at RSGsystems.com Fri Dec 3 16:06:44 1999
From: vlashua at RSGsystems.com (Vane Lashua)
Date: Mon Jun 7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <A51F7543E295D2118D6600A024CDB2F71B9D76@MAILPROD>

I think you're mixing apples and oranges.

An even simpler declaration of your example below -- and correct in XML -- would be:

<Point value="12in,2cm;RFFx,G0,B0" />

or:

<processingsegment lang="Java" content="Point {Length x; Length y; Color color;};" />

or:

<?Java Point {Length x; Length y; Color color;}; ?>

XML is a storage medium. Java source code is a storage medium. XML may contain Java source code syntax, as Java source code may contain XML syntax, but both need processors to do more.

And by the way, SGML grew out of a world of extremely limited and narrowly typed data processing _and_ fixed length records. Data typing in SGML is as simple as adding an attribute to an element declaration. It is up to a processor to know how to use it. Just as it is in Java.

Vane

-----Original Message-----
From: Colas Nahaboo [mailto:Colas.Nahaboo@sophia.inria.fr]

Matthew Gertner writes:
> The aspects of
> object-oriented design that are missing are then inheritance and
> polymorphism.

In my opinion, things may be more simple.
If SGML/XML had not been designed by people living in a typeless world (text documents), XML could have provided a much better medium to express object instances, with such a simple design as getting rid of element contents, and allowing attribute contents to be XML, e.g.:

<Point x=<Length unit="inches" value="12"/>
       y=<Length unit="cm" value="2"/>
       color=<RGB R=<Number base="16" value="FF"/> G="0" B="0"/>
/>

matching a C/C++/Java... declaration of Point as:

Point {
  Length x;
  Length y;
  Color color;
};

As you can see, this would be a very elegant and natural way to express object instances (aka serialization). One would of course need a schema language on top of that (to express what I wrote in a C-like declaration), but having to tweak the "low-level" serialisation to fit in the current XML 1.0 recommendation is I think the original sin of XML, which pollutes a lot of the discussions I see here. For instance the current drive for removing attributes results from this.

But, just like RDF, we could standardise on this non-XML-1.0-compatible (let's call it GXML for Generalized XML :-) representation, and devise a canonical way to express it in XML. For instance:

<Point>
  <_x><Length unit="inches" value="12"/></_x>
  <_y><Length unit="cm" value="2"/></_y>
  <_color><RGB G="0" B="0">
    <_R><Number base="16" value="FF"/></_R>
  </RGB></_color>
</Point>

but we could devise others...

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas

From DuCharmR at moodys.com Fri Dec 3 16:26:26 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:18:19 2004
Subject: Any XML Schemas validators out yet ?
Message-ID: <01BA10F0CD20D3119B2400805FD40F9F2781C0@MDYNYCMSX1>

>I tried this but MS Explorer 5.0.2919 reports this error in the XML Schema proposal:
>Attribute 'xmlns:' must be a #FIXED attribute. Line 17, Position 18
>Maybe I'm using the wrong Schema,
>"http://www.w3.org/TR/1999/WD-xmlschema-1-19991105/structures.dtd" ?

Using (X/NT)emacs+PSGML and then the IBM (and now Xerces) validating Java parser, all the WD-xmlschema* drafts worked for me. I've only dabbled in the most simplistic levels of XML support in IE so far, so I don't know what the problem would be there.

Bob DuCharme www.snee.com/bob <bob@ snee.com>
"The elements be kind to thee, and make thy spirits all of comfort!" Anthony and Cleopatra, III ii

From KenNorth at email.msn.com Fri Dec 3 16:41:47 1999
From: KenNorth at email.msn.com (KenNorth)
Date: Mon Jun 7 17:18:19 2004
Subject: Rocket framework for creating Web sites
Message-ID: <000001bf3dad$a0c0f520$0b00a8c0@grissom>

Michael Floyd just released the Rocket framework:

"In a nutshell, Rocket is a collection of skeleton XML documents, XSL style sheets, and DTD's that you can use as a basis for creating your own XML-based Web site. Using Rocket, you can transform XML documents and serve them to any browser, regardless of its capabilities. Rocket also allows you exchange XML streams between XML-capable browsers and HTTP servers."

Check his BeyondHTML site:

http://www.beyondhtml.com/rocket/rocket.xml

From wunder at infoseek.com Fri Dec 3 16:55:54 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:18:19 2004
Subject: A processing instruction for robots
In-Reply-To: <u1z941b1f.fsf@lanber.cam.citrix.com>
References: <Walter Underwood's message of "Thu, 02 Dec 1999 13:58:58 -0800"> <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
Message-ID: <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com>

At 12:09 PM 12/3/99 +0000, Toby Speight wrote:
>
>It may be an idea to provide a NOTATION identifier for the processing
>instruction, rather than binding it to the specific word "robots". It
>depends on the trade-off you want to make between implementor convenience
>and author generality. If you've thought about it and decided against,
>it's probably worth a comment in your proposal explaining your rationale.

Good point. Since the target of the PI is "any robot that cares", the notation would need to point to something other than a particular robot, probably the spec. That is a namespace-like use of the notation. In that case, should the spec require that processors check for the correct notation before interpreting the PI?

I'm a bit wary of making things more complex. The robot world is the natural home of the Desperate Perl Hacker, so I'd like the spec to be understandable in 30 seconds or less.

wunder
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946

From clark.evans at manhattanproject.com Fri Dec 3 17:01:39 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:19 2004
Subject: [SML] Whether to support Attribute or not?
In-Reply-To: <199912031531.QAA22952@aye.inria.fr>
Message-ID: <Pine.LNX.4.10.9912022359110.18460-100000@cauchy.clarkevans.com>

On Fri, 3 Dec 1999, Colas Nahaboo wrote:
> For me, your example *should* be written in an ideal XML 2:
>
> <table border="2" cellpadding="50"
> rows= <tr cells=<td contents="One"/><td contents="Two"/>/>
> <tr cells=<td colspan="2" contents="Three"/>/>
> />

First John's syntax mutation,

<element <att <nested>val</nested> > />

and now your syntax mutation,

<element att=<nested>val</nested> />

Pretty. Yum.

Clark

From Colas.Nahaboo at sophia.inria.fr Fri Dec 3 17:11:09 1999
From: Colas.Nahaboo at sophia.inria.fr (Colas Nahaboo)
Date: Mon Jun 7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: Your message of "Fri, 03 Dec 1999 11:03:27 EST." <A51F7543E295D2118D6600A024CDB2F71B9D76@MAILPROD>
Message-ID: <199912031710.SAA22137@koala.inria.fr>

Vane Lashua writes:
> I think you're mixing apples and oranges.
I see it the other way: I try to make people realize that they are the same, and the current artificial limits in the XML syntax make people stumble on artificial syntax problems.

> An even simpler declaration of your example below -- and correct in XML --
> would be:
> <Point value="12in,2cm;RFFx,G0,B0" />

This is not XML. You invent a sub-language to describe the contents of the value attribute. You will then need XML and an XML parser to understand the outer XML, and you will have to invent and specify the inner language, and design and implement the parser, which is *more* complex than a unified "XML 2" language. (Note that SVG did just that with the contents of the path element :-).

People tend to invent plenty of these sub-languages and mentally hide them under the rug, failing to see that they did not simplify anything, just made things more complex in more places in many different - and often unspecified - ways.

> XML is a storage medium. Java source code is a storage medium. XML may
> contain Java source code syntax, as Java source code may contain XML syntax,
> but both need processors to do more.

Yep, but if you look at my example, you can see that I got rid of any sub-language!!! I only need an XML parser, nothing else at the parsing level. I still need the upper semantic level, of course, but at least I don't have to have plenty of different lexical parsers (and specs) to describe my data.

The SVG example is striking. To implement an SVG viewer, you need to have an XML parser, plus plenty of other parsers to parse the sub-languages invented in the different attributes and contents of the SVG XML, including a full CSS and HTML parser...

Note that I described only the object instances, NOT the class structures (this belongs to schemas, not to the XML level), and they are not Java; they could represent C++, Common Lisp, Python, ... objects!

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From jtauber at jtauber.com Fri Dec 3 17:14:24 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz>
Message-ID: <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com>

> Yes and no. A schema-less XML document is absolutely fine for most
> things. If you need object semantics then you need a schema.

But a schema doesn't tell you the semantics (although certain schema languages might tell you how certain element types relate to others).

When a human devises FooML, they generally come up with a vocabulary of labels (perhaps made universally unique via namespaces), a bunch of syntactic constraints (a schema), some human prose describing what the labels mean, and maybe some code for doing cool stuff with FooML documents. (Ultimately the human prose would probably be used by other people to write code for doing cool stuff, too.)

The semantics are in the human prose and the actions the code performs. Not the schema.

James Tauber

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From Toby.Speight at streapadair.freeserve.co.uk Fri Dec 3 17:45:10 1999
From: Toby.Speight at streapadair.freeserve.co.uk (Toby Speight)
Date: Mon Jun 7 17:18:19 2004
Subject: A processing instruction for robots
In-Reply-To: Walter Underwood's message of "Fri, 03 Dec 1999 08:55:16 -0800"
References: <Walter Underwood's message of "Thu, 02 Dec 1999 13:58:58 -0800"> <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <u1z941b1f.fsf@lanber.cam.citrix.com> <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com>
Message-ID: <uogc7ncky.fsf@lanber.cam.citrix.com>

Walter> Walter Underwood <URL:mailto:wunder@infoseek.com>

0> In article <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com>,
0> Walter wrote:

Walter> At 12:09 PM 12/3/99 +0000, Toby Speight wrote:
>> It may be an idea to provide a NOTATION identifier for the
>> processing instruction, rather than binding it to the specific
>> word "robots". It depends on the trade-off you want to make
>> between implementor convenience and author generality. If
>> you've thought about it and decided against, it's probably
>> worth a comment in your proposal explaining your rationale.

Walter> Good point. Since the target of the PI is "any robot that
Walter> cares", the notation would need to point to something other
Walter> than a particular robot, probably the spec. That is a
Walter> namespace-like use of the notation.

That's exactly how I see it, too (though I don't want to use the phrase "point to" as I consider myself strictly on the fence of the great "Namespace As Locator" debate - "identify" may be a better word).
Walter> In that case, should the spec require that processors check
Walter> for the correct notation before interpreting the PI?

Not only that, you may choose to specify that a PI with that notation should be honoured, *no matter what the local name is in the document*. I don't think this would fly with the DPH, though.

Walter> I'm a bit wary of making things more complex. The robot world
Walter> is the natural home of the Desperate Perl Hacker, so I'd like
Walter> the spec to be understandable in 30 seconds or less.

This is the argument against doing it the completely generalised, indirect way (the SGML Way). It may mean that you have to decide that it's asking too much to expect the indexers to process PIs according to their notations. What I'm saying is that this has to be an informed decision, and the spec should clearly report the alternatives considered - but I know *I'm* not qualified to actually decide.

--

From matthew at praxis.cz Fri Dec 3 17:43:47 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz> <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com>
Message-ID: <38480038.6028FEF2@praxis.cz>

James Tauber wrote:
>
> > Yes and no. A schema-less XML document is absolutely fine for most
> > things. If you need object semantics then you need a schema.
>
> But a schema doesn't tell you the semantics (although certain schema
> languages might tell you how certain element types relate to others).

I posted a mail earlier today entitled "Web Vision". It explains more clearly what I have in mind. It is a relatively new idea but I feel that it is possible to pack a lot of useful semantics into a schema.

Matthew

From Erlend.Overby at usit.uio.no Fri Dec 3 18:04:18 1999
From: Erlend.Overby at usit.uio.no (Erlend Øverby)
Date: Mon Jun 7 17:18:19 2004
Subject: SGML the next big thing?
Message-ID: <NABBLDAMACKBNFGHOFJCEEGKENAA.Erlend.Overby@usit.uio.no>

I think it is time to take a short break, raise the head and have a look at the XML landscape. To me it seems that we are trying to reinvent SGML, but in a much more complex way.

From SGML we need the following features:
- Groves
- HyTime
- TopicMaps
- Subdoc
- Architectural Forms
- #CONREF
- #NOTATIONS
- #CURRENT
- Public Identifiers

What we don't need from the SGML standard:
- SGML Declaration
- Character entities
- Minimisation
- The "&" construct

Getting rid of these "features" will make it much easier to process and implement systems based on SGML.
From XML we need the following features:
- Unicode
- /> for empty elements
- The concept of well formed
- XSL
- XSLT

I would like to try to show how some of the features from SGML could be used:

Architectural forms: To be able to exchange information in a proper manner, the industry has to agree on an Architecture (Common DTD). This will avoid the need for transformations between different DTDs, since the information is based on the same architecture.

#NOTATIONS: This could be used to inform the processor about what kind of information this is. It could be of type MathML, Chemical Markup, HTML, TIFF etc. Or it could say that the information should be processed according to the ISO 8601 date specification, etc. Just to give an idea.

It is time to combine the best from SGML and XML. Today XML is too simple; it lacks several important features needed in a structured information environment. By combining the best from SGML and XML we will have a new working standard. This new combination should be the preferred platform for everyone who works with structured information, documents or data.

The best XML has done for the information community is the awareness of structured information, and how important this is for the business case. Now it is time to sell SSGML (Simplified Standard Generalized Markup Language) :-)

Btw: Charles Goldfarb did not invent SGML, he discovered it :-)

Best
Erlend Øverby
--
Thinking is the hardest work there is, which is probably the reason why so few engage in it. (Henry Ford)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From uche.ogbuji at fourthought.com Fri Dec 3 18:08:58 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:18:19 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: Your message of "Thu, 02 Dec 1999 16:27:42 EST." <14406.58446.675568.388482@localhost.localdomain>
Message-ID: <199912031808.LAA14313@localhost.localdomain>

> I think that there is a growing need for a common C++ SAX 1.0
> interface as XML moves more and more into high-performance
> environments. I have kept pointers that people sent to quite a few
> existing attempts, but before I look those over, I'd like to try my
> own off the top of my head.
>
> I'll be posting three follow-up messages on SAX/C++ to stimulate
> discussion:
>
> 1. Some C++-specific SAX design principles.
> 2. Implementation changes required or possible in C++.
> 3. My first stab at a core SAX 1.0 C++ interface.
>
> I know that SAX2 is still being neglected, and I apologize.

I would really like you to reconsider this ordering of priorities. SAX2 is urgently needed for DOM implementors and developers of XSLT engines with streaming output. You have done an admirable job of leading the SAX and SAX2 development, and it is dying without your output. Now if you were simply buried with 9-5 work, and couldn't lend your efforts, it would be understandable. But to detract from SAX2 in order to focus on SAX/C++, I think is muddling the priorities.
--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From david at megginson.com Fri Dec 3 18:14:13 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:19 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <199912031808.LAA14313@localhost.localdomain>
References: <14406.58446.675568.388482@localhost.localdomain> <199912031808.LAA14313@localhost.localdomain>
Message-ID: <14408.2087.817611.250771@localhost.localdomain>

uche.ogbuji@fourthought.com writes:
> I would really like you to reconsider this ordering of priorities.
> SAX2 is urgently needed for DOM implementors and developers of XSLT
> engines with streaming output. You have done an admirable job of
> leading the SAX and SAX2 development, and it is dying without your
> output. Now if you were simply buried with 9-5 work, and couldn't
> lend your efforts, it would be understandable. But to detract from
> SAX2 in order to focus on SAX/C++, I think is muddling the
> priorities.

What in SAX2 is most urgently needed for DOM and XSLT? I know that DOM level one *can* support some things that SAX doesn't report (such as comments and CDATA section boundaries), but there is nothing in DOM level one that says those have to be included, and I've heard of relatively few real-world applications that need that information.

I'm not as familiar with the situation in XSLT, and information would be helpful.
All the best,
David

From david at megginson.com Fri Dec 3 18:14:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:19 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <199912031808.LAA14313@localhost.localdomain>
References: <14406.58446.675568.388482@localhost.localdomain> <199912031808.LAA14313@localhost.localdomain>
Message-ID: <14408.2102.136185.50050@localhost.localdomain>

uche.ogbuji@fourthought.com writes:
> I would really like you to reconsider this ordering of priorities.
> SAX2 is urgently needed for DOM implementors and developers of XSLT
> engines with streaming output. You have done an admirable job of
> leading the SAX and SAX2 development, and it is dying without your
> output. Now if you were simply buried with 9-5 work, and couldn't
> lend your efforts, it would be understandable. But to detract from
> SAX2 in order to focus on SAX/C++, I think is muddling the
> priorities.

What in SAX2 is most urgently needed for DOM and XSLT? I know that DOM level one *can* support some things that SAX doesn't report (such as comments and CDATA section boundaries), but there is nothing in DOM level one that says those have to be included, and I've heard of relatively few real-world applications that need that information.

I'm not as familiar with the situation in XSLT, and information would be helpful.

All the best,
David
--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From jtauber at jtauber.com Fri Dec 3 18:18:25 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz> <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com> <38480038.6028FEF2@praxis.cz>
Message-ID: <014a01bf3dba$dfff38c0$eb020a0a@bowstreet.com>

> > But a schema doesn't tell you the semantics (although certain schema
> > languages might tell you how certain element types relate to others).
>
> I posted a mail early today entitled "Web Vision". It explains more
> clearly what I have in mind. It is a relatively new idea but I feel that
> it is possible to pack a lot of useful semantics into a schema.

Are you achieving this by expressing how certain element types relate to other element types and to concepts? A semantic network? If so, you are still ultimately relating the elements to concepts you are probably going to define by human prose or running code.

I'm not arguing with this idea. I think it probably has some promise. But the real semantics are ultimately introduced into the system by agreed-to concepts that aren't expressed via schemata. A schema is part of the picture, but not the whole.
I'll go back and read your Web Vision post.

James

From jtauber at jtauber.com Fri Dec 3 18:20:08 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:18:20 2004
Subject: SGML the next big thing?
References: <NABBLDAMACKBNFGHOFJCEEGKENAA.Erlend.Overby@usit.uio.no>
Message-ID: <015501bf3dbb$1de0be70$eb020a0a@bowstreet.com>

> Btw: Charles Goldfarb did not invent SGML, he discovered it :-)

Sounds like another Ted Nelson quote.

James

From david at megginson.com Fri Dec 3 18:22:56 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
Message-ID: <14408.2610.245842.199581@localhost.localdomain>

I'd like to hear what others think on this issue. There was some interest in SAX2 when I posted my alpha interfaces a few months back (most notably, but not exclusively, from David Brownell), but it was hardly a tidal wave. On the other hand, I am noticing a building pressure from implementors to get something out in C++.

I can think of a few reasons that the world might desperately be waiting for SAX2:

1. To get some kind of standard Namespace support (or at least a way to tell whether a parser has Namespace support built in).
2. To query parser features in general.
3. To get at the stuff that SAX 1.0 doesn't report, like comments, CDATA boundaries, and DTD declarations.

I think that there is a real need for #1, since many other specs (XSL, XML Schema, RDF, XHTML, etc.) are built on top of Namespaces.

I think that #2 would make life a fair bit easier for library developers, but it's not as critical (Simon St-Laurent will be grateful, though).

I have a lot of trouble with #3, though. There are a few specialized fields where this stuff isn't just syntactic fluff (repositories and editing tools spring immediately to mind), but in general, very, very, very few real-world XML applications need to know about anything but elements, attributes, and character data -- witness the recent SML discussion.

I'm very interested in hearing other opinions. Having a standard streaming interface stimulated a lot of development of reusable Java XML processing components, and I'd like to see the same thing happen in C++, but I need to hear what other people think the priorities should be.

All the best,
David
--
David Megginson david@megginson.com
http://www.megginson.com/

From clark.evans at manhattanproject.com Fri Dec 3 18:38:11 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:20 2004
Subject: SGML the next big thing?
In-Reply-To: <NABBLDAMACKBNFGHOFJCEEGKENAA.Erlend.Overby@usit.uio.no>
Message-ID: <Pine.LNX.4.10.9912030138330.18460-100000@cauchy.clarkevans.com>

On Fri, 3 Dec 1999, Erlend Øverby wrote:
> Btw: Charles Goldfarb did not invent SGML, he discovered it :-)

I believe in this whole-heartedly.

Clark

From clark.evans at manhattanproject.com Fri Dec 3 19:15:13 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:20 2004
Subject: FYI: YML: A Grand Unification of SAX and DOM? (fwd)
Message-ID: <Pine.LNX.4.10.9912030214230.18460-100000@cauchy.clarkevans.com>

Hello everyone,

I'd like to carry on a sub-discussion of SML called YML on the SML list. It is of particular importance to the interaction between XML and XSL.

Clark

---------- Forwarded message ----------
Date: Fri, 3 Dec 1999 02:13:25 -0500 (EST)
From: Clark C. Evans <clark.evans@manhattanproject.com>
To: sml-dev@eGroups.com
Subject: YML: A Grand Unification of SAX and DOM?

Paul,

I didn't want to speak up a few days ago -- claiming that I was going to do a grand unification of SAX and DOM (even though this was exactly what I was thinking may be the case).

On Thu, 2 Dec 1999, Paul Tchistopolskii wrote:
> > Isn't it the end of long discussion of Elements vs Attributes?
> > Now when I see the question: "Should I use attributes or
> > elements?" - I know the answer:
> >
> > "If you want it to be processed by current APIs not keeping
> > the entire document in the memory - use attributes everywhere
> > you can."

Exactly...
It is a matter of what type of access you would like to have when processing the information -- you have two choices:

  Random Access (DOM) or Sequential Access (SAX)

The trick, however, is subtle. You don't want _all_ random access nor do you want _all_ sequential access. You want a balance. And this is where the binary doubly recursive pattern comes in.

On Thu, 2 Dec 1999, G. Ken Holman wrote:
> I posit that expressing properties of hierarchical components as
> sub-elements of ancestry does not work well in information design
> for the following reasons:
>
> (1) - I claim that the information in <b> has no (and should have no)
> direct relation to the information in <e>, but that the information in
> attr1= may or may not have direct relation to the information in <e>
> because of descendent scope (<e> is a descendant of <a> but not of <b>
> so how could <b>'s "influence" be construed as impacting on <e>?)
>
> (2) - when I am processing <e> (say with XSLT and XPath) I can very
> easily determine properties of <e> by examining ascendent places in
> the hierarchy (<a> is an ascendant of <e> therefore <a> and its
> attributes are easily addressed via the ancestor:: axis)
> - if I didn't have attributes and I was obliged to use sub-elements,
> the extra processing involved to examine all child elements of all
> ancestors for possible applicable properties would be both unwieldy
> and ambiguous

This is a great summary.

Over the last day I jotted down a rough sketch of the idea that has been running through my head for the last few days. It is posted at http://clarkevans.com/yml.html with a text version below. It's *far* from perfect, but I wanted to get the idea out as a cohesive unit so we can work on it as a community. I'd like to hear what you all think...
Clark

---------------------------------
YML - The Why Markup Language
----------------------------------

Authors: Clark Evans
History: Version .1, 03-DEC-1999

Summary

YML is currently an assembly of thoughts regarding the creation of a doubly recursive markup language and parser description. YML is an extension of the simple markup language ("SML"), which is a strict subset of the extensible markup language ("XML"). Further, YML is a unification of the XML document object model ("DOM") and the simple application programming interface for XML ("SAX").

Motivation

YML was motivated by two recurring debates on the XML list, under the titles "SAX vs DOM" and "element vs attribute". It is interesting how they are interwoven.

The SAX vs DOM debate often centers around which is better for processing information: a random access method (RAM) or a sequential access method (SAM). Those from the DOM camp state that having the entire document in memory makes things easy to program, while those from the SAX camp point to the efficiencies of stream processing.

The element vs attribute debate is concerned with the distinction between an element's content and its attributes. One camp believes that the difference reflects a clear contextual binary decomposition, while the other camp views the distinction as syntactic sugar.

These debates are subtly linked, since SAX provides an accepted, de-facto interpretation: sequential access to each element comes with a random-access collection of its attributes. This interpretation is of huge value, as it ties the element vs attribute debate to a more tangible processing concern, sequential access or random access.

SAX is of interest for one other reason. It does not notify the processor individually for each attribute -- instead, it waits until it has collected each and every attribute before providing them as a single collection. This is in sharp contrast to its treatment of elements, which are handled individually, one by one.
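The SAX behaviour described in the Motivation section above can be observed with a small sketch. It uses Python's standard-library `xml.sax` binding rather than the Java interfaces discussed on the list; the `Recorder` class and the sample document are illustrative assumptions, not part of the YML note.

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Hypothetical handler that logs the events SAX delivers."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # All attributes of the element arrive together, as one mapping
        # that can be read in any order (random access).
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        # Elements are reported one at a time, in document order.
        self.events.append(("end", name))


recorder = Recorder()
xml.sax.parseString(b'<root r="x" q="y"><s1/><s2/></root>', recorder)

for event in recorder.events:
    print(event)
```

The first event carries both `r` and `q` at once, while `s1` has already been opened and closed by the time `s2` is reported.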

Another Motivation: Transformation Languages

The real value in XML isn't just data representation or ease of parsing; it is the promise of a transformation language expressed in XML itself. The XML style sheet language ("XSL") is one approach to markup language transformations. XSL is the composite of many wonderful constructs, lessened by a few particularly bad restrictions.

The delightful recursive template matching system is XSL's claim to fame. XSL is a collection of such templates, where a match clause identifies an expression which will trigger particular elements (and not attributes) to be processed according to the rules provided. These 'xpath' expressions define multiple axes. The ancestor axis is the most important; it is the current element stack. Of secondary importance is the attribute axis, which allows access to an element's attributes. Together, these axes allow for a very powerful way to identify and process elements.

One disturbing aspect of XSL is the inclusion of the forward and previous axes in this xpath expression syntax. Furthermore, loop constructs and the ability to re-visit elements were also included in the language. For an XSL processor to reasonably support these features, random access to the information is a requirement. This is a problem for large information sets or low-memory processing devices.

There are a few individuals who are contemplating a stream-based alternative to XSL which will work without these large memory requirements. Assuming that SAX was the underlying basis for such a processor, the only items available in random-access memory at any given time would be an element stack, including each element's attributes. This is hardly good enough to be efficient.

An extension to a minimal processor built on top of SAX could enable access to those elements on the "previous" axis -- as long as they are mentioned somewhere in the stylesheet.
A smart collector could identify an element which must be used later, pinning it for random access in the future. Unfortunately, this method would not work well with dynamically generated xpath expressions. There are many other concerns as well, such as how to accomplish sorting, repeat performances, and other clear benefits that random access brings. However, so far, there has not been a clear approach.

Direction

The goal of YML is to be a building block upon which an alternative to XSL can be built: one that is more space-efficient than XSL, yet does not sacrifice time as a pure stream-based alternative appears to do.

To accomplish this, memory must be managed differently at the parser level; thus a new parser description ("PD") must be provided -- one that balances the constraints of SAX with the power of DOM. And, to accomplish this, the syntax of the markup language ("ML") itself must be substantially altered. Strictly speaking, the ML could easily be an XML extension; however, the data model presented here would be too hard to grok with all of XML's subtleties.

These are serious changes; however, if it is possible to unify SAX and DOM, and perhaps enable the generation of a better transformation processor, it may be worth it.

Background

Consider these included by reference:

  http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Nov-1999/1120.html
  http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Nov-1999/1136.html
  http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Nov-1999/1205.html
  http://www.egroups.com/group/sml-dev/31.html
  http://www.egroups.com/group/sml-dev/89.html

Development

Consider an enhanced SAX parser with an element stack enabling the new XSL processor random access to the entire ancestor and attribute axis, with sequential access otherwise. Consider further:

  <root r="x">
    <s1/>
    <s2/>
  </root>

Here, both sequential access nodes s1 and s2 have random access to the node r.
However, these nodes cannot access each other since they are provided sequentially: when s1 is visited, s2 has not yet been provided. Also, when s2 is visited, s1 has already been dropped from memory. Note that this is recursive:

  <root r="x">
    <s sr="a">
      <ss/>
    </s>
  </root>

Here it is clear that the node ss can access nodes s, sr, and r. So far so good. Let's enumerate the possible node types:

  s, r, sr, ss, ssr, sss, sssr, ...

Notice that given the current XML syntax, and this processing model, random access nodes with children are not allowed. In other words, a given xpath may only consist of sequentially accessed nodes followed by an optional, random access tail node.

Meat

It is shown below how a change in XML's syntax to permit recursion on the attribute axis would allow a parser to be built having random access nodes allowing children.

It is hypothesized that this syntax change would allow the construction of a parser that could be used in lieu of both DOM and SAX, giving random access or sequential access in a context-sensitive manner, as a function of the source information.

It is further hypothesized that this parser could be used to build a processor that has most of XSL's advantages without sacrificing performance.

The Change

Consider the following syntax (due to John Cowan):

  <root
    <r <rr/> >
      <rs/>
    </r>
  >
    <s <sr/> >
      <ss/>
    </s>
  </root>

With this change, it is possible to generate all of the possible node types:

  r s rr rs sr ss rrr rrs rsr rss srr srs ssr sss ...

This may not be the prettiest syntax; however, XML becomes a subset of this new syntax -- where the following definition is used for backwards compatibility.
    <el att="val" />  <=>  <el <att>val</att> />

And perhaps the following syntactic sugar could be allowed for nested attributes (due to Colas Nahaboo):

    <el att=<ch>val</ch> />  <=>  <el <att <ch>val</ch>/>/>

Further, there should be no problem for XML parsers to enable recognition of the new syntax, since neither of the above expressions is well-formed XML. Thus, this is the basis for a completely different parser behavior that alternates between random access and serial access depending upon the type of node which is encountered, according to something like the following:

    interface yml-node   { boolean is-random(); }
    interface yml-branch extends yml-node { String name(); }
    interface yml-leaf   extends yml-node { String text(); }

    interface yml-stack
    {
        yml-node  current();
        yml-stack parent();

        // list of random children
        int       count();
        yml-stack child(int i);

        // private
        yml-stack(yml-stack parent, yml-node current);
        add( yml-stack child );
        complete();
    }

    interface yml-output
    {
        void push( yml-stack element );
        void leaf( yml-stack element );
        void pop( yml-stack element );
    }

    interface yml-input
    {
        // to be defined
    }

    void yml-process( yml-input in, yml-output out,
                      yml-stack stack, boolean is-random )
    {
        if(in.peek-is-leaf())
        {
            yml-stack top = new yml-stack( stack,
                                new yml-leaf(is-random, in.next()) );
            if(is-random) stack.add(top);   // pin for random access
            else          out.leaf(top);    // report sequentially
            return;
        }

        // it's a branch
        yml-stack top = new yml-stack( stack,
                            new yml-branch(is-random, in.next()) );

        // children inside the tag are random
        while(in.inside-the-tag())
            yml-process(in, out, top, true);
        top.complete();
        if(!is-random) out.push(top);

        // children outside the tag are sequential
        while(in.outside-the-tag())
            yml-process(in, out, top, false);
        if(is-random) stack.add(top);
        else          out.pop(top);
        return;
    }

Thus, if the entire output uses the "random recursion" extreme, with only a single node (the top one) being sequential, then this method looks very much like DOM. In the other extreme, if "sequential recursion" is used, with an occasional attribute, then this method looks very much like SAX.
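The recursion in yml-process can be tried out in a few lines. What follows is only a rough sketch in Python, under the assumption that the input has already been turned into a small tree (the yml-input interface is left undefined in the post). Node, Frame, visible and process are hypothetical names chosen for the sketch, and leaf text is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    inside: List["Node"] = field(default_factory=list)   # children written inside the tag (random)
    outside: List["Node"] = field(default_factory=list)  # children written after the tag (sequential)


@dataclass
class Frame:
    node: Node
    parent: Optional["Frame"]
    pinned: List["Frame"] = field(default_factory=list)  # random children kept in memory


def visible(frame):
    """Names of the random nodes pinned on this frame and on each of its ancestors."""
    names = []
    while frame is not None:
        names.extend(f.node.name for f in frame.pinned)
        frame = frame.parent
    return names


def process(node, events, parent=None, is_random=False):
    """Mirror of the branch case of yml-process; the root must be sequential."""
    top = Frame(node, parent)
    for child in node.inside:        # inside the tag: always random
        process(child, events, top, True)
    if not is_random:
        events.append(("push", node.name, visible(top)))
    for child in node.outside:       # outside the tag: always sequential
        process(child, events, top, False)
    if is_random:
        parent.pinned.append(top)    # keep the node for random access
    else:
        events.append(("pop", node.name))


# The example tree from "The Change": root has random child r and sequential child s.
root = Node("root",
            inside=[Node("r", inside=[Node("rr")], outside=[Node("rs")])],
            outside=[Node("s", inside=[Node("sr")], outside=[Node("ss")])])
events = []
process(root, events)
for event in events:
    print(event)
```

In this sketch the sequential node ss is reported with sr and r available to it, which matches rule-of-thumb access along the ancestor chain. One thing the sketch makes visible, at least as it reads the pseudocode, is that a sequential child of a random node (rs here) is reported before the enclosing sequential element is pushed.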
However, if the input stream is a unique mixture then the result is surprising: the parser configures its memory usage subject to the structure of the information being processed. Thus, a unified parser is created.

For a transformation processor built on top of this type of parser, this motivates an additional 'random' axis. Define a sequential node as one that is visited by the processor's interface and dropped from memory afterwards. Define a random node as one not visited by the processor's interface, but made available through a random access method. Access to random nodes by sequential nodes is provided by the following rules:

(a) A sequential node may reference its own random siblings, or those of any of its sequential ancestors.

(b) If a sequential node may reference a random node, then it may also reference any random children of that random node.

Furthermore, if XML syntax compliance is absolutely needed -- attribute notation could be used to mark random nodes:

    <root>
      <r random-access="yes">
        <rr random-access="yes"/>
        <rs/>
      </r>
      <s>
        <sr random-access="yes" />
        <ss/>
      </s>
    </root>

Alternatively, the distinction between sequential and random nodes could be detailed in a DTD or some other schema document. However, I feel all of these are kluges and that John Cowan's syntax is the best expression of the idea. So the syntax becomes a bit more complicated... maybe. I say make the lower level a bit more complicated... to simplify everything else.

It may be a ways off; however, I believe that this binary recursive method provides a novel and unexpected approach to making information processors more efficient.

Best Wishes,

Clark Evans

Credits

Too many to mention: first, the xml-dev and sgml-dev and xsl lists, filled with smart people. Second, to Dan Palanza for introducing me to binary recursive models. Further, to the huge amount of philosophical and technical literature out there regarding programming and systems theory that has shaped the manner in which I approach problems.
Thanks!

From simonstl at simonstl.com Fri Dec 3 19:12:37 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <14408.2610.245842.199581@localhost.localdomain>
Message-ID: <199912031912.OAA10962@hesketh.net>

At 01:21 PM 12/3/99 -0500, David Megginson wrote:
>I'd like to hear what others think on this issue. There was some
>interest in SAX2 when I posted my alpha interfaces a few months back
>(most notably, but not exclusively, from David Brownell), but it was
>hardly a tidal wave. On the other hand, I am noticing a building
>pressure from implementors to get something out in C++.

As interested as I am in seeing SAX2 emerge (see below), I'll admit that getting SAX out in C++ is probably more immediately important. I avoid C++ and C completely, but I get lots of queries about XML for C/C++/COM other than IE 5.

>I can think of a few reasons that the world might desperately be
>waiting for SAX2:
>
>1. To get some kind of standard Namespace support (or at least a way
>   to tell whether a parser has Namespace support built in).
>
>2. To query parser features in general.
>
>3. To get at the stuff that SAX 1.0 doesn't report, like comments,
>   CDATA boundaries, and DTD declarations.

I think #1 is very important, but #2 makes both #1 and #3 much easier. Those who neither need nor want namespaces may still want other features, and need to make the queries, and once that query process is in place it's easy to define numbers 1 and 3 as 'optional parser features'.
I was pretty pleased with the SAX2 Alpha, and think it may provide enough of #2 that maybe #1 and #3 could be carried out as separate (but affiliated) efforts.

Now that I'm nearly done refinishing my floors, I'm hoping to have more time to devote to proposals like this again. But first I have to go to Philadelphia for that XML '99 thing. I'd love to talk with anyone who's interested about SAX futures there. Meet at the bar, any bar I guess.

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com

From Curt.Arnold at hyprotech.com Fri Dec 3 19:17:24 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun 7 17:18:20 2004
Subject: SGML the next big thing?
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415549@THOR>

Erlend Øverby [Erlend.Overby@usit.uio.no] wrote
>>What we don't need from the SGML standard:
>> - SGML Declaration
>> - Character entities
>> - Minimisation
>> - The "&" construct

It looks like the XML Schema group is trying to add back the & construct. If you have a compelling justification for continued suppression, please rant long and loud.

From tbray at textuality.com Fri Dec 3 19:31:13 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
Message-ID: <3.0.32.19991203112817.014cd7b0@pop.intergate.ca>

At 01:21 PM 12/3/99 -0500, David Megginson wrote:
>1. To get some kind of standard Namespace support (or at least a way
>   to tell whether a parser has Namespace support built in).
>2. To query parser features in general.
>3. To get at the stuff that SAX 1.0 doesn't report, like comments,
>   CDATA boundaries, and DTD declarations.
>
>I think that there is a real need for #1
>I think that #2 would make life a fair bit easier for library developers
>I have a lot of trouble with #3

Agreed, on all points. The unavailability of namespaces threatens to make SAX unusable before too long. -Tim

From lauren at sqwest.bc.ca Fri Dec 3 19:34:29 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:18:20 2004
Subject: SGML the next big thing?
In-Reply-To: <61DAD58E8F4ED211AC8400A0C9B46873415549@THOR>
Message-ID: <199912031930.LAA00641@mail.sqwest.bc.ca>

On 3 Dec 99, at 12:14, Arnold, Curt wrote:
> It looks like the XML Schema group is trying to add back the & construct.
> If you have a compelling justification for continued suppression, please
> rant long and loud.

How about every SGML parser author I've talked to says the & construct was the biggest, hardest part (which means probably the buggiest) of the entire parser? I think the XML WG was right in throwing it out of XML in the first place.

Lauren

From lauren at sqwest.bc.ca Fri Dec 3 19:57:27 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <3.0.32.19991203112817.014cd7b0@pop.intergate.ca>
Message-ID: <199912031953.LAA21889@mail.sqwest.bc.ca>

On 3 Dec 99, at 11:32, Tim Bray wrote:
> At 01:21 PM 12/3/99 -0500, David Megginson wrote:
> >1. To get some kind of standard Namespace support (or at least a way
> >   to tell whether a parser has Namespace support built in).
> >2. To query parser features in general.
> >3. To get at the stuff that SAX 1.0 doesn't report, like comments,
> >   CDATA boundaries, and DTD declarations.
> >
> >I think that there is a real need for #1
> >I think that #2 would make life a fair bit easier for library developers
> >I have a lot of trouble with #3
>
> Agreed, on all points. The unavailability of namespaces threatens
> to make SAX unusable before too long. -Tim

I think SAX availability of namespaces would be useful; the DOM Level 2 (soon to be a Candidate Recommendation, which means "please implement and tell us whether it's possible") has namespace support and the proliferation of SAX to DOM builders means it would be good if SAX and DOM could support more of what the other needs.
I have mixed feelings about CDATA sections; they're useful for things like writing scripts that are embedded in XML documents, so I'd rather have them available, but I can see that not every application needs them.

Lauren

From rja at arpsolutions.demon.co.uk Fri Dec 3 20:13:55 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun 7 17:18:20 2004
Subject: Request for Discussion: SAX 1.0 in C++
References: <14406.58446.675568.388482@localhost.localdomain><199912031808.LAA14313@localhost.localdomain> <14408.2087.817611.250771@localhost.localdomain>
Message-ID: <006501bf3dca$a4767ce0$4a5eedc1@arp01>

> What in SAX2 is most urgently needed for DOM and XSLT? I know that
> DOM level one *can* support some things that SAX doesn't report (such
> as comments and CDATA section boundaries), but there is nothing in DOM
> level one that says those have to be included, and I've heard of
> relatively few real-world applications that need that information.

How about focusing on SAX/2, and making the first C/C++ SAX interface actually SAX 2 so we kill two birds with one stone?

Regards,

Richard.

From vlashua at RSGsystems.com Fri Dec 3 20:13:42 1999
From: vlashua at RSGsystems.com (Vane Lashua)
Date: Mon Jun 7 17:18:20 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <A51F7543E295D2118D6600A024CDB2F71B9D77@MAILPROD>

What's the point of defining "Point"? You are putting it in a context for a processing engine to process. "Point" is meaningless by itself -- even though it may be syntactically correct, in a context, with normalized attributes and values -- without a specific processor that understands what a "Point" is.

Let's say you want to type a value in XML. Easy. ...type="int" value="1"/> . Or ...type="2Dpoint" value="2,2"/> (whether I say 2Dxvalue="2" 2Dyvalue="2" is of no consequence). Both "int" and "2Dpoint" are defined somewhere else. A Java or COBOL processor may be made to understand the type "2Dpoint", but XML never will. It is not a processor any more than valid Java source code is.

Vane

-----Original Message-----
From: Colas Nahaboo [mailto:Colas.Nahaboo@sophia.inria.fr]
Sent: Friday, December 03, 1999 12:11 PM

Vane Lashua writes:
> I think you're mixing apples and oranges.

I see it the other way: I try to make people realize that they are the same, and the current artificial limits in the XML syntax make people stumble on artificial syntax problems.

> An even simpler declaration of your example below -- and correct in XML --
> would be:
> <Point value="12in,2cm;RFFx,G0,B0" />

This is not XML. You invent a sub-language to describe the contents of the value attribute.
You will then need XML and an XML parser to understand the outer XML, and you will have to invent and specify the inner language, and design and implement the parser, which is *more* complex than a unified "XML 2" language. (Note that SVG did just that with the contents of the path element :-). People tend to invent plenty of these sub-languages and mentally hide them under the rug, failing to see that they did not simplify anything, just made things more complex at more places in many different - and often unspecified - ways.

> XML is a storage medium. Java source code is a storage medium. XML may
> contain Java source code syntax, as Java source code may contain XML syntax,
> but both need processors to do more.

Yep, but if you look at my example, you could see that I got rid of any sub-language!!! I only need an XML parser, nothing else at the parsing level. I still need the upper semantic level, of course, but at least I don't have to have plenty of different lexical parsers (and specs) to describe my data.

The SVG example is striking. To implement an SVG viewer, you need to have an XML parser, plenty of other parsers to parse the sub-languages invented in the different attributes and contents of the SVG XML, including a full CSS and HTML parser...

Note that I described only the object instances, NOT the class structures (this belongs to schemas, not to the XML level), and they are not Java; they could represent C++, Common Lisp, Python,... objects!

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas

From xml-dev at teleo.net Fri Dec 3 20:20:28 1999
From: xml-dev at teleo.net (Patrick Phalen)
Date: Mon Jun 7 17:18:20 2004
Subject: Rocket framework for creating Web sites
In-Reply-To: <000001bf3dad$a0c0f520$0b00a8c0@grissom>
References: <000001bf3dad$a0c0f520$0b00a8c0@grissom>
Message-ID: <99120312225003.00844@quadra.teleo.net>

[KenNorth, on Fri, 03 Dec 1999]
:: Michael Floyd just released the Rocket framework:
::
:: "In a nutshell, Rocket is a collection of skeleton XML documents, XSL style
:: sheets, and DTD's that you can use as a basis for creating your own
:: XML-based Web site. Using Rocket, you can transform XML documents and serve
:: them to any browser, regardless of its capabilities. Rocket also allows you
:: exchange XML streams between XML-capable browsers and HTTP servers."
::
:: Check his BeyondHTML site:
:: http://www.beyondhtml.com/rocket/rocket.xml

Can someone explain how this would be an improvement over Cocoon?

http://xml.apache.org/cocoon/

From stele at fxtech.com Fri Dec 3 20:41:38 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:20 2004
Subject: expat meant to be restartable?
Message-ID: <38482AC7.95973198@fxtech.com>

I don't think it is, but I want to check. I'd like to be able to reuse an XML_Parser after I've called XML_Parse with isFinal set to 1. Basically I want to go back and parse a subset of the original file, using modified starting buffer pointer and length, but it doesn't seem to work (I get a JUNK_AFTER_DOC_ELEMENT error).

I would like to avoid creating a new parser for each element subtree I scan.

--
Paul Miller - stele@fxtech.com

From stele at fxtech.com Fri Dec 3 20:50:07 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:20 2004
Subject: simple DOM-style XML parser, in C++?
Message-ID: <38482CCC.756549B3@fxtech.com>

I just "discovered" this list, and I wanted to check into the existence of a simple parser that provides DOM-like access, before I implemented my own (on top of expat). I've heard of SAX, and wonder if the C++ interface does anything like what I want to do.
I'm looking for a simple query interface that provides generic access to elements and attributes, possibly with iterator-style access. Here is an example XML file, and how I would like to go about parsing it:

    <Container name="foo" type="bar">
      <Foo name="element" length="42"/>
      <Object name="object1">
        <SubObject/>
      </Object>
      <Object name="object2">
        <SubObject/>
      </Object>
    </Container>

    // open file (somehow)
    XML::File file(filename);

    // search for a top-level element
    XML::Element element = file.GetElement("Container");

    // query attributes
    XML::Attribute nameAttr = element.GetAttribute("name");
    XML::Attribute typeAttr = element.GetAttribute("type");

    // get attribute values
    std::string name, type;
    nameAttr >> name;
    typeAttr >> type;

    // look for specific sub-element
    XML::Element fooElem = element.GetElement("Foo");

    // read attributes directly
    fooElem.GetAttribute("name") >> name;
    int length;
    fooElem.GetAttribute("length") >> length;

    // loop over elements by iterator
    XML::element_iterator it = element.begin("Object");
    while (it != element.end())
    {
        XML::Element &objElem = (*it);
        objElem.GetAttribute("name") >> name;
        // etc ...
        ++it;
    }

I think something like this is pretty straightforward without all that DOM complexity. I think this interface can be layered on top of expat. Does the C++ SAX interface already do something like this? *Should* I be using DOM instead for something this simple?

--
Paul Miller - stele@fxtech.com
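For illustration, the layering described above (a small query interface built from expat's start and end callbacks) can be sketched briefly. The version below is a minimal sketch in Python rather than C++, using the expat binding from the standard library; Element, get_element, elements, get_attribute and parse are hypothetical names standing in for the proposed XML::Element interface, not an existing library.

```python
import xml.parsers.expat


class Element:
    """A tiny tree node: a name, an attribute dict and child elements."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs)
        self.children = []

    def elements(self, name):
        """Iterate over direct children with the given name."""
        return (child for child in self.children if child.name == name)

    def get_element(self, name):
        """Return the first direct child with the given name, or None."""
        return next(self.elements(name), None)

    def get_attribute(self, name, convert=str):
        """Return an attribute value, converted with the given callable."""
        return convert(self.attrs[name])


def parse(text):
    """Build the tree from expat callbacks and return a document holder."""
    stack = [Element("#document", {})]

    def start(name, attrs):
        elem = Element(name, attrs)
        stack[-1].children.append(elem)
        stack.append(elem)

    def end(name):
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(text, True)
    return stack[0]


sample = """<Container name="foo" type="bar">
  <Foo name="element" length="42"/>
  <Object name="object1"><SubObject/></Object>
  <Object name="object2"><SubObject/></Object>
</Container>"""

doc = parse(sample)
container = doc.get_element("Container")
name = container.get_attribute("name")
length = container.get_element("Foo").get_attribute("length", int)
object_names = [obj.get_attribute("name") for obj in container.elements("Object")]
```

The whole document is held in memory here, so this trades the streaming behaviour of expat for convenience; character data is ignored to keep the sketch short.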

From mbrady at nist.gov Fri Dec 3 21:07:27 1999
From: mbrady at nist.gov (Mary Brady)
Date: Mon Jun 7 17:18:20 2004
Subject: DOM ECMAScript Test Suite
References: <199912031953.LAA21889@mail.sqwest.bc.ca>
Message-ID: <015501bf3dd2$ec275390$293b0681@ncsl.nist.gov>

Hi Everyone,

I've just updated our DOM ECMAScript test suite, available from http://www.nist.gov/xml/ Click on DOM Test Suite. This suite includes ~900 ECMAScript tests that exercise the DOM Level 1 Fundamental, Extended, and HTML interfaces. You can view the results using IE5 by clicking on first the category, and then the particular interface. Options are available for displaying the source code, semantic requirements (which are simply axioms we glean from the spec to organize our thoughts), and the actual specification. Please let me know if you find this useful.

We are in the process of generating equivalent functionality for the Java binding. We are just about finished with the fundamental interfaces, and expect to have a first set, including fundamental and extended, available in early January.

As always, comments/suggestions are greatly appreciated.

Mary Brady
NIST, Conformance Testing
mbrady@nist.gov

From donpark at docuverse.com Fri Dec 3 21:11:59 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <14408.2610.245842.199581@localhost.localdomain>
Message-ID: <000e01bf3dd3$15c21d20$099918d1@docuverse1>

David,

I believe language-specific SAX bindings can and should be delegated to others, with you and the rest of us keeping an eye on their progress. [btw, I am busy :] Just form a small group (2-3 people) out of the people who are applying pressure on you about SAX/C++, throw in a C++ guru for flavor, and stir. XML-DEV can serve as the sounding board for this and other languages (Smalltalk?).

SAX2 work, on the other hand, is far more important IMHO. Let's get the namespace support added and revisit the parser features and missing callback issues.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From Bruce.Duffy at westgroup.com Fri Dec 3 21:53:34 1999
From: Bruce.Duffy at westgroup.com (Duffy, Bruce)
Date: Mon Jun 7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
Message-ID: <27CD34D68C7DD211A68A0004AC38272A023ED475@elizabeth.int.westgroup.com>

I agree with Tim Bray.
I'll have to walk away from SAX if it doesn't support namespaces in the near future. I'm concerned that SAX will lose its relevance (vendors will cease to support it) if it doesn't track the relevant xml standards. Bruce Duffy -----Original Message----- From: David Megginson [mailto:david@megginson.com] Sent: Friday, December 03, 1999 12:22 PM To: XMLDev list Subject: SAX/C++ vs. SAX2 I'd like to hear what others think on this issue. There was some interest in SAX2 when I posted my alpha interfaces a few months back (most notably, but not exclusively, from David Brownell), but it was hardly a tidal wave. On the other hand, I am noticing a building pressure from implementors to get something out in C++. I can think of a few reasons that the world might desperately be waiting for SAX2: 1. To get some kind of standard Namespace support (or at least a way to tell whether a parser has Namespace support built in). 2. To query parser features in general. 3. To get at the stuff that SAX 1.0 doesn't report, like comments, CDATA boundaries, and DTD declarations. I think that there is a real need for #1, since many other specs (XSL, XML Schema, RDF, XHTML, etc.) are built on top of Namespaces. I think that #2 would make life a fair bit easier for library developers, but it's not as critical (Simon St-Laurent will be grateful, though). I have a lot of trouble with #3, though. There are a few specialized fields where this stuff isn't just syntactic fluff (repositories and editing tools spring immediately to mind), but in general, very, very, very few real-world XML applications need to know about anything but elements, attributes, and character data -- witness the recent SML discussion. I'm very interested in hearing other opinions. Having a standard streaming interface stimulated a lot of development of reusable Java XML processing components, and I'd like to see the same thing happen in C++, but I need to hear what other people think the priorities should be. 
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From KenNorth at email.msn.com Fri Dec 3 21:54:36 1999
From: KenNorth at email.msn.com (KenNorth)
Date: Mon Jun 7 17:18:20 2004
Subject: Rocket framework for creating Web sites
References: <000001bf3dad$a0c0f520$0b00a8c0@grissom> <99120312225003.00844@quadra.teleo.net>
Message-ID: <022701bf3e3d$7ba1ed40$0b00a8c0@grissom>

From: Patrick Phalen <xml-dev@teleo.net>
> :: Michael Floyd just released the Rocket framework:
>
> Can someone explain how this would be an improvement over Cocoon?
> http://xml.apache.org/cocoon/

Patrick,

When did choice become a dirty word?

There appears to be an obvious difference. Rocket currently works with ASP, with plans for supporting Java servlets and Perl/CGI.

From david at megginson.com Fri Dec 3 22:31:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <27CD34D68C7DD211A68A0004AC38272A023ED475@elizabeth.int.westgroup.com>
References: <27CD34D68C7DD211A68A0004AC38272A023ED475@elizabeth.int.westgroup.com>
Message-ID: <14408.17522.150496.586573@localhost.localdomain>

Duffy, Bruce writes:
> I agree with Tim Bray.
>
> I'll have to walk away from SAX if it doesn't
> support namespaces in the near future.
>
> I'm concerned that SAX will lose its relevance
> (vendors will cease to support it) if it doesn't
> track the relevant xml standards.

There's no problem adding Namespace support to SAX -- after all, you can stack SAX filters on top of each other. John Cowan wrote a Namespace filter for SAX about a year ago, and I have a fairly high-performance one that I just haven't had time to package and release yet.

The problem is the lack of a standard way to tell whether a SAX driver already supports Namespace processing natively (and, thus, doesn't need a filter), together with a standard way to turn that processing on or off.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
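To make the two missing pieces concrete (asking a driver whether it handles Namespaces natively, and switching that processing on), here is a small sketch of what such a query can look like. It uses the feature mechanism as later standardized in SAX2, shown through Python's xml.sax module rather than the Java interfaces under discussion; NameCollector, supports and the urn:example:a namespace are illustrative names only.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Records the (namespace URI, local name) pair of every element."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # With Namespace processing on, name is a (uri, localname) tuple.
        self.names.append(name)


def supports(parser, feature):
    """Return True if the parser recognises the feature and lets us turn it on."""
    try:
        parser.setFeature(feature, True)
        return bool(parser.getFeature(feature))
    except (xml.sax.SAXNotRecognizedException, xml.sax.SAXNotSupportedException):
        return False


parser = xml.sax.make_parser()
handler = NameCollector()
parser.setContentHandler(handler)

if supports(parser, feature_namespaces):
    doc = b'<a:doc xmlns:a="urn:example:a"><a:item/><plain/></a:doc>'
    parser.parse(io.BytesIO(doc))
```

A caller that gets False back from supports would fall back to stacking a Namespace filter of the kind mentioned above on top of the driver.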

From Daniel.Brickley at bristol.ac.uk Fri Dec 3 22:54:36 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun 7 17:18:20 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: <38478DB2.FACA4633@praxis.cz>
Message-ID: <Pine.GHP.4.21.9912032153220.11243-100000@mail.ilrt.bris.ac.uk>

On Fri, 3 Dec 1999, Matthew Gertner wrote:
> David Megginson wrote:
> > How does the schema tell me that foo represents a container for a
> > collection of objects, bar represents an object, and hack and flurb
> > represent the object's properties?
>
> The point is not what the current schema draft allows, it is whether it
> would be feasible and appropriate to represent this information in XML
> schemas, as Paul rightly stated. My opinion is that it would be fairly
> trivial and extremely useful.

I believe it will be possible to annotate XML schemas with information for mapping into (generic or domain specific) application datamodels such as RDF. I don't think it is right to expect the hard-pressed XML Schema group to define all these mappings within that working group. But that doesn't matter; all we need is a placeholder for such information.

My understanding of the Cambridge Communique meeting was that we reached agreement on just this. See points 1-6 under '3. Observations and Recommendations' in http://www.w3.org/TR/1999/NOTE-schema-arch-19991007

If it _is_ really trivial to define a mapping from XML Schema information to a classes/objects/properties RDFesque model, I for one would like to see this documented and implemented.
XML-DEV seems as good a place as any to play around with such a
thing...

Excerpt from the Cambridge Communique:

(I've no idea where the XML Schema WG's work is up to in relation to
these ideas; the basic principles outlined here seem enough to get
discussion going on XML-DEV though)

[from http://www.w3.org/TR/1999/NOTE-schema-arch-19991007]

3. Observations and Recommendations

This group reached consensus on the following observations and
recommendations:

The XML data model is the XML Information Set being specified by the
XML Information Set Working Group. Other data models exist, both
generic and application-specific. RDF is an example of one such
generic data model.

The XML Schema and RDF Schema languages are separate languages based
on different data models and do not need to be merged into a single
comprehensive language.

An XML Schema schema document will be able to hold declarations for
validating instance documents. It should also be able to hold
declarations for mapping from instance document XML infosets to
application-oriented data structures.

For evolvability and interoperability, the XML Schema specification
should provide an extension mechanism allowing for the augmentation of
XML Schema schemas with additional material. At a minimum, XML Schema
should permit elements from other namespaces to be included in schema
documents. This extension mechanism should also permit individual
extensions to be marked 'mandatory', meaning that a document instance
cannot be deemed 'schema valid' if the processing required by a marked
extension cannot be performed.

The extension mechanism should be appropriate for use to incorporate
declarations ("mapping declarations") to aid the construction of
application-oriented data structures (e.g. ones implementing the RDF
model) as part of the schema-validation and XML infoset construction
process.
This facility should not be exclusive to RDF, but should also be
useable to guide the construction of data structures conforming to
other data models, e.g. UML.

[...]

> > It can be. The DOM represents a domain-specific object layer that is
> > useful for a wide subset of XML operations (especially document- and
> > browser-oriented work). There need to be many layers on top of XML,
> > one for each domain -- it happens that many of those layers will share
> > the need to encode objects, so a standard object layer sandwiched
> > between XML and the domain-specific layers can save a lot of work.
>
> Sure, the DOM has value. My point is that maybe 95% of applications want
> a domain-specific rather than a generic interface. My other point is
> that a domain-specific interface can be implemented generically; i.e.
> programmatic interfaces for accessing XML data can be generated
> automatically from XML schemas. This isn't *that* far from what MDSAX is
> doing. IBM's XML BeanMaker (http://alphaworks.ibm.com/tech/xmlbeanmaker)
> is a good example of this concept.
>
> > > There are a variety of efforts to create
> > > domain-specific objects automatically from XML objects. I don't have a
> > > list at the tips of my fingers, but if anyone does it would be a great
> > > resource. They are out there because I keep bumping into them.
> >
> > One example is RDF.
>
> So we are talking about different things. RDF is a formalism but it
> doesn't provide you with any code (although I'm sure that tools for this
> could be written, and perhaps already have been). I am talking about
> something that will take my schema with Customer and Invoice element
> types and turn it into, say, Java classes called Customer and Invoice.

Sure, you could do this.
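To make Matthew's request concrete, here is a minimal hand-written sketch of the kind of classes such a generator might emit. Everything specific in it is an assumption for illustration: the instance shape (a Customer element with a name attribute and Invoice children carrying a number attribute), the class name CustomerBinding, and the fromXml helper. None of it comes from a schema or tool mentioned in the thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CustomerBinding {
    // Domain classes named after the (hypothetical) element types.
    static class Invoice {
        final String number;
        Invoice(String number) { this.number = number; }
    }

    static class Customer {
        final String name;
        final List<Invoice> invoices = new ArrayList<>();
        Customer(String name) { this.name = name; }
    }

    // Walks the generic DOM tree once and hands back domain objects.
    static Customer fromXml(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        Customer customer = new Customer(root.getAttribute("name"));
        NodeList items = root.getElementsByTagName("Invoice");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            customer.invoices.add(new Invoice(item.getAttribute("number")));
        }
        return customer;
    }

    public static void main(String[] args) throws Exception {
        Customer c = fromXml(
            "<Customer name='Acme'><Invoice number='17'/><Invoice number='18'/></Customer>");
        System.out.println(c.name + " has " + c.invoices.size() + " invoices");
    }
}
```

The application code after fromXml never touches element or attribute names, which is the domain-specific interface being argued for; the generic alternative would hand the same data back as objects and properties without fixed classes.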
My hunch is that the urge to do this won't be as strong when we have
more abstract (objects and properties) interfaces to XML content,
rather than our current APIs that obsess on detail of particular
serialisations rather than on what those serialisations have told us
about the objects. If we could get to a world where generic rather
than domain interfaces were useful to even 10% instead of 5% of
applications (to borrow your figure), that'd be a huge win.

> > I disagree strongly with the last part of that statement. I'd argue
> > the opposite -- higher-level layers should be as independent of XML as
> > possible. That's the only way to build good, layered architectures.
> > XML does one thing (represent a tree structure in a character stream)
> > very well: it's an excellent layer to build other layers on top of,
> > but XML itself should stay as simple as possible so that it's
> > applicable widely to many different fields.
>
> I agree with the layering approach. But well-formed XML should be viewed
> as the lowest level (representing tree structures); when bound to an XML
> schema it then becomes a serialized object representation.

There is also a need to know the objects'n'properties view of the data
without going to fetch (or having advance knowledge of) the syntactic
schema or serialisation policy. RDF's initial syntax was one approach;
there have been and will be others. The Microsoft folks were for a
while throwing around some interesting ideas on mapping more
'colloquial' XML syntax into directed labelled graphs. There's a
version at http://www.biztalk.org/Resources/canonical.asp for example.

> > That would be another serious mistake. Object exchange, while
> > important, represents only one of many layers that can be built on top
> > of XML, and if XML Schemas start trying to solve high-level problems
> > for every specific domain, it will become an unimplementable mess.
> > RDF already made a similar mistake by mixing together a spec for
> > object encoding in XML with a spec for representing knowledge about
> > Web pages.
>
> Maybe this is the crux of our disagreement. I see object exchange as
> *the* application for valid XML.

I've also heard that some folks want to use it for structured
hypertext documents... (One consequence of XML's document heritage is
that document order is generally treated as meaningful and in need of
preservation. This can be a pain in the butt for data-centric apps.)

> I'd be interested to hear some examples
> of applications that cannot be cast effectively in this light. In this
> view, RDF and XML Schemas are coming at the same problem from different
> angles. RDF is saying essentially "how do we build an XML application
> that represents object structures",

This is one aspect of what RDF attempts, i.e. the syntax component.
The initial RDF Syntax is saying 'how do we build an XML application
to represent a particularly Webbish flavour of object structures?'
(i.e. directed labelled graphs with web identifiers for nodes, node
types, relation/property types). RDF in general doesn't look for one
way of stuffing RDF data graphs into XML; there are bound to be many
ways of shipping these kinds of object structures around in angle
brackets. So... the upper levels of RDF (model and schema) *don't*
care about the way in which we "build an XML application that
represents object structures".

> while XML Schemas are saying "how do
> we enhance DTDs by adding some object-oriented facilities". My fear is
> that these two approaches are going to meet somewhere in the middle and
> turn out to be the same thing. If so, I vastly prefer the use of XML
> schemas. Why? Because this results in a vast simplification of the whole
> XML picture.
> Isn't it better to take a normal XML instance, using base
> XML syntax, and "turn" it into an object by adding the appropriate
> information in a separate schema, rather than having to recast the whole
> thing in a different syntax?

I don't see a conflict here. RDF is happy with multiple ways of
shipping data around; what it cares about is having a unified model
for this heterogeneous data. Nobody I've met ever expected all
interesting RDF applications to use RDF 1.0 Syntax.

> (I wonder if I am expressing this idea clearly. I'll happily post an
> example of how this could be done if I'm not.)

I'd love to see examples of an annotated XML Schema that shows how to
derive an objects'n'properties view of instance data.

Dan

From Curt.Arnold at hyprotech.com Fri Dec 3 23:20:20 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun 7 17:18:20 2004
Subject: dateTime Schema counter proposal
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415555@THOR>

This message is my starting suggestion for an overhaul of the date and
time related data types in the Sept working draft of the XML Schema
Datatypes document (http://www.w3.org/TR/xmlschema-2/). I hope this
can engender some discussion on xml-dev and we can then donate
something that can be incorporated quickly into the XML Schema working
group's process (hopefully in time for the next draft). I am not a W3C
member so this is not an officially endorsed W3C effort.

My problems with the current draft:

1.
Acceptance of ISO 8601 truncated forms without a mechanism to disallow
them makes mapping to existing programming language date and time
types problematic and engenders a lot of complexity. The use of
truncated forms seems to oppose the XML design criteria of not adding
complexity to gain terseness. For example, --04-15 is supposed to
represent April 15th of every year (or of an unspecified year). There
is no mechanism to fit that into a datatype that represents an
absolute date (such as VT_VARIANT or COleDateTime or corresponding
Java datatypes/classes).

2. Use of hh:mm:ss form for time durations makes it difficult to
represent intervals like 120 days. Also, there is no provision for
disallowing non-exact durations (such as durations including years or
months terms that cannot be unambiguously converted to seconds).

3. There is no mechanism for requiring or disallowing time zone
qualifiers.

Okay, here goes

Note: in all the patterns below when I used the + sign, it indicates
either a + or - appearing in its position.

remove timeInstant and recurringTimeInstance.

x.x.x. date

Represents a particular day of a particular calendar year at an
explicitly stated or implied time zone.

x.x.x.x lexicalRepresentation

A single lexical representation, which is a subset of the lexical
representations allowed by ISO 8601, is allowed for date. This lexical
representation is the ISO 8601 extended format with optional time zone
specifier: CCYY-MM-DD[ Z | +hh:mm]. The presence of the time zone
qualifier is controlled using the timeZone facet.

Examples:

1999-12-04Z
1999-12-04-06:00
1999-12-04

x.x.x time

Represents a specific instant on an unspecified day.

x.x.x.x Lexical Representation

A single lexical representation, which is a subset of the lexical
representations allowed by ISO 8601, is allowed for time. This lexical
representation is the ISO 8601 extended format
Thh:mm[:ss[.sss]][Z | +hh:mm]

p.s.
Examples:

T00:15Z
T12:30:00+05:00
T13:00

x.x.x dateTime

A particular instant on a particular date in a particular calendar
year.

x.x.x.x Lexical Representation

CCYY-MM-DDThh:mm[:ss[.sss]][ Z | +hh:mm]

Examples:

1999-12-04T15:03
1999-12-05T15:03+05:00
1999-12-05T15:03:15.123+05:00

x.x.x timeDuration

A time duration is a defined length of time, such as 12 hours.

x.x.x.x Lexical Representation

There are two allowable lexical representations of timeDuration: one
is consistent with ISO 8601 section 5.5.3.2, P[nY][nM][nD][nH][nM];
the second is the lexical representation of a real datatype
interpreted as a duration in seconds.

Examples:

P6W : Six weeks (that is, 42 days)
P12H30M : Twelve hours 30 minutes
45000 - 45000 seconds
1e-6 - one microsecond

Note: This ISO form was chosen since the alternative representation
has difficulty representing durations such as 120 days.

x.x.x timeZone

Indicates a specific offset from Universal Coordinated Time.

x.x.x.x Lexical Representation

Two forms: the first, Z, indicates no offset from UTC; the second is
+hh:mm.

Constraining facets

x.x.x timeZone facet

The time zone facet constrains the appearance of a time zone specifier
and qualifies date, time and dateTime datatypes.

Schema for timeZone facet:

I expressed default as an attribute so it could be distinguished from
the default element and could be typed as timeZone.
<element name='timeZone'>
  <archetype>
    <attribute name='minOccurs'>
      <datatype name="integer"/>
      <default>0</default>
    </attribute>
    <attribute name='maxOccurs'>
      <datatype name="integer"/>
      <default>1</default>
    </attribute>
    <attribute name="default">
      <datatype name="timeZone"/>
    </attribute>
    <attribute name="fixed">
      <datatype name="boolean"/>
    </attribute>
  </archetype>
</element>

Examples of use

<datatype name="zonedDateTime">
  <basetype name="dateTime"/>
  <!-- time zone must appear -->
  <timeZone minOccurs="1"/>
</datatype>

<datatype name="CSTdefaultTime">
  <basetype name="time"/>
  <!-- if time zone is not specified, it is implied to be CST -->
  <timeZone default="-06:00"/>
</datatype>

<datatype name="localTime">
  <basetype name="time"/>
  <!-- time zone may not appear -->
  <timeZone maxOccurs="0"/>
</datatype>

<datatype name="CSTDate">
  <basetype name="date"/>
  <!-- date can either be 1999-12-04 or 1999-12-04-06:00, but no other time zone -->
  <timeZone default="-06:00" fixed="true"/>
</datatype>

x.x.x precise facet

The precise facet qualifies timeDuration and disallows the use of year
and month terms (which cannot be unambiguously converted to a duration
in seconds).

Example:

<datatype name="preciseDuration">
  <basetype name="timeDuration"/>
  <precise/>
</datatype>

Note: comparisons should not be done of time, dateTime or date
datatypes unless the time zone is explicit or implied by the timeZone
facet.

Note: here are some ways of representing some of the ISO 8601
functionality that we lost, by using multiple attributes (or elements)
to hold multiple pieces of information.
Recurring date:

<Session>
  <!-- recurring day (one specific Friday) and a recurrence frequency -->
  <RecurringDay day="1999-12-04" repeats="P1W"/>
</Session>

<TaxFilingDeadline dayTime="1999-04-15T24:00" repeats="P1Y"/>

Time Period

<TimePeriod start="1999-12-06T09:00-04:00" end="1999-12-06T16:00-04:00"/>
<TimePeriod start="1999-12-06T09:00-04:00" duration="25200"/>
<TimePeriod start="1999-12-06T09:00-04:00" duration="P7H"/>

From yiminz at timberline.com Fri Dec 3 23:20:33 1999
From: yiminz at timberline.com (yimin zhu)
Date: Mon Jun 7 17:18:20 2004
Subject: schema mapping tool
Message-ID: <2D722CFF0999D111AB860001FA375F1004353D21@laposte.timberline.com>

Does anybody know if there is a tool for mapping two XML schemas?

Yimin Zhu
Research & Development
Timberline Software Corp.

From liamquin at interlog.com Sat Dec 4 04:23:29 1999
From: liamquin at interlog.com (Liam R. E. Quin)
Date: Mon Jun 7 17:18:20 2004
Subject: SGML the next big thing?
In-Reply-To: <199912031930.LAA00641@mail.sqwest.bc.ca>
Message-ID: <Pine.BSI.3.96r.991203230938.7036B-100000@shell1.interlog.com>

On Fri, 3 Dec 1999, Lauren Wood wrote:
> On 3 Dec 99, at 12:14, Arnold, Curt wrote:
>> It looks like the XML Schema group is trying to add back the & construct.
>> If you have a compelling justification for continued suppression, please
>> rant long and loud.
>
> How about every SGML parser author I've talked to says the &
> construct was the biggest, hardest part (which means probably the
> buggiest) of the entire parser? I think the XML WG was right in
> throwing it out of XML in the first place.

If this is as per content models, I think

(1) Lauren is right, because as SGML specified them, they were very
    hard to get right. This & thing is so far outside the way most
    other computer languages work that standard off-the-shelf parser
    generators roll on their backs and wave their paws in the air and
    admit defeat.

(2) The idea of saying, "this element must contain at least one of
    each of the following elements" is a useful one, and is very
    different from the & construct. A simplified, regularised form of
    & might be possible.

(3) The & connector interacts with #PCDATA to form pernicious content
    models (see below). The XML WG went to great lengths to make sure
    that no valid XML document suffers from this SGML bogosity.
    Similar lengths are needed for "&".

Note: For those who're not familiar with &, it is the content model
connector in SGML that says that in order to match a & b & c ...,
every content fragment a, b, etc., must be satisfied, and nothing must
be left over. Furthermore, there must be exactly one way to satisfy
the expression, as otherwise it is "ambiguous" and illegal, just as
(a, b?)
| a is illegal in SGML, even though it is a perfectly sensible and
valid regular expression for the rest of the world of computing :-)

Consider the following SGML declaration (with OMITTAG NO):

<!ELEMENT boy
    (noise & (dirt,mud)+ & (mud,shoes,trouble)* & #PCDATA) +smell
>

This is a "pernicious" mixed content model, and can only have white
space in it between elements once, since that uses up the #PCDATA
content model fragment.

The following is (let's say for the sake of argument) a valid boy:

mud,smell,shoes,trouble,dirt,mud,dirt,mud,noise,smell

If you try and match this against the content model I gave, you'll see
that you can't do it with LL(1) or LALR(1) directly unless you build a
DFA with a rather large number of states.

I added the inclusion +smell, but you could change the content model
to be (boy-model | smell)* to have an even more interesting time of
it.

--
Liam Quin, Barefoot Computing, Toronto; The barefoot agitator
l i a m at h o l o w e b dot n e t <-- NEW ADDRESS
Ankh on irc.sorcery.net, http://www.valinor.sorcery.net/~liam/
Please remove your shoes and socks before replying in anger.

From tyler at infinet.com Sat Dec 4 05:06:03 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:18:20 2004
Subject: XML processing instruction survey
References: <3.0.32.19991130185652.01516ec0@pop.intergate.ca>
Message-ID: <3848A063.863A20A3@infinet.com>

Tim Bray wrote:

> At 03:07 PM 11/30/99 -0800, Jeffrey E.
Sussna wrote:
> >I'm interested in the extent to which people are actually using the XML
> >processing instruction ( <?xml ) in their XML files, and the extent to which
> >they find it useful.
>
> It's not really designed for people. It's mostly designed for use
> by the XML processor to help figure out the encoding and make sure that
> this is really XML.
>
> I'd think that using it at the application level would be not only
> uncommon but probably unwise. I'd be interested to hear any positive
> responses to the query. -T.

I use processing instructions for the class name (Java specific here)
of the application object that is used to handle a particular document
type. In this sense the PI acts as a stream header. The PI in a sense
is not document content but is only used as an identifier as to what
module should be dynamically loaded to handle the document data.

I like to think of PI's as things which are useful for commanding your
particular application as to what it should do with the data and not
something inherent within the document itself.

Tyler

From twleung at sauria.com Sat Dec 4 05:17:40 1999
From: twleung at sauria.com (twleung@sauria.com)
Date: Mon Jun 7 17:18:20 2004
Subject: XML4J EA2 --> Xerces-J 1.0
References: <1B79E83E7849174A813044A2E56F78040C09@AROD.iunknown.com>
Message-ID: <008301bf3e16$f2d4d3e0$0a00a8c0@orconet.com>

John,

All of the XML4J development team is now working on Xerces.
There may be some additional IBM-only features that get done and put
in an XML4J release on alphaWorks, but any feature that is of general
interest will go into the Xerces-J code base.

Ted
IBM XML Technology Group

----- Original Message -----
From: John Lam <jlam@iunknown.com>
To: <xml-dev@ic.ac.uk>
Sent: Tuesday, November 30, 1999 4:53 PM
Subject: RE: XML4J EA2 --> Xerces-J 1.0

Will IBM continue development of XML4J independently of Xerces-J? Or
will Xerces-J be the "official" version of that source code base?

-John

From donpark at docuverse.com Sat Dec 4 05:48:05 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:21 2004
Subject: DOM ECMAScript Test Suite
In-Reply-To: <015501bf3dd2$ec275390$293b0681@ncsl.nist.gov>
Message-ID: <000b01bf3e1b$317d5640$099918d1@docuverse1>

>I've just updated our DOM ECMAScript test suite, available from
>
> http://www.nist.gov/xml/
>
>Click on DOM Test Suite. This suite includes ~900 ECMAScript
>tests that exercise the DOM Level 1 Fundamental, Extended, and

Very very useful, Mary. A Java binding would be nice as well. Any
plans for DOM Level 2 conformance testing?

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com
From donpark at docuverse.com Sat Dec 4 06:15:22 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:21 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com>
Message-ID: <001201bf3e1f$074601c0$099918d1@docuverse1>

Walter,

Could you elaborate on your decision to use PI rather than element(s)?

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From ricko at allette.com.au Sat Dec 4 07:38:13 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:21 2004
Subject: SGML the next big thing?
Message-ID: <00fc01bf3e2d$f508d370$55f96d8c@NT.JELLIFFE.COM.AU>

> From: Arnold, Curt <Curt.Arnold@hyprotech.com>
>Erlend Øverby [Erlend.Overby@usit.uio.no] wrote
>
>>>What we don't need from the SGML standard:
>>> - SGML Declaration
>>> - Character entities
>>> - Minimisation
>>> - The "&" construct
>
>It looks like the XML Schema group is trying to add back the & construct.

XFM looks to me like a kind of SGML Declaration, in that it says which
features are needed to process a document.
Similarly there is a well-known need in XML to allow non-standard
characters/glyphs (for mathematics, advertising and Chinese
especially) so it is not impossible that there may be increased
development (whether looking like entities or numeric character
references) towards better character entities too.

That only leaves minimization. That is perhaps the nicest thing in
HTML, and perhaps will be the only SGML thing that won't be
reintroduced in the fullness of time.

Rick Jelliffe

From donpark at docuverse.com Sat Dec 4 07:41:30 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:21 2004
Subject: YML: A Grand Unification of SAX and DOM? (fwd)
In-Reply-To: <Pine.LNX.4.10.9912030214230.18460-100000@cauchy.clarkevans.com>
Message-ID: <001301bf3e2b$085ea740$099918d1@docuverse1>

Clark,

What is the advantage of YML over these solutions?

1. Pockets

<element>
  <pocket:attributes>
    <att>
      <ch>val</ch>
    </att>
  </pocket:attributes>
  <pocket:children>
    <foo>
      <pocket:text>bar</pocket:text>
    </foo>
  </pocket:children>
</element>

2. Parental Guidance

<element>
  <sax:cache>
    <foo>bar</foo>
  </sax:cache>
</element>

3. Road Signs

<element>
  <sax:cache>true</sax:cache>
  <foo>bar</foo>
  <sax:cache>false</sax:cache>
</element>

These are preliminary XML design patterns so pattern names are weird
to say the least.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com
From mtbryan at sgml.u-net.com Sat Dec 4 08:06:00 1999
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun 7 17:18:21 2004
Subject: Problem for mathematically minded XML experts
Message-ID: <043f01bf3e2e$90bd3540$5fc466c3@unet.com>

I would like to pose the following question to those of you with a
mathematical bent who have some spare time (I'm not expecting the answer
quickly - this is a holiday teaser I suspect!).

Given I have two DTDs, in which there are two elements whose models can be
described as follows:

Element DTD1 consists of a sequence of E1 elements and G1 OR groups, where
the total number of elements in the OR groups is N1

and

Element DTD2 consists of a sequence of E2 elements and G2 OR groups, where
the total number of elements in the OR groups is N2

is there a formula that can be used to determine whether the same pair of
elements are valid in both DTD1 and DTD2? If there is, is there a way to
determine the difference caused by the following conditions being added:

a) there need be no constraint on the order of the elements
b) the elements must be in a particular order
c) the elements must be adjacent, in any order
d) the elements must be adjacent, in a particular order.

Does the split of the number of elements in each OR group affect the
calculation significantly?
Does the fact that one or other, or both of the elements is a member of a
group significantly affect the calculation?
How would the calculation change if it was required that three matches were
required from the model?
If you can help me understand any part of the problem, or point me to a
paper/book on the subject, I would be grateful.

Martin Bryan, 29 Oldbury Orchard, Churchdown, Glos GL3 2PU, UK
Phone/Fax: +44 1452 714029 E-mail: mtbryan@sgml.u-net.com

From ricko at allette.com.au Sat Dec 4 09:10:42 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:21 2004
Subject: Problem for mathematically minded XML experts
Message-ID: <012101bf3e3a$ef24b9d0$55f96d8c@NT.JELLIFFE.COM.AU>

Murata Makoto of FujiXerox has been working on the question of set
operations on grammars for several years.

Rick Jelliffe

-----Original Message-----
From: Martin Bryan <mtbryan@sgml.u-net.com>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Saturday, 4 December 1999 16:24
Subject: Problem for mathematically minded XML experts

>I would like to pose the following question to those of you with a
>mathematical bent who have some spare time (I'm not expecting the answer
>quickly - this is a holiday teaser I suspect!).
>
>Given I have two DTDs, in which there are two elements whose models can be
>described as follows:
>
>Element DTD1 consists of a sequence of E1 elements and G1 OR groups, where
>the total number of elements in the OR groups is N1
>
>and
>
>Element DTD2 consists of a sequence of E2 elements and G2 OR groups, where
>the total number of elements in the OR groups is N2
>
>is there a formula that can be used to determine whether the same pair of
>elements are valid in both DTD1 and DTD2?
>If there is, is there a way to
>determine the difference caused by the following conditions being added:
>
>a) there need be no constraint on the order of the elements
>b) the elements must be in a particular order
>c) the elements must be adjacent, in any order
>d) the elements must be adjacent, in a particular order.
>
>Does the split of the number of elements in each OR group affect the
>calculation significantly?
>Does the fact that one or other, or both of the elements is a member of a
>group significantly affect the calculation?
>How would the calculation change if it was required that three matches were
>required from the model?
>
>If you can help me understand any part of the problem, or point me to a
>paper/book on the subject, I would be grateful.
>
>Martin Bryan, 29 Oldbury Orchard, Churchdown, Glos GL3 2PU, UK
>Phone/Fax: +44 1452 714029 E-mail: mtbryan@sgml.u-net.com

From ricko at allette.com.au Sat Dec 4 09:09:36 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:21 2004
Subject: Content models considered bad...errr sometimes (was: Re: SGML the next big thing?)
Message-ID: <011c01bf3e3a$8bc87a20$55f96d8c@NT.JELLIFFE.COM.AU>

From: Liam R. E. Quin <liamquin@interlog.com>
> This & thing is so far outside the way most other computer languages
> work that standard off-the-shelf parser generators roll on their
> backs and wave their paws in the air and admit defeat.

I am interested if you think this also reveals anything about the persistent claims that SGML is bad because it doesn't conform to the expectations of computer science (as influenced by an early generation of tools such as YACC). I would tend towards the view that uncritical acceptance of academic paradigms has held SGML/XML development up.

In the case of XML (and SGML, which is really a compiler compiler, though with a different target to YACC and Lex) I think the view that a schema should be viewed as a language definition is holding things back (which is *not* to say that there is no benefit in being able to implement a schema as a language, or that there is no benefit in being able to reason about a schema using formal language theory). No-one says "Windows, Icons, Menus and Popups are not easy to implement in YACC, so we should not have them": in fact, in the 90s, the trend for specifying GUIs has been solidly away from formal grammatical descriptions of the total interface language, even if just for flexibility.

>(3) The & connector interacts with #PCDATA to form pernicious content
> models (see below). The XML WG went to great lengths to make sure
> that no valid XML document suffers from this SGML bogosity. Similar
> lengths are needed for "&".

Paul Prescod had an excellent idea a while back for adding a #WS particle that explicitly modelled whitespace.
That would get rid of most problems, but I presume there would still be an ambiguity possible with (#PCDATA | #WS ).

But outside all this there is the basic issue of whether content models are actually suited to being the only direct mechanism for implementing data models in XML: if the idea of namespaces is to allow ad hoc inclusion of elements from different domains at the user's discretion, the idea that a schema should be a language description becomes less and less convincing. How useful is "," when we might want to interpose elements from any other namespace anywhere, for example?

For example, here is your content model, followed by a Schematron schema. I would say that the Schematron schema captures much more directly what the content model might be modeling: in fact, the content model establishes relationships but fails to provide what they mean.

> <!ELEMENT boy
> (noise & (dirt,mud)+ & (mud,shoes,trouble)* & #PCDATA) +smell

<schema>
 <pattern name="A Boy">
  <rule context="boy">
   <assert test="count(noise)=1">Boys need noise</assert>
   <assert test="dirt">Boys need dirt</assert>
   <assert test="mud">Boys need mud</assert>
   <assert test="count(mud)=count(dirt) + count(shoes)"
   >Some mud comes from dirt and some mud comes from shoes.</assert>
   <assert test="count(shoes)= count(trouble)"
   >A boy will have as much trouble as he has muddy shoes.</assert>
  </rule>
  <rule context="smell">
   <assert test="ancestor::boy">Boys can smell</assert>
  </rule>
  <rule context="boy/trouble">
   <assert test="preceding-sibling::shoes">Muddy shoes lead to trouble</assert>
   <assert test="count(mud)=count(dirt) + count(trouble)"
   >The mud that comes from dirt is independent of the mud that causes trouble</assert>
  </rule>
  <rule context="boy/shoes">
   <assert test="preceding-sibling::mud">A boy's shoes must be muddy</assert>
  </rule>
  <rule context="boy/dirt">
   <assert test="following-sibling::mud">All dirt leads to mud</assert>
   <assert test="name(following-sibling::*[position()=1])='smell' or
name(following-sibling::*[position()=1])='mud'"
   >Dirt must be followed by mud or smells</assert>
  </rule>
 </pattern>
</schema>

Other rules could be added to capture the intricacies of the inclusion, but the question should be asked whether the content model captures the intent of the schema developer more than the Schematron schema does: to what extent does the elegance of regular expressions force decisions to be made that are extraneous to modeling requirements, i.e. that are merely artifacts of the notation/paradigm.

I think that a good number of the people who claimed dislike for DTDs will find that really their problem is with regular grammars. Of course, the people who need to convert from class-based data into XML will find XML Schema's provisions of inheritance or class mechanisms very useful, but that still won't help matters if the relationship between elements is important.

Rick Jelliffe

From paul at qub.com Sat Dec 4 09:09:45 1999
From: paul at qub.com (Paul Tchistopolskii)
Date: Mon Jun 7 17:18:21 2004
Subject: YML: A Grand Unification of SAX and DOM? (fwd)
References: <001301bf3e2b$085ea740$099918d1@docuverse1>
Message-ID: <028b01bf3e37$8af1f7a0$5df5c13f@PaulTchistopolskii>

There is also a *very* elegant 'reverse-polish-notation' approach proposed by Robert ( process element when Grove is in place, providing the execution stack ).

Not sure he was talking about the execution stack, it was my attempt to understand how it could work.

The only drawback of such a view is that the execution stack constantly grows and we need to clean it up sometimes.

However.
Because the multithreading approach should have the same drawback, I think that the workaround should already exist in the source code of SP ( thanks to Sean for pointing out that SP is an existing implementation of the multithreading approach ).

No namespaces, no extra markup - just smart cleanup ( could be easier than look-ahead, because the information to make a decision is already 'in place', right? )

Rgds.Paul.

> Clark,
>
> What is the advantage of YML over these
> solutions?
>
> 1. Pockets
>
> <element>
>   <pocket:attributes>
>     <att>
>       <ch>val</ch>
>     </att>
>   </pocket:attributes>
>   <pocket:children>
>     <foo>
>       <pocket:text>bar</pocket:text>
>     </foo>
>   </pocket:children>
> </element>
>
> 2. Parental Guidance
>
> <element>
>   <sax:cache>
>     <foo>bar</foo>
>   </sax:cache>
> </element>
>
> 3. Road Signs
>
> <element>
>   <sax:cache>true</sax:cache>
>   <foo>bar</foo>
>   <sax:cache>false</sax:cache>
> </element>
>
> These are preliminary XML design patterns
> so pattern names are weird to say the least.
>
> Best,
>
> Don Park - mailto:donpark@docuverse.com
> Docuverse - http://www.docuverse.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ricko at allette.com.au Sat Dec 4 09:43:09 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:21 2004
Subject: YML: A Grand Unification of SAX and DOM? (fwd)
Message-ID: <013201bf3e3f$5eea7a30$55f96d8c@NT.JELLIFFE.COM.AU>

From: Don Park <donpark@docuverse.com>
> <element>
>   <pocket:attributes>
>     <att>
>       <ch>val</ch>
>     </att>
>   </pocket:attributes>
>   <pocket:children>
>     <foo>
>       <pocket:text>bar</pocket:text>
>     </foo>
>   </pocket:children>
></element>

Compare to the less RSI-inducing XML:

<pocket ch="val"><foo>bar</foo></pocket>

Note that the XSL pattern to find the attribute ch of element pocket is "pocket/@ch" for the XML but "element/pocket:children/../pocket:attributes/att/ch" for the alleged SML. It could be said that one could use "element/pocket:attributes/att/ch" but then there is the validation possibility where the pocket:attributes elements are made part of some other element.

Of course, it would be possible to make an implementation of an XSL processor that interprets "pocket/@ch" as "element/pocket:children/../pocket:attributes/att/ch" and hides this from the user. It means that instead of looking at a type field in the parse tree, the name is used (presumably a better implementation method would be to translate the alleged SML into standard XML DOM on import). But then the user would have to have in mind the XML markup when reading the alleged SML, which is an additional mental burden.

But I look forward to the development of SPaths, SXSL, SDOM, SML Schemas, SPointers, SLink, SInclusions, etc. At best, SML will make it easier to get exactly where we are today anywhere.
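
The path comparison above can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration and not part of the original post: the namespace URI urn:example:pocket and the two helper names are assumptions (the quoted markup declares no URI for the pocket: prefix), and ElementTree's limited path syntax stands in for a full XSL pattern engine.

```python
import xml.etree.ElementTree as ET

# Attribute form, as in the compact XML example.
compact = ET.fromstring('<pocket ch="val"><foo>bar</foo></pocket>')

# Element-only form, as in the "Pockets" example.
# ASSUMPTION: the namespace URI is a placeholder invented for this sketch.
POCKET_NS = "urn:example:pocket"
verbose = ET.fromstring(
    f'<element xmlns:pocket="{POCKET_NS}">'
    "<pocket:attributes><att><ch>val</ch></att></pocket:attributes>"
    "<pocket:children><foo><pocket:text>bar</pocket:text></foo></pocket:children>"
    "</element>"
)


def ch_from_compact(root):
    """Equivalent of the pattern "pocket/@ch": a single attribute lookup."""
    return root.get("ch")


def ch_from_verbose(root):
    """Equivalent of "element/pocket:attributes/att/ch": walk three elements."""
    node = root.find("pocket:attributes/att/ch", {"pocket": POCKET_NS})
    return None if node is None else node.text


print(ch_from_compact(compact))
print(ch_from_verbose(verbose))
```

Both helpers recover the same value; the difference is that the element-only form needs a namespace mapping and a three-step path where the attribute form needs one lookup.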
><element> > <sax:cache> > <foo>bar</foo> > </sax:cache> ></element> Compare to XML: <foo sax:cache="active">bar</foo> When an effect follows element scope, it is better practise to use elements than PIs. Otherwise <?sax cache="on"?><foo>bar</foo><?sax cache="off"?> or perhaps <sax:cache><foo>bar</foo></sax:cache> ><element> > <sax:cache>true</sax:cache> > <foo>bar</foo> > <sax:cache>false</sax:cache> ></element> Compare to XML: <foo sax:cache="active">bar</foo> When an effect follows element scope, it is better practise to use elements than PIs. Otherwise <?sax cache="true"?><foo>bar</foo><?sax cache="false"?> but of course we cannot comment to deeply on a snapshot. Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rev-bob at gotc.com Sat Dec 4 09:58:04 1999 From: rev-bob at gotc.com (rev-bob@gotc.com) Date: Mon Jun 7 17:18:21 2004 Subject: [OT] Apologies for the inconvenience.... Message-ID: <199912040457971.SM01128@Unknown.> Sorry about the off-topic post, but I'll make it short. To the West Coast DBer who contacted me Friday evening - you have my attention, but I initially dismissed your email as buckshot and hence trashed it prematurely. Please re-send said message for mutual contact information. (For obvious reasons, I could not send this privately, and since this was the forum of initial contact, this is the lowest-bandwidth way of re-establishing contact. Again, I apologize to the rest of the list members.) Rev. Robert L. Hood | http://rev-bob.gotc.com/ Get Off The Cross! | http://www.gotc.com/ Download NeoPlanet at http://www.neoplanet.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Sat Dec 4 19:01:35 1999 From: clark.evans at manhattanproject.com (Clark C. Evans) Date: Mon Jun 7 17:18:21 2004 Subject: YML: A Grand Unification of SAX and DOM? (fwd) In-Reply-To: <013201bf3e3f$5eea7a30$55f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <Pine.LNX.4.10.9912040200070.24207-100000@cauchy.clarkevans.com> Don's examples didn't demonstrate recursion, and this is the meat of the proposal. On Sat, 4 Dec 1999, Rick Jelliffe wrote: > From: Don Park <donpark@docuverse.com> > > <element> > > <pocket:attributes> > > <att> > > <ch>val</ch> > > </att> > > </pocket:attributes> > > <pocket:children> > > <foo> > > <pocket:text>bar</pocket:text> > > </foo> > > </pocket:children> > ></element> > > Compare to XML less RSI-inducing > <pocket ch="val"><foo>bar</foo></pocket> > > Note that the XSL pattern to find the attribute ch of element > pocket is "pocket/@ch" for the XML but > "element/pocket:children/../pocket:attributes/att/ch" > for the alleged SML. It could be said that one could use > "element/pocket:attributes/att/ch" but then there is the > validation possibility where the pocket:attributes elements > are made part of some other element. Yet another reason for the distinction being set at the syntax level, as it currently is with XML. ;) Clark xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From clark.evans at manhattanproject.com Sat Dec 4 18:52:10 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:21 2004
Subject: [YML] RE: YML: A Grand Unification of SAX and DOM? (fwd)
In-Reply-To: <001301bf3e2b$085ea740$099918d1@docuverse1>
Message-ID: <Pine.LNX.4.10.9912040110310.24207-100000@cauchy.clarkevans.com>

On Fri, 3 Dec 1999, Don Park wrote:
> What is the advantage of YML over these
> solutions?

> 3. Road Signs
>
> <element>
>   <sax:cache>true</sax:cache>
>   <foo>bar</foo>
>   <sax:cache>false</sax:cache>
> </element>

This structure does not seem to be recursive, given that I'm interpreting it as I would a processing instruction.

> 2. Parental Guidance
>
> <element>
>   <sax:cache>
>     <foo>bar</foo>
>   </sax:cache>
> </element>

This *could* be logically equivalent depending upon your interpretation. Consider this instead:

<element>
  <sax:cache>
    <foo>
      <bar/>
    </foo>
  </sax:cache>
</element>

If the processor would provide random access to <bar/>, then the answer is no -- this is not the same as YML. If, however, you use a keyword like this to denote that the _immediate children_ are placed into random access storage, then the answer is almost. In addition, a mechanism is required to guarantee that all of the random access children occur before the first sequential child.

> 1.
Pockets > > <element> > <pocket:attributes> > <att> > <ch>val</ch> > </att> > </pocket:attributes> > <pocket:children> > <foo> > <pocket:text>bar</pocket:text> > </foo> > </pocket:children> > </element> First, I don't get what you had intended with <pocket:text>bar</pocket:text>, so let's consider this replaced with <text>bar</text> to proceed. If attributes/children means random/sequential access, then, of the three, this is the closest since there seems to be an implicit requirement that all random access children occur before sequential access children. However, you did not use this construct recursively -- <ch> and <text> were not marked with the sequential/random access distinction. If this distinction were embedded into a binary syntax, then this type of problem would not occur. So, here is the YML version of the above, assuming sequential access for <ch> and random access for <bar>. <element <att> <ch>val</ch> </att> > <foo <text>bar</text> /> </element> Or, using syntax sugar: <element att=<ch>val</ch> > <foo text="bar" /> </element> Major advantage here is that the doubly recursive syntax drives the processing choice as to collect attributes together and provide them via random access (DOM), or to provide them individually via sequential access (SAX). I hope this moves things along! Best Wishes, Clark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Sat Dec 4 18:58:16 1999 From: clark.evans at manhattanproject.com (Clark C. Evans) Date: Mon Jun 7 17:18:21 2004 Subject: [YML] Re: YML: A Grand Unification of SAX and DOM? 
(fwd) In-Reply-To: <028b01bf3e37$8af1f7a0$5df5c13f@PaulTchistopolskii> Message-ID: <Pine.LNX.4.10.9912040155210.24207-100000@cauchy.clarkevans.com> Paul, I didn't get this at all. Sorry. On Sat, 4 Dec 1999, Paul Tchistopolskii wrote: > There is also *very* elegant > 'reverse-polish-notation' approach > proposed by Robert ( process > element when Grove is in place, > providing the execution stack ). > > Not sure he was talking about the > execution stack, it was my attempt to > understand how could it work. > > The only drawback of such a view > is that the execution stack constantly > grows and we need to clean it up > sometimes. > > However. > > Because mutithreading approach should > have the same drawback, I think that the > workaround should already exist in the > source code of SP ( thanks to Sean for > pointing that SP is an existing implementation > of multithreading approach ). > > No namespaces, no extra markup - > just smart cleanup ( could be easier > than look-ahead, because the information > to make a descision is already 'in place', > right? ) I'm talking about using a low-level recursive binary distinction in syntax to unify the behavior of SAX and DOM -- without *any* schema knowledge of the input stream known by the parser author, nor requiring any external processing guidelines. Clark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From uche.ogbuji at fourthought.com Sat Dec 4 21:09:19 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:18:21 2004 Subject: Request for Discussion: SAX 1.0 in C++ In-Reply-To: Your message of "Fri, 03 Dec 1999 13:13:10 EST." 
<14408.2102.136185.50050@localhost.localdomain> Message-ID: <199912042109.OAA19150@localhost.localdomain> > > I would really like you to reconsider this ordering of priorities. > > SAX2 is urgently needed for DOM implementors and developers of XSLT > > engines with streaming output. You have done an admirable job of > > leading the SAX and SAX2 development, and it is dying without your > > output. Now if you were simply buried with 9-5 work, and couldn't > > lend your efforts, it would be understandable. But to detract from > > SAX2 in order to focus on SAX/C++, I think is muddling the > > priorities. > > What in SAX2 is most urgently needed for DOM and XSLT? I know that > DOM level one *can* support some things that SAX doesn't report (such > as comments and CDATA section boundaries), but there is nothing in DOM > level one that says those have to be included, and I've heard of > relatively few real-world applications that need that information. > > I'm not as familiar with the situation in XSLT, and information would > be helpful. When I think about it more clearly, it is really the XSLT needs that stick out. 4XSLT uses DOM to process XSLT and uses SAX2 Alpha for output. The sorts of things that need to be addressed in SAX for XSLT output include namespaces, Doctype declarations, and comments. Note that 4DOM implements DOM Level 2 core, and thus we can support all the advanced features we need in the DOM, but SAX 1.0 falls short for input and output. If we can agree on general interfaces for SAX2, the Python community can take care of itself and develop the Python binding. I don't see why the much larger C++ community can't do its own work as well. We should be working on general, language-independent interfaces for XML development and let the language-specific needs catch up as needed. Reference implementations and all that are fine, and so I have no problem with SAX being specified in Java. 
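
To illustrate the kind of information listed above (namespaces, DOCTYPE declarations, comments), here is a minimal sketch using Python's standard xml.sax module. It relies on the later, final SAX2 binding shipped with Python, not on the 1999 alpha interfaces under discussion, and it covers the parsing side only, not 4XSLT's output path. The Recorder class, the record_events helper, and the sample document are assumptions made for illustration.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_namespaces,
    property_lexical_handler,
)


class Recorder(ContentHandler):
    """Hypothetical handler that records namespace-aware and lexical events."""

    def __init__(self):
        super().__init__()
        self.events = []

    # Content event, delivered in this form when namespace processing is on.
    def startElementNS(self, name, qname, attrs):
        uri, local = name
        self.events.append(("element", uri, local))

    # Lexical events: comments, DOCTYPE and CDATA boundaries.
    def comment(self, text):
        self.events.append(("comment", text))

    def startDTD(self, name, public_id, system_id):
        self.events.append(("doctype", name))

    def endDTD(self):
        pass

    def startCDATA(self):
        pass

    def endCDATA(self):
        pass


def record_events(xml_text):
    """Parse xml_text and return the list of recorded events."""
    recorder = Recorder()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setProperty(property_lexical_handler, recorder)
    parser.setContentHandler(recorder)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return recorder.events


sample = '<!DOCTYPE doc><!-- note --><doc xmlns:x="urn:example:x"><x:item/></doc>'
for event in record_events(sample):
    print(event)
```

With namespace processing switched off and no lexical handler registered, none of the DOCTYPE, comment, or namespace URI information would reach the application, which is the gap in SAX 1.0 described above.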
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From uche.ogbuji at fourthought.com Sat Dec 4 21:12:42 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:18:21 2004 Subject: Request for Discussion: SAX 1.0 in C++ In-Reply-To: Your message of "Fri, 03 Dec 1999 20:11:53 GMT." <006501bf3dca$a4767ce0$4a5eedc1@arp01> Message-ID: <199912042112.OAA19167@localhost.localdomain> > > What in SAX2 is most urgently needed for DOM and XSLT? I know that > > DOM level one *can* support some things that SAX doesn't report (such > > as comments and CDATA section boundaries), but there is nothing in DOM > > level one that says those have to be included, and I've heard of > > relatively few real-world applications that need that information. > > How about focusing on SAX/2, and making the first C/C++ SAX interface > actually SAX 2 so we kill two birds with one stone ? This to me seems the most sensible approach. Especially when SAX2 has had so much discussion and is potentially so close to completion. The C++/SAX discussion is just starting and could go on for months. It might as well be built around an up-to-date standard. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From KenNorth at email.msn.com Sat Dec 4 21:09:10 1999 From: KenNorth at email.msn.com (KenNorth) Date: Mon Jun 7 17:18:21 2004 Subject: Security alerts: XML redirect in IE 5.0, MiniZip worm References: <199912040457971.SM01128@Unknown.> Message-ID: <004e01bf3e9b$ac8df260$0b00a8c0@grissom> Earlier this week I sent a security alert to each xml-dev member whose e-mail address was in my IN basket. I received an e-mail worm earlier in the week and sent a warning not to open an attachment named ZIPPED_FILES.EXE. The MiniZip worm propagates by mailing itself so I thought it best to err on the side of caution and send a warning. (I don't know whether MiniZip leaves a copy of sent messages in the Sent folder.) Members should also be aware of an Internet Explorer 5.0 security problem related to XML redirects: http://www.ntsecurity.net/go/load.asp?iD=/security/IE54.htm xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From uche.ogbuji at fourthought.com Sat Dec 4 21:16:09 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:18:22 2004 Subject: SAX/C++ vs. SAX2 In-Reply-To: Your message of "Fri, 03 Dec 1999 13:21:38 EST." 
<14408.2610.245842.199581@localhost.localdomain> Message-ID: <199912042116.OAA19183@localhost.localdomain> > I'd like to hear what others think on this issue. There was some > interest in SAX2 when I posted my alpha interfaces a few months back > (most notably, but not exclusively, from David Brownell), but it was > hardly a tidal wave. On the other hand, I am noticing a building > pressure from implementors to get something out in C++. Have you considered that the lack of heavy discussion after you posted the SAX2 alpha was because of its high quality? There is little to argue with. The Python/XML group was able to hammer out a Python binding based on your alpha in short order. We have put this to practical use in 4XSLT, and it works very well. I don't think there needs to be a lot more dicussion on the way to SAX 2.0, but I hardly think that minimizes its importance. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at qub.com Sun Dec 5 00:08:33 1999 From: paul at qub.com (Paul Tchistopolskii) Date: Mon Jun 7 17:18:22 2004 Subject: [YML] Re: YML: A Grand Unification of SAX and DOM? (fwd) References: <Pine.LNX.4.10.9912040155210.24207-100000@cauchy.clarkevans.com> Message-ID: <006f01bf3eb4$a0c71420$5df5c13f@PaulTchistopolskii> > Paul, I didn't get this at all. Sorry. I think it's because you are concentrated on another task than I am. 
I'm thinking about mixing streaming and Groves for processing XML ( SML ) documents.

> On Sat, 4 Dec 1999, Paul Tchistopolskii wrote:
> > There is also *very* elegant
> > 'reverse-polish-notation' approach
> > proposed by Robert ( process
> > element when Grove is in place,
> > providing the execution stack ).
> >
> > Not sure he was talking about the
> > execution stack, it was my attempt to
> > understand how could it work.
> >
> > The only drawback of such a view
> > is that the execution stack constantly
> > grows and we need to clean it up
> > sometimes.
> >
> > However.
> >
> > Because multithreading approach should
> > have the same drawback, I think that the
> > workaround should already exist in the
> > source code of SP ( thanks to Sean for
> > pointing that SP is an existing implementation
> > of multithreading approach ).
> >
> > No namespaces, no extra markup -
> > just smart cleanup ( could be easier
> > than look-ahead, because the information
> > to make a decision is already 'in place',
> > right? )
>
> I'm talking about using a low-level recursive
> binary distinction in syntax to unify the
> behavior of SAX and DOM -- without *any*
> schema knowledge of the input stream known
> by the parser author, nor requiring any
> external processing guidelines.

Your approach is: "if we write our document providing some extra information, it'll be easier for the processing API to make a decision how to process it".

Even though I found your proposal to be very elegant, I don't like that idea in principle. It's the attributish way when one is marking 'road-signs' or 'pockets' in the document. A document should be about the content, not about the 'road-signs', PIs, and some other stuff.

Stylesheet is about processing ;-)

I'd prefer to attach the 'road-signs' at runtime.

I see 2 ways for now to change processing of the XML ( SML ) documents without changing the documents themselves.
simple SAX-based 'switcher'
reverse-polish-notation view

After I understand which way makes life easier for streaming XSLT I may write more. It is all getting hard.

Rgds.Paul.

From clark.evans at manhattanproject.com Sun Dec 5 00:23:00 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:22 2004
Subject: [YML] Re: YML: A Grand Unification of SAX and DOM? (fwd)
In-Reply-To: <006f01bf3eb4$a0c71420$5df5c13f@PaulTchistopolskii>
Message-ID: <Pine.LNX.4.10.9912040713090.24951-100000@cauchy.clarkevans.com>

On Sat, 4 Dec 1999, Paul Tchistopolskii wrote:
> I think it's because you are concentrated on
> another task than I am. I'm thinking about
> mixing streaming and Groves for processing
> XML ( SML ) documents.

Sounds similar enough... and our end goal is the same, a more efficient XSL processor.

> Your approach is : "if we write our document
> providing some extra information, it'll be easier
> for processing API to make a decision how
> to process it".

Let me rephrase: "if we design our documents in such a way that the information dependencies are identified, and we use a syntax to demark these dependencies, then a parser can better support the processor by providing either sequential or random access depending upon the context."

> Even I found your proposal to be very elegant,
> I don't like that idea in principle. It's the attributish
> way when one is marking 'road-signs' or 'pockets'
> in the document.
> Document should be about the content,
> not about the 'road-signs' , PI's, and some other stuff
>
> Stylesheet is about processing ;-)

Fair enough.. but I would say that stylesheets are about transforming, not about providing the information in an accessible way that supports the dependencies.

> I'd prefer to attach the 'road-signs' at runtime.

Well, you will have to do this based on some distinction. I'd be interested to see what you pick in the end. I'm putting it at the syntax level so that the designers of the content can have control over it.

> I see 2 ways for now to change processing of the
> XML ( SML ) documents not changing the documents
> themselves.
>
> simple SAX-based 'switcher'
> reverse-polish-notation view
>
> After I'll understand what way makes life easier for
> streaming XSLT I may write more. It is all getting
> hard.

Cool. I'd like to hear more about the RPN view. This sounds interesting.

;) Clark

From tpassin at idsonline.com Sun Dec 5 01:29:51 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun 7 17:18:22 2004
Subject: Request for Discussion: SAX 1.0 in C++
References: <199912042112.OAA19167@localhost.localdomain>
Message-ID: <001301bf3ec0$d47d7d20$5afbb1cd@tomshp>

From: <uche.ogbuji@fourthought.com>
> > > What in SAX2 is most urgently needed for DOM and XSLT?
I know that > > > DOM level one *can* support some things that SAX doesn't report (such > > > as comments and CDATA section boundaries), but there is nothing in DOM > > > level one that says those have to be included, and I've heard of > > > relatively few real-world applications that need that information. > > > > How about focusing on SAX/2, and making the first C/C++ SAX interface > > actually SAX 2 so we kill two birds with one stone ? > > This to me seems the most sensible approach. Especially when SAX2 has had so > much discussion and is potentially so close to completion. The C++/SAX > discussion is just starting and could go on for months. It might as well be > built around an up-to-date standard. > > -- > Uche Ogbuji I'd second this, with the thought that most of the effort would concentrate on finishing the SAX2 interface itself before spending the potential "months" on the C++/SAX implementation. Tom Passin xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthew at praxis.cz Sun Dec 5 16:28:02 1999 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:18:22 2004 Subject: Object-oriented serialization (Was Re: Some questions) References: <Pine.GHP.4.21.9912032153220.11243-100000@mail.ilrt.bris.ac.uk> Message-ID: <384A9281.B395AAA@praxis.cz> Dan Brickley wrote: > I believe it will be possible to annotate XML schemas with information > for mapping into (generic or domain specific) application datamodels > such as RDF. I don't think it is right to expect the hard-pressed XML > Schema group to define all these mappings within that working group. 
> But that doesn't matter; all we need is a placeholder for such
> information.

I totally agree. As long as these considerations are being taken into account, I'm sure there will be plenty of people experimenting with various approaches. This will certainly lead to a better understanding of how to address these issues than simply mandating something that was worked out by a committee.

> My understanding of the Cambridge Communique meeting was that we reached
> agreement on just this. See points 1-6 under '3. Observations and
> Recommendations' in http://www.w3.org/TR/1999/NOTE-schema-arch-19991007
<snip>

The need to develop an abstract schema for representing objects and properties is very clear; one of the problems people have with understanding RDF is that this need seems so obvious that they assume XML already fills it. The real question is whether a separate RDF syntax is the appropriate way to do this. I see a lot of value in seeing this information as an extension of the information currently provided in an XML schema (i.e. basically a serialization of the XML infoset). The overlap is great because both RDF schemas and XML schemas are working with the same basic informational units (assuming you accept the mapping of class -> element type).

The RDF model also seems awfully complex for normal mortals. If the stated target were knowledge management specialists, then there is clearly an important niche market for a very complete mechanism for specifying semantic relationships between resources. If we are talking about the standard mechanism for object interchange on the Web, a simpler mechanism (adding, for example, only the notion of properties and strong datatyping) implemented inside an XML schema has a much greater chance of being widely accepted. Of course, I'm only guessing that XML schemas will be very widespread anyway, so there's plenty of room for disagreement.

> Sure, you could do this.
My hunch is that the urge to do this won't be
> as strong when we have more abstract (objects and properties) interfaces
> to XML content, rather than our current APIs that obsess on detail of
> particular serialisations rather than on what those serialisations have
> told us about the objects. If we could get to a world where generic
> rather than domain interfaces are useful to even 10% instead of 5% of
> applications (to borrow your figure), that'd be a huge win.

Interesting insight. I see your point, but I also see this supporting the argument for making any XML instance a potential object by using the associated schema for conveying information about object properties. This would mean that there would be a "new DOM" that only works on valid instances. If you do have a schema, it should be possible to exploit this directly by having better generic interfaces, rather than trying to treat well-formed and valid instances in the same way.

> There is also a need to know the objects'n'properties view of the data
> without going to fetch (or having advance knowledge of) the
> syntactic schema or serialisation policy. RDF's
> initial syntax was one approach; there have been and will be
> others. The Microsoft folks were for a while throwing around some
> interesting ideas on mapping more 'colloquial' XML syntax into directed
> labelled graphs. There's a version at http://www.biztalk.org/Resources/canonical.asp
> for example.

Cool, I will have to take a much closer look at that. It seems to be very close to what I am talking about. Thanks for the tip.

> I've also heard that some folks want to use it for structured hypertext
> documents...
>
> (One consequence of XML's document heritage is that document order is
> generally treated as meaningful and in need of preservation. This can be
> a pain in the butt for data-centric apps.)

Fair enough, but the potential for object interchange is what is getting people really excited.
Nothing about having object facilities in the XML schema language precludes the use of XML for structured documents. But if we are talking about a web "vision", the potential for easy interchange of data between applications is more likely to have a revolutionary impact than the use of structured documents and stylesheets. I also think that many of these object facilities will actually turn out to be very useful for what are normally considered to be documents.

> I don't see a conflict here. RDF is happy with multiple ways of shipping
> I'd love to see examples of an annotated XML Schema that shows how to
> derive an objects'n'properties view of instance data.

I am going to have to write up something about this. Stay tuned...

Matthew

From matthew at praxis.cz Sun Dec 5 16:32:19 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz> <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com> <38480038.6028FEF2@praxis.cz> <014a01bf3dba$dfff38c0$eb020a0a@bowstreet.com>
Message-ID: <384A9384.678B366C@praxis.cz>

James Tauber wrote:
> Are you achieving this by expressing how certain element types
relate to
> other element types and to concepts? A semantic network?
>
> If so, you are still ultimately relating the elements to concepts you are
> probably going to define by human prose or running code.
>
> I'm not arguing with this idea. I think it probably has some promise. But
> the real semantics are ultimately introduced into the system by agreed-to
> concepts that aren't expressed via schemata. A schema is part of the
> picture, but not the whole.
>
> I'll go back and read your Web Vision post.

As long as human beings are the only plausible "end consumers" of these documents, their semantics will always be determined ultimately by fuzzy things like intentions and expectations. The semantic constraints I am talking about are one step away from these "ultimate" semantics; they tell you that an integer contained in a given element cannot be greater than 100, but they don't tell you why. These are still semantics to me and they provide tremendous value when you want to process a broad range of documents generically (which might, for example, involve generating an application-specific interface for any arbitrary schema).

Matthew
From ricko at allette.com.au Sun Dec 5 17:45:48 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <003101bf3f4c$1b467550$09f96d8c@NT.JELLIFFE.COM.AU>

From: Matthew Gertner <matthew@praxis.cz>
>Dan Brickley wrote:
>> I believe it will be possible to annotate XML schemas with information
>> for mapping into (generic or domain specific) application datamodels
>> such as RDF. I don't think it is right to expect the hard-pressed XML
>> Schema group to define all these mappings within that working group.
...
>
>I totally agree. As long as these considerations are being taken into
>account, I'm sure there will be plenty of people experimenting with
>various approaches. This will certainly lead to a better understanding
>of how to address these issues than simply mandating something that was
>worked out by a committee.

In this vein, schematron-rdf at http://www.ascc.net/xml/resource/schematron/schematron.html generates RDF documents (currently with bogus XLinks, but you can customize it easily) based on Schematron schemas. In this case, the schema is not converted to RDF, rather the RDF shows which assertions in the schema apply to each element in the instance. This is a rather different use for schemas: as programs for automated annotation.
The thing that became immediately clear from working on it was that RDF is good for arcs (relationships) but grammar-based schemas largely hide these relationships (between elements, attributes, data) behind a few generic but superficial types: containment, sequence, repetition. Schematron assertions now allow a "role" attribute, for labelling classes of arcs. I think developers of other schema languages might also consider this kind of thing: that the connectors between particles of patterns (e.g., compositors in the content models in a grammar-based schema language) should have some role attribute (and documentation?) for labelling their significance. For example, if element A must be followed by element B, to say why.

The nodes that conventional schemas define (e.g. elements and attributes) are interesting, but the arcs between them can also be very interesting for automatic annotation using RDF.

Rick Jelliffe

From ricko at allette.com.au Sun Dec 5 17:56:31 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <005001bf3f4d$9c82e940$09f96d8c@NT.JELLIFFE.COM.AU>

From: Dan Brickley <Daniel.Brickley@bristol.ac.uk>
>(One consequence of XML's document heritage is that document order is
>generally treated as meaningful and in need of preservation. This can be
>a pain in the butt for data-centric apps.)
Perhaps a major part of the problem is that sometimes the document order is meaningful and other times just an artifact of there being no "&" connector in XML content models, and there is no way to decide. And when the order is important, there is no way to label what its significance is; indeed, the same thing is true of every axis including the children and parent axes.

Rick Jelliffe

From wperry at fiduciary.com Sun Dec 5 18:59:45 1999
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun 7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <005001bf3f4d$9c82e940$09f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <384AB61C.D947D256@fiduciary.com>

Is this not precisely the reason that 'behaviour' (or whatever we are eventually to call it) of XLinks is indispensable? Not as a replacement for the document-centric assumptions that text order is meaningful or that the implicit parent-child relationship of element containment is significant, but as the mechanism for specifying (granted, to a perhaps more data-oriented audience) either where these relationships should be explicit, or where they are replaced by explicitly presented alternatives.

Respectfully,

Walter Perry

Rick Jelliffe wrote:
> Perhaps a major part of the problem is that sometimes the document order
> is meaningful and other times just an artifact of there being no "&"
> connector in XML content models, and there is no way to decide.
And
> when the order is important, there is no way to label what its
> significance is; indeed, the same thing is true of every axis including
> the children and parent axes.

From ricko at allette.com.au Sun Dec 5 21:13:14 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <000601bf3f69$16947fd0$3bf96d8c@NT.JELLIFFE.COM.AU>

Yes, except XLinks are specified in instances, not as a schema per se. I hope the XML Schema will have some extension mechanism to allow these kinds of thing, but who knows.

It is true that sequence and containment relations between elements in a content model could be treated as some kind of extended link.

<xlink:extended>
  <xlink:locator href="http://xxx?xpointer=//boy/dirt" role="cause" />
  <xlink:locator href="http://xxx?xpointer=//boy/dirt/followingSibling::mud" role="effect" />
</xlink:extended>

(You could modify schematron-rdf to generate these kinds of XLinks pretty easily.)

But the trouble with attempting to use XLinks to directly declare some part of an XML Schema is there is no way to nicely interact with content models using maxOccurs--if a dog has two eyes and we want to link between them and the two eyes are declared using a single <type name="eye" maxOccur="2" /> then we are sunk. We really want to link in the instance not in the schema. And we cannot use hrefs to the instances because we don't know what the instance document URI is: a URI identifies a particular resource not a class of resources.
XLinks are not designed for use as schema declarations. So I think there needs to be first-class support for this in the schema language itself: in the case of XML Schemas, probably the most feasible thing would be a role attribute (or some equivalent) on groups. There is not much there to hook onto.

Any ideas on this would be welcome, even if just to help me think through the issues for Schematron.

Rick Jelliffe

From: W. E. Perry <wperry@fiduciary.com>
>Is this not precisely the reason that 'behaviour' (or whatever we are eventually to call it)
>of XLinks is indispensable? Not as a replacement for the document-centric assumptions that
>text order is meaningful or that the implicit parent-child relationship of element containment
>is significant, but as the mechanism for specifying (granted, to a perhaps more data-oriented
>audience) either where these relationships should be explicit, or where they are replaced by
>explicitly presented alternatives.
>
>Rick Jelliffe wrote:
>
>> Perhaps a major part of the problem is that sometimes the document order
>> is meaningful and other times just an artifact of there being no "&"
>> connector in XML content models, and there is no way to decide. And
>> when the order is important, there is no way to label what its
>> significance is; indeed, the same thing is true of every axis including
>> the children and parent axes.
From Colas.Nahaboo at sophia.inria.fr Mon Dec 6 00:08:28 1999
From: Colas.Nahaboo at sophia.inria.fr (Colas Nahaboo)
Date: Mon Jun 7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: Your message of "Fri, 03 Dec 1999 15:10:14 EST." <A51F7543E295D2118D6600A024CDB2F71B9D77@MAILPROD>
Message-ID: <199912060005.BAA25394@koala.inria.fr>

Vane Lashua writes:
> What's the point of defining "Point"?

I do not want to define it in XML (or rather, the language to occupy the ecological niche of XML). I just want to transport its data (instance contents).

> "Point" is meaningless by itself -- even
> though it may be syntactically correct, in a context, with normalized attributes
> and values -- without a specific processor that understands what a "Point"
> is.

Of course!!! You want to separate unrelated problems if you want to stay clear of combinatorial complexity explosion... Understanding the semantics of data (classes, the typing system, and more...) belongs to the application (helped by a Schema language if a good one exists), *NOT* to the "transport" XML layer.

If you want to invent a language to express the semantics of objects, well, good luck, but I don't want to wait for your effort to succeed before being able to transport my data contents :-)

I want XML *NOT* to try to understand what it doesn't know about.

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas
From stele at fxtech.com Mon Dec 6 00:36:34 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:22 2004
Subject: simple XML for C++ application data-file I/O
Message-ID: <384B04DA.DCD6BAED@fxtech.com>

I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the solutions I've seen are very simple or straightforward for generic application data I/O (ie. non web, e-commerce, Java-type stuff). In other words, I'm about to roll my own, and would like to gauge interest in a small callback-based API for simple XML I/O.

--
Paul Miller - stele@fxtech.com

From xml-dev at teleo.net Mon Dec 6 00:58:33 1999
From: xml-dev at teleo.net (Patrick Phalen)
Date: Mon Jun 7 17:18:22 2004
Subject: simple XML for C++ application data-file I/O
In-Reply-To: <384B04DA.DCD6BAED@fxtech.com>
References: <384B04DA.DCD6BAED@fxtech.com>
Message-ID: <9912051700340G.00844@quadra.teleo.net>

[Paul Miller, on Sun, 05 Dec 1999]
:: I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the
:: solutions I've seen are very simple or straightforward for generic
:: application data I/O (ie. non web, e-commerce, Java-type stuff).
In
:: other words, I'm about to roll my own, and would like to gauge interest
:: in a small callback-based API for simple XML I/O.

Not sure what you mean. Are you talking about IPC, RPC?
Have you looked at XML-RPC and SOAP?

<Heh. Nine acronyms embedded in one brief msg.>

From stele at fxtech.com Mon Dec 6 01:08:39 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:22 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <9912051700340G.00844@quadra.teleo.net>
Message-ID: <384B0C65.2A6710C0@fxtech.com>

> :: I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the
> :: solutions I've seen are very simple or straightforward for generic
> :: application data I/O (ie. non web, e-commerce, Java-type stuff). In
> :: other words, I'm about to roll my own, and would like to gauge interest
> :: in a small callback-based API for simple XML I/O.

> Not sure what you mean. Are you talking about IPC, RPC?
> Have you looked at XML-RPC and SOAP?

I should have been more clear. I just want to use XML for simple non-web-bound application data files (document files). I need a non-validating parser that I can use to efficiently parse my application data, without all the complexity (and overhead) of something like DOM, but not as general-purpose as expat.

> <Heh. Nine acronyms embedded in one brief msg.>

Yeah, XML has definitely helped spawn plenty of new TLAs.

--
Paul Miller - stele@fxtech.com
From jtauber at jtauber.com Mon Dec 6 01:24:29 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz> <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com> <38480038.6028FEF2@praxis.cz> <014a01bf3dba$dfff38c0$eb020a0a@bowstreet.com> <384A9384.678B366C@praxis.cz>
Message-ID: <00c101bf3f88$bc0cc980$eb020a0a@bowstreet.com>

> The semantic constraints I am
> talking about are one step away from these "ultimate" semantics; they
> tell you that an integer contained in a given element cannot be greater
> than 100, but they don't tell you why. These are still semantics to me

Ah. This is why I have had some difficulty understanding some of what you are saying.

To me, the constraint that an integer cannot be greater than 100 is not semantics. It's syntax.

MyInteger ::= ( '100' | digit{1,2} | '-' digit+ )

or in some more perspicuous grammar:

MyInteger = Integer x : x <= 100

James
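James's grammar can be checked mechanically on the lexical form alone, which is the sense in which he calls the bound syntax. A minimal sketch, assuming the first production is read literally (so leading zeros such as 07 are accepted and any negative integer passes); the function name matches_my_integer is made up for this illustration and does not come from the thread:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): true when `text` matches
//   MyInteger ::= ( '100' | digit{1,2} | '-' digit+ )
// read literally. The check is purely lexical; the digits are never
// converted to a number.
bool matches_my_integer(const std::string& text)
{
    static const std::regex pattern("100|[0-9]{1,2}|-[0-9]+");
    // regex_match succeeds only if the whole string matches the pattern.
    return std::regex_match(text, pattern);
}
```

A schema-driven tool could run a check like this without knowing why the limit is 100, which is the distinction Matthew and James are arguing over.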
From murata.makoto at fujixerox.co.jp Mon Dec 6 02:18:23 1999
From: murata.makoto at fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:18:22 2004
Subject: Problem for mathematically minded XML experts
Message-ID: <199912060220.AA03501@archlute.fujixerox.co.jp>

>I would like to pose the following question to those of you with a
>mathematical bent who have some spare time (I'm not expecting the answer
>quickly - this is a holiday teaser I suspect!).

I gave a talk at XTech'99 on automatic construction of intersection/union/difference of schemata (or DTDs). I even demonstrated my prototype implementation.

>is there a formula that can be used to determine whether the same pair of
>elements are valid in both DTD1 and DTD2? If there is, is there a way to
>determine the difference caused by the following conditions being added:

If the intersection of two schemata (or DTDs) is not empty, there exist such elements.

My slides and annotated log file of my demonstration are available at:

http://www.geocities.com/ResearchTriangle/Lab/6259/XTech99/index.htm

You might want to retrieve this single file which contains all HTML pages and image files.

http://www.geocities.com/ResearchTriangle/Lab/6259/XTech99/xtech99.zip

An introduction to hedge automata is available at:

http://www.geocities.com/ResearchTriangle/Lab/6259/hedge_nice.pdf

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata.makoto@fujixerox.co.jp
From sdr at camsoft.com Mon Dec 6 02:31:48 1999
From: sdr at camsoft.com (Stewart Rubenstein)
Date: Mon Jun 7 17:18:22 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <9912051700340G.00844@quadra.teleo.net> <384B0C65.2A6710C0@fxtech.com>
Message-ID: <384B2008.A8E62B7D@camsoft.com>

I was easily able to use XML for exactly this sort of thing. For reading, I used James Clark's expat parser with Andy Dent's expatpp wrapper for C++, and it dropped in quite easily. You can get them both from <http://www.highway1.com.au/adsoftware/expatpp.htm>.

Writing XML is almost too easy to bother getting help for. You do have to take some care if you're going to be dealing with text beyond US-ASCII. Fortunately, my OS's - MacOS and Windows - both have fairly decent Unicode support now. My application already has an object tree, so I just wrote the following in the base class, and implemented the obvious virtual functions in the subclasses that can exist in the tree:

    #include <ostream>  // std::ostream, std::endl

    void CDXObject::XMLWrite(std::ostream &sink) const
    {
        // First write the opening tag and the attributes.
        sink << "<" << XMLObjectName() << std::endl;

        // The id is the only totally generic attribute
        if (m_objectID != 0)
            sink << " " << kCDXML_id << "=\"" << m_objectID << "\"" << std::endl;

        // This is overridden by subclasses to write any object-specific attributes
        XMLWriteAttributes(sink);

        // With no content and no contained objects, close as an empty element;
        // otherwise write the content, the contained objects, and the end tag.
        if (!XMLNeedToWriteContent() && m_contents.empty())
            sink << "/>";
        else
        {
            sink << ">";
            XMLWriteContent(sink);
            for (CDXObjectContentMap::const_iterator i = m_contents.begin(); i != m_contents.end(); ++i)
                GetObject(i)->XMLWrite(sink); // write each of the contained objects
            sink << "</" << XMLObjectName() << ">";
        }
    }

Paul Miller wrote:
>
> > :: I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the
> > :: solutions I've seen are very simple or straightforward for generic
> > :: application data I/O (ie. non web, e-commerce, Java-type stuff). In
> > :: other words, I'm about to roll my own, and would like to gauge interest
> > :: in a small callback-based API for simple XML I/O.
>
> > Not sure what you mean. Are you talking about IPC, RPC?
> > Have you looked at XML-RPC and SOAP?
>
> I should have been more clear. I just want to use XML for simple
> non-web-bound application data files (document files). I need a
> non-validating parser that I can use to efficiently parse my application
> data, without all the complexity (and overhead) of something like DOM,
> but not as general-purpose as expat.
>
> > <Heh. Nine acronyms embedded in one brief msg.>
>
> Yeah, XML has definitely helped spawn plenty of new TLAs.
>
> --
> Paul Miller - stele@fxtech.com
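The XMLWrite routine above puts attribute values between double quotes, so whatever writes string-valued attributes has to escape markup characters first. A minimal sketch of such a step, assuming the text is already in the output encoding (bytes outside US-ASCII pass through untouched); xml_escape_attribute is a hypothetical helper and is not part of Stewart's code:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: escape a string for use inside a
// double-quoted XML attribute value.
std::string xml_escape_attribute(const std::string& value)
{
    std::string out;
    out.reserve(value.size());
    for (char c : value) {
        switch (c) {
            case '&': out += "&amp;";  break;  // must be escaped everywhere
            case '<': out += "&lt;";   break;  // not allowed raw in attribute values
            case '>': out += "&gt;";   break;  // optional, escaped for symmetry
            case '"': out += "&quot;"; break;  // would otherwise end the value
            default:  out += c;        break;  // everything else is copied as-is
        }
    }
    return out;
}
```

A subclass hook such as XMLWriteAttributes could pass each string value through a function like this before streaming it to the sink.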
From jjc at jclark.com Mon Dec 6 03:40:20 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:18:22 2004
Subject: expat meant to be restartable?
References: <38482AC7.95973198@fxtech.com>
Message-ID: <38488A77.C4ACEDA@jclark.com>

Paul Miller wrote:
>
> I don't think it is, but I want to check. I'd like to be able to reuse
> an XML_Parser after I've called XML_Parse with isFinal set to 1.
> Basically I want to go back and parse a subset of the original file,
> using modified starting buffer pointer and length, but it doesn't seem
> to work (I get a JUNK_AFTER_DOC_ELEMENT error). I would like to avoid
> creating a new parser for each element subtree I scan.

Expat doesn't support that.

James
From msf at mds.rmit.edu.au Mon Dec 6 03:47:55 1999
From: msf at mds.rmit.edu.au (Michael Fuller)
Date: Mon Jun 7 17:18:22 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <14408.2610.245842.199581@localhost.localdomain>; from David Megginson on Fri, Dec 03, 1999 at 01:21:38PM -0500
References: <14408.2610.245842.199581@localhost.localdomain>
Message-ID: <19991206144718.D3543@io.mds.rmit.edu.au>

On Fri, Dec 03, 1999 at 01:21:38PM -0500, David Megginson wrote:
[Re: whether to work on SAX2 or on SAX/C++]
> I can think of a few reasons that the world might desperately be
> waiting for SAX2:
>
> 1. To get some kind of standard Namespace support (or at least a way
> to tell whether a parser has Namespace support built in).
>
> 2. To query parser features in general.
>
> 3. To get at the stuff that SAX 1.0 doesn't report, like comments,
> CDATA boundaries, and DTD declarations.
>
> I'm very interested in hearing other opinions. Having a standard
> streaming interface stimulated a lot of development of reusable Java
> XML processing components, and I'd like to see the same thing happen
> in C++, but I need to hear what other people think the priorities
> should be.

#1 clearly is important, if only to ensure that SAX remains a desirable and viable interface. If application developers or parser writers start to walk away from SAX due to a lack of namespace support, then SAX will rapidly die.

#3 is vital for many XML *processing* applications. If you want to provide a SAX interface to an XML database server that must be able to round-trip documents, SAX 1.0 isn't enough.
If you're writing an editor, or an XSLT engine, or a compound document manager, or a transport protocol like WBXML, you want or need to know about things that are in the SAX2 LexicalHandler (e.g. CDATA sections, comments), NamespaceHandler, and DeclHandler.

For other applications, #3 isn't relevant. But that's the value of #2: parser writers can implement the features and support the properties they wish to, and application writers can selectively invoke that functionality.

As it happens, I'm in the process of implementing a SAX interface for a couple of SIM-related projects. We need, I think, the functionality that SAX2 provides. Given that our code base is in C++, I guess my vote is for both: a stable SAX2 and a standard C++ definition.

But having taken a look at SAX2, not much seems to be wrong with it. Whereas there are already close to a dozen SAX/C++ variants, and climbing. *That* trend needs to be stepped on, and quickly, before it gets out of hand.

Michael
____________________________________________ http://www.mds.rmit.edu.au/~msf/
Multimedia Databases Group, RMIT, Australia.
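Point #2 above, querying parser features, is what lets an application find out up front whether a parser reports the items from #3. A minimal sketch of that pattern, assuming a made-up C++ interface: FeatureAwareParser, supportsFeature, ToyParser, canRoundTrip, and the "example:" feature names are all hypothetical and are not taken from any SAX2 draft:

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical interface; every name here is invented for illustration.
class FeatureAwareParser {
public:
    virtual ~FeatureAwareParser() {}
    // True when the parser implements the named optional feature.
    virtual bool supportsFeature(const std::string& name) const = 0;
};

// A toy parser that simply advertises a fixed set of feature names.
class ToyParser : public FeatureAwareParser {
public:
    explicit ToyParser(const std::set<std::string>& features)
        : features_(features) {}

    bool supportsFeature(const std::string& name) const override {
        return features_.count(name) != 0;
    }

private:
    std::set<std::string> features_;
};

// Application-side check: a round-tripping tool needs both comments
// and CDATA section boundaries to be reported.
bool canRoundTrip(const FeatureAwareParser& parser) {
    return parser.supportsFeature("example:comments")
        && parser.supportsFeature("example:cdata-boundaries");
}
```

An editor or database front end could call something like canRoundTrip once at start-up and refuse a parser that lacks what it needs, while simpler applications ignore the question entirely.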
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msf at mds.rmit.edu.au Mon Dec 6 04:04:47 1999 From: msf at mds.rmit.edu.au (Michael Fuller) Date: Mon Jun 7 17:18:22 2004 Subject: SAX/C++: Changes for C++ In-Reply-To: <14406.59075.218048.437305@localhost.localdomain>; from David Megginson on Thu, Dec 02, 1999 at 04:38:11PM -0500 References: <14406.59075.218048.437305@localhost.localdomain> Message-ID: <19991206150418.E3543@io.mds.rmit.edu.au> On Thu, Dec 02, 1999 at 04:38:11PM -0500, David Megginson wrote: > Here are some of the differences between the SAX/Java interfaces and the > SAX/C++ interfaces: > > - lots of const > - C++ const char * for Java String throughout (and, thus, UTF-8 > instead of UTF-16) > - InputSource doesn't have an equivalent of Java Reader (no getReader > method) I don't mind if the character container is unsigned short or wchar_t (it doesn't really matter if wchar_t is 32 bits on some platforms as it's easy enough to convert to/from where required), but put me down as another vote for UTF-16 rather than UTF-8. Given that the point of Unicode is to support I18N, why choose as a default a format that typically has a 50% size overhead for non-European languages? Many parsers and application happily work internally using UTF-16; why not standardize that as the default SAX character encoding? Suggestion: Do what the Java SAX interface did: optionally provide *both* ByteStream and CharacterStream components in an InputSource object Applications can treat the ByteStream as a stream of bytes whose encoding can either be auto-detected, or is explicitly indicated by the Encoding. 
However, a CharacterStream would always be a sequence of UTF-16 characters. > - SAXException does not allow an embedded exception, because there's > no need to tunnel exceptions in C++ (you can always throw any > exception) Unless you use throw() lists in function declarations; as did the Java spec. In which case, you need to be able to embed exceptions... > - DocumentHandler::characters and DocumentHandler::ignorableWhitespace > don't need the 'start' argument, since they can be passed a pointer > to the start position in an existing array (that's not possible in > Java) Yup. > - HandlerBase omitted, since the classes can contain their own default > implementations I think this has been covered by others; if we define SAX/C++ using abstract classes, then we need HandlerBase and the Impl classes back for convenience. > - I haven't figured out what to do with Parser::setLocale yet Michael ____________________________________________ http://www.mds.rmit.edu.au/~msf/ Multimedia Databases Group, RMIT, Australia. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wperry at fiduciary.com Mon Dec 6 05:58:20 1999 From: wperry at fiduciary.com (W. E. Perry) Date: Mon Jun 7 17:18:22 2004 Subject: Object-oriented serialization (Was Re: Some questions) References: <000601bf3f69$16947fd0$3bf96d8c@NT.JELLIFFE.COM.AU> Message-ID: <384B5077.47DA8C84@fiduciary.com> Ah, but the nuts and bolts of specific *behaviour* over links will have to be defined, ultimately as procedural code of some sort, in some generally invocable form. 
The logic (case, sequential, conditional) leading to the code which implements that behaviour might well--and probably ought to be--expressed as XML text, but at some point, from a leaf node of an XML document expressing a decision tree, it will be necessary to invoke procedural code to implement a defined behaviour. That procedural code might--and probably ought to be--parameterized by XML and designed to return XML, but of itself, as procedural code, it is opaque to the XML which invokes it. That procedural code, implementing a specific behaviour, is the 'class of resource' which you are looking for (as its opacity to the instance demonstrates). The behaviour expressed in that procedural code is invoked via a particular URI, but a unique process is instantiated only in the scope and context of the current document, and presumably only through parameterization specific to the instance, passed to generally available code. In fact, anything which might reasonably be described as behaviour is generally available to XML processing only as a 'class of resource': that generally available class must not be confused with its particular instantiation, for an instance invocation of that behaviour via a particular URI does not impair the availability of that URI--and of the behaviour it addresses--to other invocations from different contexts. Rick Jelliffe wrote: > Yes, except XLinks are specified in instances, not as a schema per se. > I hope the XML Schema will have some extension mechanism to > allow these kinds of thing, but who knows. > > It is true that sequence and containment relations between elements > in a content model could be treated of as some kind of extended link. [snip] > And we cannot use hrefs to the instances because we don't know > what the instance document URI is: a URI identifies a particular > resource not a class of resources. XLinks are not designed for > us as schema declarations. 
> > So I think there needs to be first-class support for this in the > schema language itself: in the case of XML Schemas, probably > the most possible thing would be a role attribute (or some equivalent) > on groups. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msf at mds.rmit.edu.au Mon Dec 6 07:19:04 1999 From: msf at mds.rmit.edu.au (Michael Fuller) Date: Mon Jun 7 17:18:22 2004 Subject: SAX/C++: UTF-8 v UTF-16 In-Reply-To: <38472FE3.D3BB22BC@jclark.com>; from James Clark on Fri, Dec 03, 1999 at 09:50:11AM +0700 References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> Message-ID: <19991206181830.A11576@io.mds.rmit.edu.au> James Clark wrote: > David Megginson wrote: > > 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility > > with most existing C++ code. > I would say there was at least as much C++ code using UTF-16 as using UTF-8. [...] > There are a couple of possible solutions: > > 1. A lo-tech solution. Provide a SAXChar typedef [...] > > 2. A hi-tech solution. [use templates] 3. Use a similar solution to the Java spec: provide both a ByteStream and a CharacterStream in InputSource, which has two benefits. One, it is consistent with the Java interface, which can't be a *bad* thing. Two, it frees us to define the CharacterStream explicitly as a conduit for UTF-16 encoded data, whilst allowing parsers/applications the freedom to use the ByteStream for data that is encoded in whatever format desired. The encoding can either be auto-detected, or can be explicitly identified using the InputSource setEncoding()/getEncoding() member function. 
This means going back to the two streams and the getEncoding()/setEncoding() methods of the original Java spec. This really seems like a Good Thing; I liked the look of it in the Java interface; why not use it here also? > If you feel that one needs to be mandated, I would pick UTF-16. Agreed. Michael ____________________________________________ http://www.mds.rmit.edu.au/~msf/ Multimedia Databases Group, RMIT, Australia. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msf at mds.rmit.edu.au Mon Dec 6 07:29:26 1999 From: msf at mds.rmit.edu.au (Michael Fuller) Date: Mon Jun 7 17:18:22 2004 Subject: SAX/C++: First interface draft In-Reply-To: <38474BAF.AF4CFF2D@jclark.com>; from James Clark on Fri, Dec 03, 1999 at 11:48:47AM +0700 References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> Message-ID: <19991206182900.F3543@io.mds.rmit.edu.au> On Fri, Dec 03, 1999 at 11:48:47AM +0700, James Clark wrote: > Here's another draft, with this change and a few other minor changes; [...] > - solve the UTF-8/UTF-16 problem by having two namespaces: As I've suggested elsewhere, this can also be (partially) addressed by providing both a CharacterStream and a ByteStream. 
That would change the InputSource definition to:

class InputSource {
public:
  virtual SAXString getPublicId () const = 0;
  virtual void setPublicId (SAXString publicId) = 0;
  virtual SAXString getSystemId () const = 0;
  virtual void setSystemId (SAXString systemId) = 0;
  virtual SAXString getEncoding () const = 0;
  virtual void setEncoding (SAXString encoding) = 0;
  virtual std::istream * getByteStream () const = 0;
  virtual void setByteStream (std::istream * in) = 0;
  // Issue: is wistream the best C++ choice for a Unicode "character" stream,
  // given that sizeof(wchar_t) need not be 2 (eg, under
  // Sun/Solaris CC)?
  virtual std::wistream * getCharacterStream () const = 0;
  virtual void setCharacterStream (std::wistream * in) = 0;
private:
  void operator delete (void *);
};

> Discussion points:
>
> - Would it be better to typedef SAXString to the Standard C++ string
> class (ie std::basic_string<SAXChar>)?

Also: The Java definition explicitly indicates what exceptions may be thrown throughout the interface. Should C++ exception specifiers be used to mirror those semantics? If that's the case, we probably also need to add embedded exceptions back into the SAXException class, a la:

  virtual std::exception& getException() const = 0;
  virtual SAXString toString() const = 0;

General query: should there be heavier use of const and "&" amongst the various functions' parameter declarations and return values?

Michael
____________________________________________
http://www.mds.rmit.edu.au/~msf/
Multimedia Databases Group, RMIT, Australia.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sb at metis.no Mon Dec 6 08:35:08 1999 From: sb at metis.no (Steinar Bang) Date: Mon Jun 7 17:18:22 2004 Subject: SAX/C++: First interface draft In-Reply-To: David Megginson's message of "03 Dec 1999 09:16:28 -0500" References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <m3r9h4hzyb.fsf@localhost.localdomain> Message-ID: <whaenoo4b2.fsf@viffer.oslo.metis.no> >>>>> David Megginson <david@megginson.com>: > Actually, I don't see any strong argument not to provide empty inline > implementations for the handler callbacks: Inlined virtuals will cause an instantiation of the vtable and the function bodies in _every_ compilation unit the header file is included into (ref. Scott Meyers "More Effective C++", Item 24 pp 118). This is a size cost that can be easily avoided. I'm also coming more and more to the conclusion that even trivial non-virtual function bodies should not be inlined, _unless_ there is a clear performance reason to do so. This is because even trivial inlined functions occasionally need to be changed, and changing something in a header file causes recompilation. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sb at metis.no Mon Dec 6 08:46:24 1999 From: sb at metis.no (Steinar Bang) Date: Mon Jun 7 17:18:22 2004 Subject: simple XML for C++ application data-file I/O In-Reply-To: Paul Miller's message of "Sun, 05 Dec 1999 20:07:49 -0500" References: <384B04DA.DCD6BAED@fxtech.com> <9912051700340G.00844@quadra.teleo.net> <384B0C65.2A6710C0@fxtech.com> Message-ID: <wh66yco3s5.fsf@viffer.oslo.metis.no> >>>>> Paul Miller <stele@fxtech.com>: > I should have been more clear. I just want to use XML for simple > non-web-bound application data files (document files). I need a > non-validating parser that I can use to efficiently parse my > application data, without all the complexity (and overhead) of > something like DOM, but not as general-purpose as expat. What I did in a similar situation was to take James Clark's expat http://www.jclark.com/xml/expat.html and wrap it in a SAXoid interface. Today I would have done the same thing with James Clark's modified version of David Megginson's proposal instead of my own reinterpretation of the Java SAX into C++ (which is what I now have). Alternatively you can take a look at Xerces-C from the Apache consortium: http://xml.apache.org/xerces-c/index.html It has its own SAX interface. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ssahuc at imediation.com Mon Dec 6 09:03:42 1999 From: ssahuc at imediation.com (Sebastien Sahuc) Date: Mon Jun 7 17:18:22 2004 Subject: SAX/C++ vs. SAX2 Message-ID: <C10B7E3A3AC3D211804E0000B45EDA84404FC5@mail.imediation.com> > Michael Fuller wrote : > But having taken a look at SAX2, not much seems to be wrong with it. > *That* trend needs to be stepped on, and quickly, before it > gets out of hand. Completely agree with it. And why not focus on SAX2 for both Java and C++ as someone has already pointed out ? Sebastien > > Michael > ____________________________________________ > http://www.mds.rmit.edu.au/~msf/ > Multimedia Databases Group, RMIT, Australia. > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: > http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To unsubscribe, mailto:majordomo@ic.ac.uk the following message; > unsubscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the > following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sb at metis.no Mon Dec 6 09:06:35 1999 From: sb at metis.no (Steinar Bang) Date: Mon Jun 7 17:18:22 2004 Subject: SAX/C++: First interface draft In-Reply-To: Steinar Bang's message of "03 Dec 1999 14:14:54 +0100" References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whu2m0qi7l.fsf@viffer.oslo.metis.no> Message-ID: <wh1z90o2ui.fsf@viffer.oslo.metis.no> >>>>> Steinar Bang <sb@metis.no>: >>>>> James Clark <jjc@jclark.com>: >> - Would it be better to typedef SAXString to the Standard C++ string >> class (ie std::basic_string<SAXChar>)? > An argument for using > typedef const SAXChar* SAXString; > is that you get late construction of the basic_string<>, ie. you don't > create it until you have to (eg. when using it to do a lookup in an > STL map<>). After thinking over the weekend, I'm changing my vote on this issue. I think the convenience of using basic_string<> way outweighs the cost advantages of lazy conversion, since in most cases the first thing that would be done in the DocumentHandler (or wherever) would be to create a basic_string<> for the appropriate character size anyway. But the UTF-16 string should not be a straight typedef. We should derive from basic_string<SAXChar> to get a char* constructor that would take a UTF-8-encoded string. This is for ease of use with character constants. Hm... we may also need an operator<<() for byte streams, that would do UTF-8 encoding...? (fewer implementations have templated streams than have basic_string<>, and we may want to use a byte stream rather than a wide stream for I/O anyway.)
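
For illustration only, the UTF-8 encoding step that such an operator<<() would need might be sketched roughly as follows. This is not taken from any of the posted drafts: the helper name toUtf8 is hypothetical, SAXChar is assumed to hold one UTF-16 code unit (as in the SAX_UTF16 namespace proposed in this thread), and replacing unpaired surrogates with U+FFFD is just one possible policy.

```cpp
#include <cstddef>
#include <string>

// Assumption: one UTF-16 code unit, as in the proposed SAX_UTF16 namespace.
typedef unsigned short SAXChar;

// Hypothetical helper (not part of any posted SAX/C++ draft):
// encode `length` UTF-16 code units as a UTF-8 byte string.
// Unpaired surrogates are replaced with U+FFFD.
std::string toUtf8(const SAXChar* s, std::size_t length)
{
    std::string out;
    for (std::size_t i = 0; i < length; ++i) {
        unsigned long cp = s[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < length && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high one
        }

        if (cp < 0x80) {            // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {    // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {  // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                    // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

An operator<<() for byte streams could then simply write toUtf8(str.data(), str.size()) to the stream, so that only the SAXString class needs to know about the internal UTF-16 representation.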
Putting these pieces together, the SAX.h file would look something like this:

#ifndef __SAX_HXX
#define __SAX_HXX

// Forward declarations of std::istream and std::ostream
#include <iosfwd>
// std::string and std::basic_string<>
#include <string>

namespace SAX_UTF8 {
  typedef char SAXChar;
  typedef std::string SAXString;
#include "SAXDecl.h"
}

namespace SAX_UTF16 {
  typedef unsigned short SAXChar;
  class SAXString : public std::basic_string<SAXChar> {
  public:
    // Construct from a UTF-8 encoded string
    SAXString(const char* utf8);
  };
  // Write the string to a byte stream, UTF-8 encoded
  std::ostream& operator<<(std::ostream&, const SAXString&);
#include "SAXDecl.h"
}

#endif

..or something...

From sb at metis.no Mon Dec 6 09:09:46 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <whwvqsmo4v.fsf@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> - solve the UTF-8/UTF-16 problem by having two namespaces: a SAX_UTF8
> and a SAX_UTF16 namespace (since you're using std::istream, you are
> assuming compiler support for namespaces); this will work nicely with
> namespace aliases (eg namespace SAX = SAX_UTF8).

I have a practical problem with using std::istream on the MSVC++ platform. Since the Standard C++ Library as delivered with MSVC++ 5 and 6 is broken, we're using Standards<ToolKit> from ObjectSpace to provide us with the parts of the Standard C++ Library we're using. And ObjectSpace Standards<ToolKit> is not compatible with the Standard C++ Library iostreams of MSVC++.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msf at mds.rmit.edu.au Mon Dec 6 09:14:21 1999 From: msf at mds.rmit.edu.au (Michael Fuller) Date: Mon Jun 7 17:18:23 2004 Subject: SAX2/C++ interface draft [Was: Re: Request for Discussion: SAX 1.0 in C++] In-Reply-To: <006501bf3dca$a4767ce0$4a5eedc1@arp01>; from Richard Anderson on Fri, Dec 03, 1999 at 08:11:53PM -0000 References: <14406.58446.675568.388482@localhost.localdomain><199912031808.LAA14313@localhost.localdomain> <14408.2087.817611.250771@localhost.localdomain> <006501bf3dca$a4767ce0$4a5eedc1@arp01> Message-ID: <19991206201359.G3543@io.mds.rmit.edu.au> On Fri, Dec 03, 1999 at 08:11:53PM -0000, Richard Anderson wrote: > How about focusing on SAX/2, and making the first C/C++ SAX interface > actually SAX 2 so we kill two birds with one stone ? Good idea! Here's a [hasty] conversion of the SAX2 Java classes into C++, along the lines of David and James' posted SAX 1.0/C++ headers. I've used exception specifiers for consistency with the original Java.
// SAX2Decl.h
// Note: Parser and SAXException are used below as base classes and in
// exception specifiers, so their full SAX 1.0 definitions must already
// be visible here; the forward declarations alone are not enough.

namespace SAX {

class Configurable;
class DTDHandler;
class DocumentHandler;
class EntityResolver;
class ErrorHandler;
class InputSource;
class Parser;
class SAXException;

class Configurable {
public:
  virtual void setFeature(SAXString featureId, bool state)
    throw(SAXException) = 0;
  virtual bool getFeature(SAXString featureId)
    throw(SAXException) = 0;
  virtual void setProperty(SAXString propertyId, void * value)
    throw(SAXException) = 0;
  virtual void * getProperty(SAXString propertyId)
    throw(SAXException) = 0;
private:
  void operator delete (void *);
};

class ConfigurableParserAdapter : public Parser, public Configurable {
public:
  ConfigurableParserAdapter(Parser& parser);

  // SAX 1.0 methods
  void setLocale(const char * locale) throw(SAXException);
  void setEntityResolver(EntityResolver& resolver);
  void setDTDHandler(DTDHandler& handler);
  void setDocumentHandler(DocumentHandler& handler);
  void setErrorHandler(ErrorHandler& handler);
  void parse(const InputSource& source) throw(SAXException);
  void parse(SAXString systemId) throw(SAXException);

  // SAX2 methods
  void setFeature(SAXString featureId, bool state) throw(SAXException);
  bool getFeature(SAXString featureId) throw(SAXException);
  void setProperty(SAXString propertyId, void * value) throw(SAXException);
  void * getProperty(SAXString propertyId) throw(SAXException);
private:
  void operator delete (void *);
};

class DeclHandler {
public:
  virtual void elementDecl(SAXString name, SAXString model)
    throw(SAXException) = 0;
  virtual void attributeDecl(SAXString eName, SAXString aName,
                             SAXString type, SAXString valueDefault,
                             SAXString value)
    throw(SAXException) = 0;
  virtual void internalEntityDecl(SAXString name, SAXString value)
    throw(SAXException) = 0;
  virtual void externalEntityDecl(SAXString name, SAXString publicId,
                                  SAXString systemId)
    throw(SAXException) = 0;
private:
  void operator delete (void *);
};

class LexicalHandler {
public:
  virtual void startDTD(SAXString name, SAXString publicId,
                        SAXString systemId)
    throw(SAXException) = 0;
  virtual void endDTD() throw(SAXException) = 0;
  virtual void startEntity(SAXString name) throw(SAXException) = 0;
  virtual void endEntity(SAXString name) throw(SAXException) = 0;
  virtual void startCDATA() throw(SAXException) = 0;
  virtual void endCDATA() throw(SAXException) = 0;
  virtual void comment(const SAXChar * ch, int length)
    throw(SAXException) = 0;
private:
  void operator delete (void *);
};

class NamespaceHandler {
public:
  virtual void startNamespaceDeclScope(SAXString prefix, SAXString uri)
    throw(SAXException) = 0;
  virtual void endNamespaceDeclScope(SAXString prefix)
    throw(SAXException) = 0;
private:
  void operator delete (void *);
};

class SAXNotRecognizedException : public SAXException {
public:
  SAXNotRecognizedException(SAXString message);
private:
  void operator delete (void *);
};

class SAXNotSupportedException : public SAXException {
public:
  SAXNotSupportedException(SAXString message);
private:
  void operator delete (void *);
};

}

Michael
--
http://www.mds.rmit.edu.au/~msf/
Multimedia Databases Group, RMIT, Australia.

From larsga at garshol.priv.no Mon Dec 6 09:25:51 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun 7 17:18:23 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
Message-ID: <m3yab8qv35.fsf@ifi.uio.no>

* Walter Underwood
|
| Comments are welcome.

First thought: this is fine for very simple uses, but for more complex uses something along the lines of the robots.txt file would be very nice. How about a variant PI that can point to a robots.rdf resource? Second thought: "and the index attribute must be first". This is nice for implementors, but is likely to clash with the expectations of users, and the cost of more generality is very low for implementors. Why not follow the <URL: http://www.w3.org/TR/xml-stylesheet/ > style of specifying PI pseudo-attributes? Also: The robot PI, says the spec, "should be in the internal subset (not in an external DTD or parameter entity). Since robots may be non-validating, a robots PI in the external subset might not be seen by the robot." I think this is misleading, since "the internal subset" is usually short for "the internal DTD subset". A better way of putting it might be "It should be in the document entity (not in an external entity, including the external DTD subset and external parameter entities). Since robots may skip external entities, PIs in external entities might not be seen by the robot." However, I don't think this will do either. Entities are what the storage structure of SGML/XML documents is composed of, and I think this spec needs to take some sort of stand as to how entities map to WWW resources, and which entities the PI is really talking about. One way is to say that every resource is an entity, and every web-accessible entity is a resource.
Then one might say that the robots PI refers to a) the entity in which it is found b) the entity in which it is found and all entities included by this entity via entity references, regardless of any robots PIs in these included entities c) the entity in which it is found, and if "follow" is set to yes, all entities included by this entity via entity references, regardless of any robots PIs in these included entities d) the entity in which it is found, and if "sub-entities" is set to yes, all entities included by this entity via entity references, regardless of any robots PIs in these included entities Once one agrees on a policy I think this is worth a subsection in the spec, regardless of the choice made. b) is probably the easiest to implement, since many APIs do not expose entity structure. It might not be the best choice, though. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at garshol.priv.no Mon Dec 6 09:31:29 1999 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Mon Jun 7 17:18:23 2004 Subject: A processing instruction for robots In-Reply-To: <001201bf3e1f$074601c0$099918d1@docuverse1> References: <001201bf3e1f$074601c0$099918d1@docuverse1> Message-ID: <m3wvqsqutu.fsf@ifi.uio.no> * Don Park | | Walter, | Could you elaborate your decision to use PI rather than element(s)? I'm not Walter, but to me this has the obvious advantage that it can be used completely orthogonally to the document contents and the software used to process the document for non-indexing purposes. 
Of course, it works poorly with SML, and IMHO this (and the "Associating stylesheets with XML documents" recommendation) are good arguments for including PIs in SML, even if only before the document element. No doubt there will be other proposals of this sort, and if these are all specified in terms of elements writing application-specific processing software will be hell unless we either start using architectures or mandate the use of namespaces in processing. And even then it might still be hell for various reasons, especially the namespace solution. So IMHO PIs are the right choice for this. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at garshol.priv.no Mon Dec 6 09:33:50 1999 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Mon Jun 7 17:18:23 2004 Subject: Request for Discussion: SAX 1.0 in C++ In-Reply-To: <008301bf3d41$de0c9b80$a82a08d1@tomshp> References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain> <008301bf3d41$de0c9b80$a82a08d1@tomshp> Message-ID: <m3vh6cqupx.fsf@ifi.uio.no> * David Megginson | | Sure -- is there a strong need for a common C interface, though? We | already have Expat's C interface, and I don't know of anyone else in | that space yet. * Thomas B. Passin | | But C is available on most _any_ platform - often for free. So | almost anyone could compile in C but not necessarily in C++. Isn't | rxp done in C? It is, and I think libxml is also. --Lars M. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at garshol.priv.no Mon Dec 6 09:36:46 1999 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Mon Jun 7 17:18:23 2004 Subject: Request for Discussion: SAX 1.0 in C++ In-Reply-To: <001301bf3ec0$d47d7d20$5afbb1cd@tomshp> References: <199912042112.OAA19167@localhost.localdomain> <001301bf3ec0$d47d7d20$5afbb1cd@tomshp> Message-ID: <m3u2lwqukx.fsf@ifi.uio.no> * Thomas B. Passin | | I'd second this, with the thought that most of the effort would | concentrate on finishing the SAX2 interface itself before spending | the potential "months" on the C++/SAX implementation. I agree with this as well. If SAX2 is as close to finished as I think it is, it really should be finished off now, to be followed by a C++ translation of SAX 2.0. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at garshol.priv.no Mon Dec 6 09:43:53 1999 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Mon Jun 7 17:18:23 2004 Subject: SAX/C++ vs. SAX2 In-Reply-To: <14408.2610.245842.199581@localhost.localdomain> References: <14408.2610.245842.199581@localhost.localdomain> Message-ID: <m3so1gqu92.fsf@ifi.uio.no> * David Megginson | | I'd like to hear what others think on this issue. 
My feeling is that SAX2 is very important (namespaces, generalized querying and extensibility and lexical information are all very important to some kinds of applications), and furthermore so close to completion that it should go before a C++ binding, which might take a long time to complete. The added benefit is of course that the C++ bindings for SAX 1.0 and 2.0 can be done simultaneously. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From nisse at lysator.liu.se Mon Dec 6 09:54:38 1999 From: nisse at lysator.liu.se (Niels Möller) Date: Mon Jun 7 17:18:23 2004 Subject: parser asynch input (Was: SAX/C++: First interface draft) In-Reply-To: Steinar Bang's message of "03 Dec 1999 14:20:35 +0100" References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whpuwoqhy4.fsf_-_@viffer.oslo.metis.no> Message-ID: <nnyab8cs38.fsf@sanna.lysator.liu.se> Steinar Bang <sb@metis.no> writes: > I would like to add operations that can be used to "push" data to the > parser asynchronously: I also think this is important (and it's a sadly missing feature of the IBM's xml4c parser, which provides another SAX-like C++ API). But is there any reason not to use the same InputSource abstraction for the fragment blocks? 
Say, something like

class Parser
{
public:
  virtual void setLocale (const char *) = 0;
  virtual void setEntityResolver (EntityResolver &resolver) = 0;
  virtual void setDTDHandler (DTDHandler &handler) = 0;
  virtual void setDocumentHandler (DocumentHandler &handler) = 0;
  virtual void setErrorHandler (ErrorHandler &handler) = 0;
  virtual void parse (SAXString systemId) = 0;
  virtual void parse (const InputSource &input) = 0;
! virtual void parseFragment (const InputSource &input) = 0;
! virtual void parseEnd() = 0;
private:
  void operator delete (void *);
};

The idea is that a document is the catenation of one or more fragments
(e.g. blocks of data that are read from a socket). parse(source) would
be equivalent to parseFragment(source); parseEnd().

/Niels

From vilya at nag.co.uk Mon Dec 6 10:12:25 1999
From: vilya at nag.co.uk (Vilya Harvey)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++ vs. SAX2
References: <14408.2610.245842.199581@localhost.localdomain> <m3so1gqu92.fsf@ifi.uio.no>
Message-ID: <384B8C32.22805488@nag.co.uk>

Sebastien Sahuc wrote:
>
> > Michael Fuller wrote :
> > But having taken a look at SAX2, not much seems to be wrong with it.
> > *That* trend needs to be stepped on, and quickly, before it
> > gets out of hand.
>
> Completely agree with it. And why not focus on SAX2 for both Java and
> C++ as someone has already pointed out ?

Just a thought: why not take a leaf out of the DOM's book and write
the canonical version of the SAX interfaces in a language-neutral
format like IDL?
That way, bindings to a number of languages (including, but not
limited to, C++ and Java) can be trivially derived by using the
appropriate IDL-to-whatever converter.

Vil.
--
Vilya Harvey <vilya@nag.co.uk>     Wilkinson House    Mob: +44 961 106 505
Computational Mathematics Group    Jordan Hill Road   Wk:  +44 1865 511 245
NAG Limited                        Oxford UK OX2 8DR  Fax: +44 1865 311 205

From sb at metis.no Mon Dec 6 10:25:44 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:23 2004
Subject: parser asynch input (Was: SAX/C++: First interface draft)
In-Reply-To: nisse@lysator.liu.se's message of "06 Dec 1999 10:54:19 +0100"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whpuwoqhy4.fsf_-_@viffer.oslo.metis.no> <nnyab8cs38.fsf@sanna.lysator.liu.se>
Message-ID: <wh7lismkmb.fsf@viffer.oslo.metis.no>

>>>>> nisse@lysator.liu.se (Niels Möller):

> Steinar Bang <sb@metis.no> writes:
>> I would like to add operations that can be used to "push" data to the
>> parser asynchronously:

> I also think this is important (and it's a sadly missing feature of
> the IBM's xml4c parser, which provides another SAX-like C++ API).

> But is there any reason not to use the same InputSource abstraction
> for the fragment blocks? Say, something like

> class Parser
> {
> public:
[snip!]
> ! virtual void parseFragment (const InputSource &input) = 0;
> ! virtual void parseEnd() = 0;
> private:
>   void operator delete (void *);
> };

Hm... I think for me at least, this will cause an extra copy of the
fragment before parsing.
If I have a buffer, and wrap an strstream around it, I would still
need to read the entire fragment from the istream into another buffer
before feeding it to expat. Or would it be more efficient to do a
loop on the stream and put the buffer's contents char by char into
expat...?

Being able to put a buffer directly into the parser is the most
efficient way of doing things, from the way we currently handle
different file formats in our application.

We have a map from MIME types to pointers to instances of a class
called NetStreamFactory:

class NetStreamFactory {
public:
    virtual ~NetStreamFactory();
    virtual NetStream* newStream(const Url* url = 0) = 0;
};

not surprisingly, these factories are used to create instances of
subclasses of NetStream (subclasses handling XML, and our old file
format, as well as decoding image formats like PNG and JPEG):

class NetStream {
public:
    virtual ~NetStream();
    virtual void setReadOnly(bool readOnly = true);
    virtual void putBlock(const char* buf, unsigned long len,
                          bool entireFile = false) = 0;
    virtual void eof();
};

(The idea with the "entireFile" argument to putBlock, is that I can
avoid doing buffering of the data for the NetStream classes that need
the entire file (our old format which uses a recursive descent parser,
and our current JPEG decoder) for the case where I'm reading in the
file from the local file system. Also for the case of data arriving
on the net, I'm delivering the buffer read from the network as is, to
the XML parser, without doing an extra copy).
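The block-pushing interface described above can be sketched as follows.
This is only an illustration under stated assumptions, not the Metis
code: NetStream is trimmed to the two calls discussed, and
StartTagCounter is a hypothetical consumer that is not a real XML
parser (it merely counts '<' characters that open a start tag, and
would be fooled by comments or CDATA). The point it tries to show is
that a push consumer has to carry state across putBlock calls, because
a block boundary can fall anywhere, even between '<' and the element
name.

```cpp
#include <cstddef>

// Trimmed version of the NetStream interface quoted above.
class NetStream {
public:
    virtual ~NetStream() {}
    virtual void putBlock(const char* buf, unsigned long len,
                          bool entireFile = false) = 0;
    virtual void eof() {}
};

// Hypothetical consumer (an assumption for illustration only):
// counts start tags in whatever blocks are pushed into it.
class StartTagCounter : public NetStream {
public:
    StartTagCounter() : count_(0), afterLt_(false), done_(false) {}

    // Consumes the caller's buffer in place; nothing is copied.
    virtual void putBlock(const char* buf, unsigned long len,
                          bool /*entireFile*/ = false) {
        for (unsigned long i = 0; i < len; ++i) {
            const char c = buf[i];
            if (afterLt_) {
                // "</", "<!" and "<?" do not open an element.
                if (c != '/' && c != '!' && c != '?') ++count_;
                afterLt_ = false;
            } else if (c == '<') {
                // Remember the '<' in case the block ends right here.
                afterLt_ = true;
            }
        }
    }

    virtual void eof() { done_ = true; }

    unsigned long count() const { return count_; }
    bool finished() const { return done_; }

private:
    unsigned long count_;
    bool afterLt_;   // state carried across block boundaries
    bool done_;
};
```

Pushing a document as one block, or as two blocks split at an
arbitrary offset, should give the same count; a real push parser has
to offer the same guarantee for the full XML grammar.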
From john.aldridge at informatix.co.uk Mon Dec 6 11:30:43 1999
From: john.aldridge at informatix.co.uk (John Aldridge)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++: First interface draft
In-Reply-To: <wh1z90o2ui.fsf@viffer.oslo.metis.no>
References: <Steinar Bang's message of "03 Dec 1999 14:14:54 +0100"> <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whu2m0qi7l.fsf@viffer.oslo.metis.no>
Message-ID: <3.0.6.32.19991206113002.00ad23f0@mailhost>

At 10:06 06/12/99 +0100, Steinar Bang <sb@metis.no> wrote:
>After thinking over the weekend, I'm changing my vote on this issue.
>I think the convenience of using basic_string<> way outweighs the
>cost advantages of lazy conversion,

Agreed.

>But the UTF-16 string should not be a straight typedef. We should
>derive from basic_string<SAXChar> to get a char* constructor that
>would take a UTF-8-encoded string. This is for ease of use with
>character constants.

I disagree with this, though. It's not that much of a hardship to add
an "L" before your character constants; certainly not enough to
warrant subclassing a class without a virtual dtor (and duplicating
all the constructors and all the functions which return a string as a
function result).

I think I'd go for using straight std::wstring, and define the
characters in those wstrings to be UTF-16 encoded (whatever the
current C locale says). Leave it up to the application either to set
a locale in which wchar_t is UTF-16 (in which case the RTL functions
will behave sensibly), or not (in which case the application will have
to hand crank some things).
--
Cheers,
John

From david at megginson.com Mon Dec 6 11:40:40 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++: UTF-8 v UTF-16
In-Reply-To: <19991206181830.A11576@io.mds.rmit.edu.au>
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> <19991206181830.A11576@io.mds.rmit.edu.au>
Message-ID: <14411.41098.673804.538770@localhost.localdomain>

Michael Fuller writes:

 > James Clark wrote:
 > > David Megginson wrote:
 > > > 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
 > > >    with most existing C++ code.
 > > I would say there was at least as much C++ code using UTF-16 as
 > > using UTF-8.
 > [...]
 > > There are a couple of possible solutions:
 > >
 > > 1. A lo-tech solution. Provide a SAXChar typedef [...]
 > >
 > > 2. A hi-tech solution. [use templates]
 >
 > 3. Use a similar solution to the Java spec: provide both a ByteStream
 >    and a CharacterStream in InputSource, which has two benefits.

Unfortunately, the problem is not input from the document but output
to the client.

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/
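One way to keep UTF-8 character constants convenient without
subclassing basic_string, as debated earlier in this thread, is a free
conversion function next to a plain typedef. The sketch below is only
that, a sketch under assumptions: it takes SAXChar to be a 16-bit code
unit and uses the C++11 type char16_t for it (a type that postdates
this discussion, chosen here because wchar_t is 32 bits on some
platforms); fromUtf8 is a hypothetical name, and the decoder assumes
well-formed UTF-8 rather than validating it.

```cpp
#include <string>

typedef char16_t SAXChar;                      // assumption: 16-bit code unit
typedef std::basic_string<SAXChar> SAXString;  // a straight typedef, no subclass

// Hypothetical helper: decode well-formed UTF-8 into UTF-16 code units.
// Malformed input is NOT detected; a real implementation would validate.
SAXString fromUtf8(const char* s) {
    SAXString out;
    while (*s) {
        const unsigned char lead = static_cast<unsigned char>(*s++);
        unsigned long cp;   // code point being assembled
        int extra;          // number of continuation bytes expected
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        while (extra-- > 0 && *s) {
            cp = (cp << 6) | (static_cast<unsigned char>(*s++) & 0x3F);
        }
        if (cp >= 0x10000) {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            out += static_cast<SAXChar>(0xD800 + (cp >> 10));
            out += static_cast<SAXChar>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<SAXChar>(cp);
        }
    }
    return out;
}
```

With something like this, client code could write
fromUtf8("element") where a UTF-16 string is expected, and SAXString
stays an ordinary basic_string with no extra constructors to
duplicate.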
From tpassin at idsonline.com Mon Dec 6 13:09:39 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun 7 17:18:23 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <9912051700340G.00844@quadra.teleo.net> <384B0C65.2A6710C0@fxtech.com>
Message-ID: <001f01bf3feb$b919bf40$a62a08d1@tomshp>

Original :
From: Paul Miller <stele@fxtech.com>

> ...
> I should have been more clear. I just want to use XML for simple
> non-web-bound application data files (document files). I need a
> non-validating parser that I can use to efficiently parse my application
> data, without all the complexity (and overhead) of something like DOM,
> but not as general-purpose as expat.
> ...

I had a task to convert simple spreadsheet data - each row was
complete in itself - to XML. I saved the spreadsheet as a
tab-delimited file, converted it to XML using awk, and did the simple
processing I needed to do using regular expressions in Python. It was
quick and easy.

Tom Passin

From larsga at garshol.priv.no Mon Dec 6 13:33:29 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <384B8C32.22805488@nag.co.uk>
References: <14408.2610.245842.199581@localhost.localdomain> <m3so1gqu92.fsf@ifi.uio.no> <384B8C32.22805488@nag.co.uk>
Message-ID: <m31z90p554.fsf@ifi.uio.no>

* Vilya Harvey
|
| Just a thought: why not take a leaf out of the DOM's book and write
| the canonical version of the SAX interfaces in a language-neutral
| format like IDL?

This may sound like a good idea, but it has its drawbacks in that one
is immediately forced into a lowest common denominator design where it
is impossible to make use of the features that really make each
language what it is.

Also, IDL does not have convenient ways of mapping to C++ streams,
Java InputStream, Python dictionary-like objects and file-like objects
etc.

Another problem is that exceptions are first-class objects in SAX
(which is exploited by the Java and Python mappings), but not in IDL.
Nor are language naming conventions respected. (startElement should
really be startElement (in Java), start_element (in C++, Python, IDL)
and start-element (in Common Lisp/Scheme), and there may even be more
variations.)

As a general reference and statement of intent it might have some
value, but I really think translation should be done by humans. The
main advantage of IDL, cross-process and cross-language
interoperability, is not really all that valuable for SAX anyway.

--Lars M.
From david at megginson.com Mon Dec 6 13:41:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++: First interface draft
In-Reply-To: Steinar Bang's message of "06 Dec 1999 09:34:57 +0100"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <m3r9h4hzyb.fsf@localhost.localdomain> <whaenoo4b2.fsf@viffer.oslo.metis.no>
Message-ID: <m33dtgp4qa.fsf@localhost.localdomain>

Steinar Bang <sb@metis.no> writes:

> >>>>> David Megginson <david@megginson.com>:
>
> > Actually, I don't see any strong argument not to provide empty inline
> > implementations for the handler callbacks:
>
> Inlined virtuals will cause an instantiation of the vtable and the
> function bodies in _every_ compilation unit the header file is
> included into (ref. Scott Meyers "More Effective C++", Item 24 pp
> 118).
>
> This is a size cost that can be easily avoided.

Thanks -- this is the kind of information I was looking for.

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/
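The alternative to inline empty callbacks can be sketched roughly as
below. The handler is deliberately simplified and is an assumption for
illustration (the real SAX callbacks take more arguments, such as an
attribute list), and the two "files" are shown in one listing so the
example stays self-contained. The idea is that the empty defaults are
declared in the header but defined in exactly one translation unit, so
a compiler typically has a single place to emit the vtable and the
function bodies, while clients still override only what they need.

```cpp
// --- DocumentHandler.h (sketch): declarations only, no inline bodies ---
class DocumentHandler {
public:
    virtual ~DocumentHandler();
    virtual void startDocument();
    virtual void endDocument();
    virtual void startElement(const char* name);   // simplified signature
    virtual void endElement(const char* name);     // simplified signature
};

// --- DocumentHandler.cpp (sketch): the empty defaults live here, once ---
DocumentHandler::~DocumentHandler() {}
void DocumentHandler::startDocument() {}
void DocumentHandler::endDocument() {}
void DocumentHandler::startElement(const char*) {}
void DocumentHandler::endElement(const char*) {}

// --- client code: override only the callback of interest ---
class ElementCounter : public DocumentHandler {
public:
    ElementCounter() : elements(0) {}
    virtual void startElement(const char*) { ++elements; }
    int elements;
};
```

From the client's point of view nothing changes compared with inline
defaults: ElementCounter still inherits do-nothing versions of every
callback it does not override.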
From david at megginson.com Mon Dec 6 13:43:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++: First interface draft
In-Reply-To: Steinar Bang's message of "06 Dec 1999 10:09:36 +0100"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whwvqsmo4v.fsf@viffer.oslo.metis.no>
Message-ID: <m3zovonq1i.fsf@localhost.localdomain>

Steinar Bang <sb@metis.no> writes:

> And Objectspace Standards<ToolKit> is not compatible with the Standard
> C++ Library iostreams of MSVC++.

I think that this goes beyond the scope of SAX -- we have to be able
to assume at least a basic level of ANSI-C++ conformance, or else
we'll end up rewriting the whole standard library.

I'm willing not to beat up on the hairier features (like templates),
but we have to be able to count on the basics.

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/
From david at megginson.com Mon Dec 6 13:51:12 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:23 2004
Subject: simple XML for C++ application data-file I/O
In-Reply-To: Paul Miller's message of "Sun, 05 Dec 1999 19:35:38 -0500"
References: <384B04DA.DCD6BAED@fxtech.com>
Message-ID: <m3wvqsnppb.fsf@localhost.localdomain>

Paul Miller <stele@fxtech.com> writes:

> I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the
> solutions I've seen are very simple or straightforward for generic
> application data I/O (ie. non web, e-commerce, Java-type stuff). In
> other words, I'm about to roll my own, and would like to gauge interest
> in a small callback-based API for simple XML I/O.

We tried to keep SAX 1.0 as simple as possible -- how would you
simplify the following further?

  public static void main (String[] args)
  {
    Parser parser = new SomeSAXDriver();
    parser.setDocumentHandler(new MyHandler());
    try {
      parser.parse("http://www.foo.com/foo.xml");
    } catch (SAXException e) {
      // do something!!
    } catch (java.io.IOException e) {
      // parse() can also fail while reading the document
    }
  }

and

  public class MyHandler extends HandlerBase
  {
    public void startElement (String name, AttributeList atts)
    {
      // do something!!
    }

    public void endElement (String name)
    {
      // do something!!
    }

    public void characters (char ch[], int start, int length)
    {
      // do something!!
    }
  }

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/
From larsga at garshol.priv.no Mon Dec 6 13:59:49 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++: UTF-8 v UTF-16
In-Reply-To: <whyabcqijv.fsf@viffer.oslo.metis.no>
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> <008b01bf3d84$b037d650$c5010180@p197> <3847B8F3.D81B8286@jclark.com> <whyabcqijv.fsf@viffer.oslo.metis.no>
Message-ID: <m3emd0npa6.fsf@ifi.uio.no>

* James Clark
|
| Unfortunately wchar_t isn't guaranteed to be UTF-16. Some platforms
| make it 32-bits.

* Steinar Bang
|
| Yep! So I've heard.
|
| Do you have a list of the ones that do this?

gcc 2.95 on Linux does, at least. I don't know what it does on other
platforms.

--Lars M.

From larsga at garshol.priv.no Mon Dec 6 14:06:36 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++: Changes for C++
In-Reply-To: <wh7liwthnr.fsf@viffer.oslo.metis.no>
References: <14406.59075.218048.437305@localhost.localdomain> <wh7liwthnr.fsf@viffer.oslo.metis.no>
Message-ID: <m3d7sknoym.fsf@ifi.uio.no>

* Steinar Bang
|
| I would like to be able to create a "push" stream, ie. something
| similar to a libwww stream, where data that arrives asynchronously
| will just be "pushed" to the parser as they arrive.
|
| expat already supports this, and I use it.

We added support for this as an extension in the Python version of
SAX, since several of the Python parsers support this (xmllib, xmlproc
and pyexpat). This was simply done by adding three methods on the
extended parser interface: reset, feed and close.

For C++ SAX2 this might be done through a property
(http://.../push-stream) which returns a PushStream implementation
with these three methods to allow you to push data into parsers which
support this. Some means of specifying the URL of the document entity
is probably also a good idea, for resolution of relative URLs.

--Lars M.

From stele at fxtech.com Mon Dec 6 14:11:43 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:23 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <m3wvqsnppb.fsf@localhost.localdomain>
Message-ID: <384BC3E2.E526B322@fxtech.com>

> We tried to keep SAX 1.0 as simple as possible -- how would you
> simplify the following further?
>
>     public void startElement (String name, AttributeList atts)
>     {
>       // do something!!
>     }

Here is where I have the problem. This leaves an awful lot up to the
application, still, including handling the proper nesting.
I would like to make the actual parsing of elements more "automatic",
so when a certain element is hit, it calls a function with my object
pointer where I can pick up the parsing from there, then drop back out
to the enclosing XML scope and keep going.

Perhaps what I want to do should be built on SAX instead of expat,
though.

--
Paul Miller - stele@fxtech.com

From eoin_lane at esatclear.ie Mon Dec 6 14:34:22 1999
From: eoin_lane at esatclear.ie (Eoin Lane)
Date: Mon Jun 7 17:18:23 2004
Subject: psgml 1.2.1 problem
Message-ID: <384BC90B.8BA224A4@esatclear.ie>

I'm trying to write a xml doc with emacs configured to use psgml-1.2.1
but am having some problems. I have checked that psgml works with a
simple dtd. However when I use the dtd (document-v10.dtd) below I get
the following error.

~/character.ent line 2 col 12 entity common.att
~/document-v10.dtd line 218 col 29 entity DOCUMENT
~/installing.xml line 3 col 51
Name expected; at: :lang

I wonder could anyone tell me what I am doing wrong. I know the dtd is
correct because I checked it with IBM 4j parser and it validated. It
would be of great benefit to me if I could use the dtd in emacs so any
help would be greatly appreciated.

Eoin.

--
Dr. Eoin Lane
InConn Technologies Ltd.
17 Washington St.
Cork.
Tel.
(021) 271855 Fax (021) 272419 http://inconn.ucc.ie mailto:eoinlane@esatclear.ie -------------- next part -------------- <!-- =================================================================== Apache Documentation DTD (Version 1.0) PURPOSE: This DTD was developed to create a simple yet powerful document type for software documentation for use with the Apache projects. It is an XML-compliant DTD and it's maintained by the Apache XML project. TYPICAL INVOCATION: <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation Vx.yz//EN" "http://xml.apache.org/DTD/document-vxyz.dtd"> where x := major version y := minor version z := status identifier (optional) NOTES: Many of the design patterns used in this DTD were take from the W3C XML Specification DTD edited by Eve Maler <elm@arbortext.com>. Where possible, great care has been used to reutilize HTML tag names to reduce learning efforts and to allow HTML editors to be used for complex authorings like tables and lists. AUTHORS: Stefano Mazzocchi <stefano@apache.org> FIXME: - how can we include char entities without hardwiring them? - should "form" tags be included? - should all style-free HTML 4.0 markup tags be included? - how do we handle the idea of "soft" xlinks? - should we add "soft" links to images? CHANGE HISTORY: 19991121 Initial version. (SM) 19991123 Replaced "res" with more standard "strong" for emphasis. (SM) 19991124 Added "fork" element for window forking behavior. (SM) 19991124 Added "img-inline" element to separate from "img". (SM) 19991129 Removed "affiliation" from "author". (SM) 19991129 Made "author" empty and moved "name|email" as attributes (SM) COPYRIGHT: Copyright (c) 1999 The Apache Software Foundation. Permission to copy in any form is granted provided this notice is included in all copies. Permission to redistribute is granted provided this file is distributed untouched in all its parts and included files. 
==================================================================== --> <!-- =============================================================== --> <!-- Common character entities (included from external file) --> <!-- =============================================================== --> <!-- FIXME (SM): this is hardcoding. Find a better way of doing this possibly using public identifiers of ISO latin char sets --> <!ENTITY % charEntity SYSTEM "characters.ent"> %charEntity; <!-- =============================================================== --> <!-- Userful entitieis for increased DTD readability --> <!-- =============================================================== --> <!ENTITY % text "#PCDATA"> <!-- =============================================================== --> <!-- Entities for general XML compliance --> <!-- =============================================================== --> <!-- Common attributes Every element has an ID attribute (sometimes required, but usually optional) for links, and a Role attribute for extending the useful life of the DTD by allowing authors to make subclasses for any element. %common.att; is for common attributes where the ID is optional, and %common-idreq.att; is for common attributes where the ID is required. --> <!ENTITY % common.att 'id ID #IMPLIED xml:lang NMTOKEN #IMPLIED role NMTOKEN #IMPLIED'> <!ENTITY % common-idreq.att 'id ID #REQUIRED xml:lang NMTOKEN #IMPLIED role NMTOKEN #IMPLIED'> <!-- xml:space attribute =============================================== Indicates that the element contains white space that the formatter or other application should retain, as appropriate to its function. ==================================================================== --> <!ENTITY % xmlspace.att 'xml:space (default|preserve) #FIXED "preserve"'> <!-- def attribute ===================================================== Points to the element where the relevant definition can be found, using the IDREF mechanism. 
%def.att; is for optional def attributes, and %def-req.att; is for required def attributes. ==================================================================== --> <!ENTITY % def.att 'def IDREF #IMPLIED'> <!ENTITY % def-req.att 'def IDREF #REQUIRED'> <!-- ref attribute ===================================================== Points to the element where more information can be found, using the IDREF mechanism. %ref.att; is for optional ref attributes, and %ref-req.att; is for required ref attributes. ================================================================== --> <!ENTITY % ref.att 'ref IDREF #IMPLIED'> <!ENTITY % ref-req.att 'ref IDREF #REQUIRED'> <!-- =============================================================== --> <!-- Entities for XLink compliance --> <!-- =============================================================== --> <!ENTITY % xlink-simple.att 'type (simple|extended|locator|arc) #FIXED "simple" href CDATA #IMPLIED role CDATA #IMPLIED title CDATA #IMPLIED '> <!-- 'xmlns CDATA #FIXED "http://www.w3.org/XML/XLink/0.9" --> <!-- FIXME: brain-dead IE5 has broken support for namespace validation and since I use it for editing I remove this for now --> <!ENTITY % xlink-user-replace.att 'show (new|parsed|replace) #FIXED "replace" actuate (user|auto) #FIXED "user" '> <!ENTITY % xlink-user-new.att 'show (new|parsed|replace) #FIXED "new" actuate (user|auto) #FIXED "user" '> <!ENTITY % xlink-auto-parsed.att 'show (new|parsed|replace) #FIXED "parsed" actuate (user|auto) #FIXED "auto" '> <!-- FIXME (SM): XLink doesn't yet cover the idea of soft links so introducing it here using the same namespace is _somewhat_ illegal. Should we create it own namespace? 
--> <!ENTITY % xlink-soft.att 'mode (hard|soft) #FIXED "soft" '> <!-- =============================================================== --> <!-- Entities for general usage --> <!-- =============================================================== --> <!-- Key attribute ===================================================== Optionally provides a sorting or indexing key, for cases when the element content is inappropriate for this purpose. ==================================================================== --> <!ENTITY % key.att 'key CDATA #IMPLIED'> <!-- Title attributes ================================================== Indicates that the element requires to have a title. ==================================================================== --> <!ENTITY % title.att 'title CDATA #REQUIRED'> <!-- Name attributes ================================================== Indicates that the element requires to have a name. ==================================================================== --> <!ENTITY % name.att 'name CDATA #REQUIRED'> <!-- Email attributes ================================================== Indicates that the element requires to have an email. 
==================================================================== --> <!ENTITY % email.att 'email CDATA #REQUIRED'> <!-- =============================================================== --> <!-- General definitions --> <!-- =============================================================== --> <!-- A person is a general human entity --> <!ELEMENT person EMPTY> <!ATTLIST person %common.att; %name.att; %email.att;> <!-- =============================================================== --> <!-- Content definitions --> <!-- =============================================================== --> <!ENTITY % local.content.mix ""> <!ENTITY % markup "strong|em|code|sub|sup"> <!ENTITY % links "link|connect|jump|fork|anchor"> <!ENTITY % special "br|img"> <!ENTITY % link-content.mix "%text;|%markup;|%special;%local.content.mix;"> <!ENTITY % content.mix "%link-content.mix;|%links;"> <!-- ==================================================== --> <!-- Phrase Markup --> <!-- ==================================================== --> <!-- Strong (typically bold) --> <!ELEMENT strong (%text;)> <!ATTLIST strong %common.att;> <!-- Emphasis (typically italic) --> <!ELEMENT em (%text;)> <!ATTLIST em %common.att;> <!-- Code (typically monospaced) --> <!ELEMENT code (%text;)> <!ATTLIST code %common.att;> <!-- Superscript (typically smaller and higher) --> <!ELEMENT sup (%text;)> <!ATTLIST sup %common.att;> <!-- Subscript (typically smaller and lower) --> <!ELEMENT sub (%text;)> <!ATTLIST sub %common.att;> <!-- FIXME (SM): should we add these HTML 4.0 markups which are style-free? -dfn -samp -kbd -var -cite -abbr -acronym --> <!-- ==================================================== --> <!-- Hypertextual Links --> <!-- ==================================================== --> <!-- hard replacing link (equivalent of <a ...>) --> <!ELEMENT link (%link-content.mix;)*> <!ATTLIST link %common.att; %xlink-simple.att; %xlink-user-replace.att;> <!-- Hard window replacing link (equivalent of <a ... 
target="_top">) --> <!ELEMENT jump (%link-content.mix;)*> <!ATTLIST jump %common.att; %xlink-simple.att; %xlink-user-new.att;> <!-- Hard window forking link (equivalent of <a ... target="_new">) --> <!ELEMENT fork (%link-content.mix;)*> <!ATTLIST fork %common.att; %xlink-simple.att; %xlink-user-new.att;> <!-- Anchor point (equivalent of <a name="...">) --> <!ELEMENT anchor EMPTY> <!ATTLIST anchor %common-idreq.att;> <!-- Soft link between processed pages (no equivalent in HTML) --> <!ELEMENT connect (%link-content.mix;)*> <!ATTLIST connect %common.att; %xlink-simple.att; %xlink-user-replace.att; %xlink-soft.att;> <!-- ==================================================== --> <!-- Specials --> <!-- ==================================================== --> <!-- Breakline Object (typically forces line break) --> <!ELEMENT br EMPTY> <!ATTLIST br %common.att;> <!-- Image Object (typically an inlined image) --> <!-- FIXME (SM): should we have the notion of soft links even here for inlined objects? 
--> <!ELEMENT img EMPTY> <!ATTLIST img src CDATA #REQUIRED alt CDATA #REQUIRED height CDATA #IMPLIED width CDATA #IMPLIED usemap CDATA #IMPLIED ismap (ismap) #IMPLIED %common.att;> <!-- =============================================================== --> <!-- Blocks definitions --> <!-- =============================================================== --> <!ENTITY % local.blocks ""> <!ENTITY % local.lists ""> <!ENTITY % paragraphs "p|source|note|fixme|img-block"> <!ENTITY % tables "table"> <!ENTITY % lists "ol|ul|sl|dl %local.lists;"> <!ENTITY % blocks "%paragraphs;|%tables;|%lists; %local.blocks;"> <!-- ==================================================== --> <!-- Paragraphs --> <!-- ==================================================== --> <!-- Text Paragraph (normally vertically space delimited) --> <!ELEMENT p (%content.mix;)*> <!ATTLIST p %common.att;> <!-- Source Paragraph (normally space is preserved) --> <!ELEMENT source (%content.mix;)*> <!ATTLIST source %common.att; %xmlspace.att;> <!-- Note Paragraph (normally shown encapsulated) --> <!ELEMENT note (%content.mix;)*> <!ATTLIST note %common.att;> <!-- Fixme Paragraph (normally not shown) --> <!ELEMENT fixme (%content.mix;)*> <!-- the "author" attribute should match the "key" attribute of the <author> element --> <!ATTLIST fixme author CDATA #REQUIRED %common.att;> <!-- ==================================================== --> <!-- Tables --> <!-- ==================================================== --> <!ENTITY % cellhalign.att 'align (left|center |right|justify |char) #IMPLIED char CDATA #IMPLIED charoff CDATA #IMPLIED'> <!ENTITY % cellvalign.att 'valign (top|middle |bottom |baseline) #IMPLIED'> <!ENTITY % thtd.att 'abbr CDATA #IMPLIED axis CDATA #IMPLIED headers IDREFS #IMPLIED scope (row |col |rowgroup |colgroup) #IMPLIED rowspan NMTOKEN "1" colspan NMTOKEN "1"'> <!ENTITY % width.att 'width CDATA #IMPLIED'> <!ENTITY % span.att 'span NMTOKEN "1"'> <!-- Table (based on the IETF HTML table standard [RFC1942]) 
--> <!ELEMENT table (caption?, (col*|colgroup*), thead?, tfoot?, tbody+)> <!ATTLIST table %common.att; %width.att; summary CDATA #IMPLIED border CDATA #IMPLIED frame (void|above |below|hsides |lhs|rhs |vsides|box |border) #IMPLIED rules (none|groups |rows|cols |all) #IMPLIED cellspacing CDATA #IMPLIED cellpadding CDATA #IMPLIED> <!ELEMENT caption (%content.mix;)*> <!ATTLIST caption %common.att;> <!ELEMENT colgroup (col)*> <!ATTLIST colgroup %common.att; %span.att; %width.att; %cellhalign.att; %cellvalign.att;> <!ELEMENT col EMPTY> <!ATTLIST col %common.att; %span.att; %width.att; %cellhalign.att; %cellvalign.att;> <!ELEMENT thead (tr)+> <!ATTLIST thead %common.att; %cellhalign.att; %cellvalign.att;> <!ELEMENT tfoot (tr)+> <!ATTLIST tfoot %common.att; %cellhalign.att; %cellvalign.att;> <!ELEMENT tbody (tr)+> <!ATTLIST tbody %common.att; %cellhalign.att; %cellvalign.att;> <!ELEMENT tr (th|td)+> <!ATTLIST tr %common.att; %cellhalign.att; %cellvalign.att;> <!ELEMENT th (%content.mix;)*> <!ATTLIST th %common.att; %thtd.att; %cellhalign.att; %cellvalign.att;> <!ELEMENT td (%content.mix;)*> <!ATTLIST td %common.att; %thtd.att; %cellhalign.att; %cellvalign.att;> <!-- ==================================================== --> <!-- Lists --> <!-- ==================================================== --> <!-- Unordered list (typically bulleted) --> <!ELEMENT ul (li|%lists;)+> <!-- spacing attribute: Use "normal" to get normal vertical spacing for items; use "compact" to get less spacing. The default is dependent on the stylesheet. --> <!ATTLIST ul %common.att; spacing (normal|compact) #IMPLIED> <!-- Ordered list (typically numbered) --> <!ELEMENT ol (li|%lists;)+> <!-- spacing attribute: Use "normal" to get normal vertical spacing for items; use "compact" to get less spacing. The default is dependent on the stylesheet. 
--> <!ATTLIST ol %common.att; spacing (normal|compact) #IMPLIED> <!-- Simple list (typically with no mark) --> <!ELEMENT sl (li|%lists;)+> <!ATTLIST sl %common.att;> <!-- List item --> <!ELEMENT li (%content.mix;|%lists;)*> <!ATTLIST li %common.att;> <!-- Definition list (typically two-column) --> <!ELEMENT dl (dt,dd)+> <!ATTLIST dl %common.att;> <!-- Definition term --> <!ELEMENT dt (%content.mix;)*> <!ATTLIST dt %common.att;> <!-- Definition description --> <!ELEMENT dd (%content.mix;)*> <!ATTLIST dd %common.att;> <!-- ==================================================== --> <!-- Special Blocks --> <!-- ==================================================== --> <!-- Image Block (typically a separated and centered image) --> <!-- FIXME (SM): should we have the notion of soft links even here for inlined objects? --> <!ELEMENT img-block EMPTY> <!ATTLIST img-block src CDATA #REQUIRED alt CDATA #REQUIRED height CDATA #IMPLIED width CDATA #IMPLIED usemap CDATA #IMPLIED ismap (ismap) #IMPLIED %common.att;> <!-- =============================================================== --> <!-- Document --> <!-- =============================================================== --> <!ELEMENT document (header?, body, footer?)> <!ATTLIST document %common.att;> <!-- ==================================================== --> <!-- Header --> <!-- ==================================================== --> <!ENTITY % local.headers ""> <!ELEMENT header (title, subtitle?, version?, type?, authors, notice*, abstract? 
%local.headers;)> <!ATTLIST header %common.att;> <!ELEMENT title (%text;)> <!ATTLIST title %common.att;> <!ELEMENT subtitle (%text;)> <!ATTLIST subtitle %common.att;> <!ELEMENT version (%text;)> <!ATTLIST version %common.att;> <!ELEMENT type (%text;)> <!ATTLIST type %common.att;> <!ELEMENT authors (person+)> <!ATTLIST authors %common.att;> <!ELEMENT notice (%content.mix;)*> <!ATTLIST notice %common.att;> <!ELEMENT abstract (%content.mix;)*> <!ATTLIST abstract %common.att;> <!-- ==================================================== --> <!-- Body --> <!-- ==================================================== --> <!ENTITY % local.sections ""> <!ENTITY % sections "s1 %local.sections;"> <!ELEMENT body (%sections;)+> <!ATTLIST body %common.att;> <!ELEMENT s1 (s2|%blocks;)*> <!ATTLIST s1 %title.att; %common.att;> <!ELEMENT s2 (s3|%blocks;)*> <!ATTLIST s2 %title.att; %common.att;> <!ELEMENT s3 (s4|%blocks;)*> <!ATTLIST s3 %title.att; %common.att;> <!ELEMENT s4 (%blocks;)*> <!ATTLIST s4 %title.att; %common.att;> <!-- ==================================================== --> <!-- Footer --> <!-- ==================================================== --> <!ENTITY % local.footers ""> <!ELEMENT footer (legal %local.footers;)> <!ELEMENT legal (%content.mix;)*> <!ATTLIST legal %common.att;> <!-- =============================================================== --> <!-- End of DTD --> <!-- =============================================================== --> -------------- next part -------------- <!-- Portions (C) International Organization for Standardization 1986 Permission to copy in any form is granted for use with conforming SGML systems and applications as defined in ISO 8879, provided this notice is included in all copies. --> <!-- Character entity set. 
--> <!-- Latin A --> <!ENTITY nbsp " "> <!-- U+00A0 ISOnum - no-break space = non-breaking space --> <!ENTITY iexcl "¡"> <!-- U+00A1 ISOnum - inverted exclamation mark --> <!ENTITY cent "¢"> <!-- U+00A2 ISOnum - cent sign --> <!ENTITY pound "£"> <!-- U+00A3 ISOnum - pound sign --> <!ENTITY curren "¤"> <!-- U+00A4 ISOnum - currency sign --> <!ENTITY yen "¥"> <!-- U+00A5 ISOnum - yen sign = yuan sign --> <!ENTITY brvbar "¦"> <!-- U+00A6 ISOnum - broken bar = broken vertical bar --> <!ENTITY sect "§"> <!-- U+00A7 ISOnum - section sign --> <!ENTITY uml "¨"> <!-- U+00A8 ISOdia - diaeresis = spacing diaeresis --> <!ENTITY copy "©"> <!-- U+00A9 ISOnum - copyright sign --> <!ENTITY ordf "ª"> <!-- U+00AA ISOnum - feminine ordinal indicator --> <!ENTITY laquo "«"> <!-- U+00AB ISOnum - left-pointing double angle quotation mark = left pointing guillemet --> <!ENTITY not "¬"> <!-- U+00AC ISOnum - not sign --> <!ENTITY shy "­"> <!-- U+00AD ISOnum - soft hyphen = discretionary hyphen --> <!ENTITY reg "®"> <!-- U+00AE ISOnum - registered sign = registered trade mark sign --> <!ENTITY macr "¯"> <!-- U+00AF ISOdia - macron = spacing macron = overline = APL overbar --> <!ENTITY deg "°"> <!-- U+00B0 ISOnum - degree sign --> <!ENTITY plusmn "±"> <!-- U+00B1 ISOnum - plus-minus sign = plus-or-minus sign --> <!ENTITY sup2 "²"> <!-- U+00B2 ISOnum - superscript two = superscript digit two = squared --> <!ENTITY sup3 "³"> <!-- U+00B3 ISOnum - superscript three = superscript digit three = cubed --> <!ENTITY acute "´"> <!-- U+00B4 ISOdia - acute accent = spacing acute --> <!ENTITY micro "µ"> <!-- U+00B5 ISOnum - micro sign --> <!ENTITY para "¶"> <!-- U+00B6 ISOnum - pilcrow sign = paragraph sign --> <!ENTITY middot "·"> <!-- U+00B7 ISOnum - middle dot = Georgian comma = Greek middle dot --> <!ENTITY cedil "¸"> <!-- U+00B8 ISOdia - cedilla = spacing cedilla --> <!ENTITY sup1 "¹"> <!-- U+00B9 ISOnum - superscript one = superscript digit one --> <!ENTITY ordm "º"> <!-- U+00BA ISOnum - masculine 
ordinal indicator --> <!ENTITY raquo "»"> <!-- U+00BB ISOnum - right-pointing double angle quotation mark = right pointing guillemet --> <!ENTITY frac14 "¼"> <!-- U+00BC ISOnum - vulgar fraction one quarter = fraction one quarter --> <!ENTITY frac12 "½"> <!-- U+00BD ISOnum - vulgar fraction one half = fraction one half --> <!ENTITY frac34 "¾"> <!-- U+00BE ISOnum - vulgar fraction three quarters = fraction three quarters --> <!ENTITY iquest "¿"> <!-- U+00BF ISOnum - inverted question mark = turned question mark --> <!ENTITY Agrave "À"> <!-- U+00C0 ISOlat1 - latin capital letter A with grave = latin capital letter A grave --> <!ENTITY Aacute "Á"> <!-- U+00C1 ISOlat1 - latin capital letter A with acute --> <!ENTITY Acirc "Â"> <!-- U+00C2 ISOlat1 - latin capital letter A with circumflex --> <!ENTITY Atilde "Ã"> <!-- U+00C3 ISOlat1 - latin capital letter A with tilde --> <!ENTITY Auml "Ä"> <!-- U+00C4 ISOlat1 - latin capital letter A with diaeresis --> <!ENTITY Aring "Å"> <!-- U+00C5 ISOlat1 - latin capital letter A with ring above = latin capital letter A ring --> <!ENTITY AElig "Æ"> <!-- U+00C6 ISOlat1 - latin capital letter AE = latin capital ligature AE --> <!ENTITY Ccedil "Ç"> <!-- U+00C7 ISOlat1 - latin capital letter C with cedilla --> <!ENTITY Egrave "È"> <!-- U+00C8 ISOlat1 - latin capital letter E with grave --> <!ENTITY Eacute "É"> <!-- U+00C9 ISOlat1 - latin capital letter E with acute --> <!ENTITY Ecirc "Ê"> <!-- U+00CA ISOlat1 - latin capital letter E with circumflex --> <!ENTITY Euml "Ë"> <!-- U+00CB ISOlat1 - latin capital letter E with diaeresis --> <!ENTITY Igrave "Ì"> <!-- U+00CC ISOlat1 - latin capital letter I with grave --> <!ENTITY Iacute "Í"> <!-- U+00CD ISOlat1 - latin capital letter I with acute --> <!ENTITY Icirc "Î"> <!-- U+00CE ISOlat1 - latin capital letter I with circumflex --> <!ENTITY Iuml "Ï"> <!-- U+00CF ISOlat1 - latin capital letter I with diaeresis --> <!ENTITY ETH "Ð"> <!-- U+00D0 ISOlat1 - latin capital letter ETH --> <!ENTITY 
Ntilde "Ñ"> <!-- U+00D1 ISOlat1 - latin capital letter N with tilde --> <!ENTITY Ograve "Ò"> <!-- U+00D2 ISOlat1 - latin capital letter O with grave --> <!ENTITY Oacute "Ó"> <!-- U+00D3 ISOlat1 - latin capital letter O with acute --> <!ENTITY Ocirc "Ô"> <!-- U+00D4 ISOlat1 - latin capital letter O with circumflex --> <!ENTITY Otilde "Õ"> <!-- U+00D5 ISOlat1 - latin capital letter O with tilde --> <!ENTITY Ouml "Ö"> <!-- U+00D6 ISOlat1 - latin capital letter O with diaeresis --> <!ENTITY times "×"> <!-- U+00D7 ISOnum - multiplication sign --> <!ENTITY Oslash "Ø"> <!-- U+00D8 ISOlat1 - latin capital letter O with stroke = latin capital letter O slash --> <!ENTITY Ugrave "Ù"> <!-- U+00D9 ISOlat1 - latin capital letter U with grave --> <!ENTITY Uacute "Ú"> <!-- U+00DA ISOlat1 - latin capital letter U with acute --> <!ENTITY Ucirc "Û"> <!-- U+00DB ISOlat1 - latin capital letter U with circumflex --> <!ENTITY Uuml "Ü"> <!-- U+00DC ISOlat1 - latin capital letter U with diaeresis --> <!ENTITY Yacute "Ý"> <!-- U+00DD ISOlat1 - latin capital letter Y with acute --> <!ENTITY THORN "Þ"> <!-- U+00DE ISOlat1 - latin capital letter THORN --> <!ENTITY szlig "ß"> <!-- U+00DF ISOlat1 - latin small letter sharp s = ess-zed --> <!ENTITY agrave "à"> <!-- U+00E0 ISOlat1 - latin small letter a with grave = latin small letter a grave --> <!ENTITY aacute "á"> <!-- U+00E1 ISOlat1 - latin small letter a with acute --> <!ENTITY acirc "â"> <!-- U+00E2 ISOlat1 - latin small letter a with circumflex --> <!ENTITY atilde "ã"> <!-- U+00E3 ISOlat1 - latin small letter a with tilde --> <!ENTITY auml "ä"> <!-- U+00E4 ISOlat1 - latin small letter a with diaeresis --> <!ENTITY aring "å"> <!-- U+00E5 ISOlat1 - latin small letter a with ring above = latin small letter a ring --> <!ENTITY aelig "æ"> <!-- U+00E6 ISOlat1 - latin small letter ae = latin small ligature ae --> <!ENTITY ccedil "ç"> <!-- U+00E7 ISOlat1 - latin small letter c with cedilla --> <!ENTITY egrave "è"> <!-- U+00E8 ISOlat1 - latin small 
letter e with grave --> <!ENTITY eacute "é"> <!-- U+00E9 ISOlat1 - latin small letter e with acute --> <!ENTITY ecirc "ê"> <!-- U+00EA ISOlat1 - latin small letter e with circumflex --> <!ENTITY euml "ë"> <!-- U+00EB ISOlat1 - latin small letter e with diaeresis --> <!ENTITY igrave "ì"> <!-- U+00EC ISOlat1 - latin small letter i with grave --> <!ENTITY iacute "í"> <!-- U+00ED ISOlat1 - latin small letter i with acute --> <!ENTITY icirc "î"> <!-- U+00EE ISOlat1 - latin small letter i with circumflex --> <!ENTITY iuml "ï"> <!-- U+00EF ISOlat1 - latin small letter i with diaeresis --> <!ENTITY eth "ð"> <!-- U+00F0 ISOlat1 - latin small letter eth --> <!ENTITY ntilde "ñ"> <!-- U+00F1 ISOlat1 - latin small letter n with tilde --> <!ENTITY ograve "ò"> <!-- U+00F2 ISOlat1 - latin small letter o with grave --> <!ENTITY oacute "ó"> <!-- U+00F3 ISOlat1 - latin small letter o with acute --> <!ENTITY ocirc "ô"> <!-- U+00F4 ISOlat1 - latin small letter o with circumflex --> <!ENTITY otilde "õ"> <!-- U+00F5 ISOlat1 - latin small letter o with tilde --> <!ENTITY ouml "ö"> <!-- U+00F6 ISOlat1 - latin small letter o with diaeresis --> <!ENTITY divide "÷"> <!-- U+00F7 ISOnum - division sign --> <!ENTITY oslash "ø"> <!-- U+00F8 ISOlat1 - latin small letter o with stroke = latin small letter o slash --> <!ENTITY ugrave "ù"> <!-- U+00F9 ISOlat1 - latin small letter u with grave --> <!ENTITY uacute "ú"> <!-- U+00FA ISOlat1 - latin small letter u with acute --> <!ENTITY ucirc "û"> <!-- U+00FB ISOlat1 - latin small letter u with circumflex --> <!ENTITY uuml "ü"> <!-- U+00FC ISOlat1 - latin small letter u with diaeresis --> <!ENTITY yacute "ý"> <!-- U+00FD ISOlat1 - latin small letter y with acute --> <!ENTITY thorn "þ"> <!-- U+00FE ISOlat1 - latin small letter thorn --> <!ENTITY yuml "ÿ"> <!-- U+00FF ISOlat1 - latin small letter y with diaeresis --> <!-- Latin Extended-A --> <!ENTITY OElig "Œ"> <!-- U+0152 ISOlat2 - latin capital ligature OE --> <!ENTITY oelig "œ"> <!-- U+0153 ISOlat2 - 
latin small ligature oe --> <!-- ligature is a misnomer, this is a separate character in some languages --> <!ENTITY Scaron "Š"> <!-- U+0160 ISOlat2 - latin capital letter S with caron --> <!ENTITY scaron "š"> <!-- U+0161 ISOlat2 - latin small letter s with caron --> <!ENTITY Yuml "Ÿ"> <!-- U+0178 ISOlat2 - latin capital letter Y with diaeresis --> <!-- Spacing Modifier Letters --> <!ENTITY circ "ˆ"> <!-- U+02C6 ISOpub - modifier letter circumflex accent --> <!ENTITY tilde "˜"> <!-- U+02DC ISOdia - small tilde --> <!-- General Punctuation --> <!ENTITY ensp " "> <!-- U+2002 ISOpub - en space --> <!ENTITY emsp " "> <!-- U+2003 ISOpub - em space --> <!ENTITY thinsp " "> <!-- U+2009 ISOpub - thin space --> <!ENTITY zwnj "‌"> <!-- U+200C RFC 2070 - zero width non-joiner --> <!ENTITY zwj "‍"> <!-- U+200D RFC 2070 - zero width joiner --> <!ENTITY lrm "‎"> <!-- U+200E RFC 2070 - left-to-right mark --> <!ENTITY rlm "‏"> <!-- U+200F RFC 2070 - right-to-left mark --> <!ENTITY ndash "–"> <!-- U+2013 ISOpub - en dash --> <!ENTITY mdash "—"> <!-- U+2014 ISOpub - em dash --> <!ENTITY lsquo "‘"> <!-- U+2018 ISOnum - left single quotation mark --> <!ENTITY rsquo "’"> <!-- U+2019 ISOnum - right single quotation mark --> <!ENTITY sbquo "‚"> <!-- U+201A NEW - single low-9 quotation mark --> <!ENTITY ldquo "“"> <!-- U+201C ISOnum - left double quotation mark --> <!ENTITY rdquo "”"> <!-- U+201D ISOnum - right double quotation mark, --> <!ENTITY bdquo "„"> <!-- U+201E NEW - double low-9 quotation mark --> <!ENTITY dagger "†"> <!-- U+2020 ISOpub - dagger --> <!ENTITY Dagger "‡"> <!-- U+2021 ISOpub - double dagger --> <!ENTITY permil "‰"> <!-- U+2030 ISOtech - per mille sign --> <!ENTITY lsaquo "‹"> <!-- U+2039 ISO prop. - single left-pointing angle quotation mark --> <!-- lsaquo is proposed but not yet ISO standardized --> <!ENTITY rsaquo "›"> <!-- U+203A ISO prop. 
- single right-pointing angle quotation mark --> <!-- rsaquo is proposed but not yet ISO standardized --> <!ENTITY euro "€"> <!-- U+20AC NEW - euro sign --> <!-- Latin Extended-B --> <!ENTITY fnof "ƒ"> <!-- U+0192 ISOtech - latin small f with hook = function = florin --> <!-- Greek --> <!ENTITY Alpha "Α"> <!-- U+0391 - greek capital letter alpha --> <!ENTITY Beta "Β"> <!-- U+0392 - greek capital letter beta --> <!ENTITY Gamma "Γ"> <!-- U+0393 ISOgrk3 - greek capital letter gamma --> <!ENTITY Delta "Δ"> <!-- U+0394 ISOgrk3 - greek capital letter delta --> <!ENTITY Epsilon "Ε"> <!-- U+0395 - greek capital letter epsilon --> <!ENTITY Zeta "Ζ"> <!-- U+0396 - greek capital letter zeta --> <!ENTITY Eta "Η"> <!-- U+0397 - greek capital letter eta --> <!ENTITY Theta "Θ"> <!-- U+0398 ISOgrk3 - greek capital letter theta --> <!ENTITY Iota "Ι"> <!-- U+0399 - greek capital letter iota --> <!ENTITY Kappa "Κ"> <!-- U+039A - greek capital letter kappa --> <!ENTITY Lambda "Λ"> <!-- U+039B ISOgrk3 - greek capital letter lambda --> <!ENTITY Mu "Μ"> <!-- U+039C - greek capital letter mu --> <!ENTITY Nu "Ν"> <!-- U+039D - greek capital letter nu --> <!ENTITY Xi "Ξ"> <!-- U+039E ISOgrk3 - greek capital letter xi --> <!ENTITY Omicron "Ο"> <!-- U+039F - greek capital letter omicron --> <!ENTITY Pi "Π"> <!-- U+03A0 ISOgrk3 - greek capital letter pi --> <!ENTITY Rho "Ρ"> <!-- U+03A1 - greek capital letter rho --> <!ENTITY Sigma "Σ"> <!-- U+03A3 ISOgrk3 - greek capital letter sigma --> <!ENTITY Tau "Τ"> <!-- U+03A4 - greek capital letter tau --> <!ENTITY Upsilon "Υ"> <!-- U+03A5 ISOgrk3 - greek capital letter upsilon --> <!ENTITY Phi "Φ"> <!-- U+03A6 ISOgrk3 - greek capital letter phi --> <!ENTITY Chi "Χ"> <!-- U+03A7 - greek capital letter chi --> <!ENTITY Psi "Ψ"> <!-- U+03A8 ISOgrk3 - greek capital letter psi --> <!ENTITY Omega "Ω"> <!-- U+03A9 ISOgrk3 - greek capital letter omega --> <!ENTITY alpha "α"> <!-- U+03B1 ISOgrk3 - greek small letter alpha --> <!ENTITY beta "β"> <!-- U+03B2 
ISOgrk3 - greek small letter beta --> <!ENTITY gamma "γ"> <!-- U+03B3 ISOgrk3 - greek small letter gamma --> <!ENTITY delta "δ"> <!-- U+03B4 ISOgrk3 - greek small letter delta --> <!ENTITY epsilon "ε"> <!-- U+03B5 ISOgrk3 - greek small letter epsilon --> <!ENTITY zeta "ζ"> <!-- U+03B6 ISOgrk3 - greek small letter zeta --> <!ENTITY eta "η"> <!-- U+03B7 ISOgrk3 - greek small letter eta --> <!ENTITY theta "θ"> <!-- U+03B8 ISOgrk3 - greek small letter theta --> <!ENTITY iota "ι"> <!-- U+03B9 ISOgrk3 - greek small letter iota --> <!ENTITY kappa "κ"> <!-- U+03BA ISOgrk3 - greek small letter kappa --> <!ENTITY lambda "λ"> <!-- U+03BB ISOgrk3 - greek small letter lambda --> <!ENTITY mu "μ"> <!-- U+03BC ISOgrk3 - greek small letter mu --> <!ENTITY nu "ν"> <!-- U+03BD ISOgrk3 - greek small letter nu --> <!ENTITY xi "ξ"> <!-- U+03BE ISOgrk3 - greek small letter xi --> <!ENTITY omicron "ο"> <!-- U+03BF NEW - greek small letter omicron --> <!ENTITY pi "π"> <!-- U+03C0 ISOgrk3 - greek small letter pi --> <!ENTITY rho "ρ"> <!-- U+03C1 ISOgrk3 - greek small letter rho --> <!ENTITY sigmaf "ς"> <!-- U+03C2 ISOgrk3 - greek small letter final sigma --> <!ENTITY sigma "σ"> <!-- U+03C3 ISOgrk3 - greek small letter sigma --> <!ENTITY tau "τ"> <!-- U+03C4 ISOgrk3 - greek small letter tau --> <!ENTITY upsilon "υ"> <!-- U+03C5 ISOgrk3 - greek small letter upsilon --> <!ENTITY phi "φ"> <!-- U+03C6 ISOgrk3 - greek small letter phi --> <!ENTITY chi "χ"> <!-- U+03C7 ISOgrk3 - greek small letter chi --> <!ENTITY psi "ψ"> <!-- U+03C8 ISOgrk3 - greek small letter psi --> <!ENTITY omega "ω"> <!-- U+03C9 ISOgrk3 - greek small letter omega --> <!ENTITY thetasym "ϑ"> <!-- U+03D1 NEW - greek small letter theta symbol --> <!ENTITY upsih "ϒ"> <!-- U+03D2 NEW - greek upsilon with hook symbol --> <!ENTITY piv "ϖ"> <!-- U+03D6 ISOgrk3 - greek pi symbol --> <!-- General Punctuation --> <!ENTITY bull "•"> <!-- U+2022 ISOpub - bullet = black small circle --> <!ENTITY hellip "…"> <!-- U+2026 ISOpub - horizontal 
ellipsis = three dot leader --> <!ENTITY prime "′"> <!-- U+2032 ISOtech - prime = minutes = feet --> <!ENTITY Prime "″"> <!-- U+2033 ISOtech - double prime = seconds = inches --> <!ENTITY oline "‾"> <!-- U+203E NEW - overline = spacing overscore --> <!ENTITY frasl "⁄"> <!-- U+2044 NEW - fraction slash --> <!-- Letterlike Symbols --> <!ENTITY weierp "℘"> <!-- U+2118 ISOamso - script capital P = power set = Weierstrass p --> <!ENTITY image "ℑ"> <!-- U+2111 ISOamso - blackletter capital I = imaginary part --> <!ENTITY real "ℜ"> <!-- U+211C ISOamso - blackletter capital R = real part symbol --> <!ENTITY trade "™"> <!-- U+2122 ISOnum - trade mark sign --> <!ENTITY alefsym "ℵ"> <!-- U+2135 NEW - alef symbol = first transfinite cardinal --> <!-- Arrows --> <!ENTITY larr "←"> <!-- U+2190 ISOnum - leftwards arrow --> <!ENTITY uarr "↑"> <!-- U+2191 ISOnum - upwards arrow --> <!ENTITY rarr "→"> <!-- U+2192 ISOnum - rightwards arrow --> <!ENTITY darr "↓"> <!-- U+2193 ISOnum - downwards arrow --> <!ENTITY harr "↔"> <!-- U+2194 ISOamsa - left right arrow --> <!ENTITY crarr "↵"> <!-- U+21B5 NEW - downwards arrow with corner leftwards = carriage return --> <!ENTITY lArr "⇐"> <!-- U+21D0 ISOtech - leftwards double arrow --> <!ENTITY uArr "⇑"> <!-- U+21D1 ISOamsa - upwards double arrow --> <!ENTITY rArr "⇒"> <!-- U+21D2 ISOtech - rightwards double arrow --> <!ENTITY dArr "⇓"> <!-- U+21D3 ISOamsa - downwards double arrow --> <!ENTITY hArr "⇔"> <!-- U+21D4 ISOamsa - left right double arrow --> <!-- Mathematical Operators --> <!ENTITY forall "∀"> <!-- U+2200 ISOtech - for all --> <!ENTITY part "∂"> <!-- U+2202 ISOtech - partial differential --> <!ENTITY exist "∃"> <!-- U+2203 ISOtech - there exists --> <!ENTITY empty "∅"> <!-- U+2205 ISOamso - empty set = null set = diameter --> <!ENTITY nabla "∇"> <!-- U+2207 ISOtech - nabla = backward difference --> <!ENTITY isin "∈"> <!-- U+2208 ISOtech - element of --> <!ENTITY notin "∉"> <!-- U+2209 ISOtech - not an element of --> <!ENTITY ni "∋"> 
<!-- U+220B ISOtech - contains as member --> <!ENTITY prod "∏"> <!-- U+220F ISOamsb - n-ary product = product sign --> <!ENTITY sum "∑"> <!-- U+2211 ISOamsb - n-ary sumation --> <!ENTITY minus "−"> <!-- U+2212 ISOtech - minus sign --> <!ENTITY lowast "∗"> <!-- U+2217 ISOtech - asterisk operator --> <!ENTITY radic "√"> <!-- U+221A ISOtech - square root = radical sign --> <!ENTITY prop "∝"> <!-- U+221D ISOtech - proportional to --> <!ENTITY infin "∞"> <!-- U+221E ISOtech - infinity --> <!ENTITY ang "∠"> <!-- U+2220 ISOamso - angle --> <!ENTITY and "∧"> <!-- U+2227 ISOtech - logical and = wedge --> <!ENTITY or "∨"> <!-- U+2228 ISOtech - logical or = vee --> <!ENTITY cap "∩"> <!-- U+2229 ISOtech - intersection = cap --> <!ENTITY cup "∪"> <!-- U+222A ISOtech - union = cup --> <!ENTITY int "∫"> <!-- U+222B ISOtech - integral --> <!ENTITY there4 "∴"> <!-- U+2234 ISOtech - therefore --> <!ENTITY sim "∼"> <!-- U+223C ISOtech - tilde operator = varies with = similar to --> <!ENTITY cong "≅"> <!-- U+2245 ISOtech - approximately equal to --> <!ENTITY asymp "≈"> <!-- U+2248 ISOamsr - almost equal to = asymptotic to --> <!ENTITY ne "≠"> <!-- U+2260 ISOtech - not equal to --> <!ENTITY equiv "≡"> <!-- U+2261 ISOtech - identical to --> <!ENTITY le "≤"> <!-- U+2264 ISOtech - less-than or equal to --> <!ENTITY ge "≥"> <!-- U+2265 ISOtech - greater-than or equal to --> <!ENTITY sub "⊂"> <!-- U+2282 ISOtech - subset of --> <!ENTITY sup "⊃"> <!-- U+2283 ISOtech - superset of --> <!ENTITY nsub "⊄"> <!-- U+2284 ISOamsn - not a subset of --> <!ENTITY sube "⊆"> <!-- U+2286 ISOtech - subset of or equal to --> <!ENTITY supe "⊇"> <!-- U+2287 ISOtech - superset of or equal to --> <!ENTITY oplus "⊕"> <!-- U+2295 ISOamsb - circled plus = direct sum --> <!ENTITY otimes "⊗"> <!-- U+2297 ISOamsb - circled times = vector product --> <!ENTITY perp "⊥"> <!-- U+22A5 ISOtech - up tack = orthogonal to = perpendicular --> <!ENTITY sdot "⋅"> <!-- U+22C5 ISOamsb - dot operator --> <!-- Miscellaneous 
Technical --> <!ENTITY lceil "⌈"> <!-- U+2308 ISOamsc - left ceiling = apl upstile --> <!ENTITY rceil "⌉"> <!-- U+2309 ISOamsc - right ceiling --> <!ENTITY lfloor "⌊"> <!-- U+230A ISOamsc - left floor = apl downstile --> <!ENTITY rfloor "⌋"> <!-- U+230B ISOamsc - right floor --> <!ENTITY lang "〈"> <!-- U+2329 ISOtech - left-pointing angle bracket = bra --> <!ENTITY rang "〉"> <!-- U+232A ISOtech - right-pointing angle bracket = ket --> <!-- Geometric Shapes --> <!ENTITY loz "◊"> <!-- U+25CA ISOpub - lozenge --> <!-- Miscellaneous Symbols --> <!ENTITY spades "♠"> <!-- U+2660 ISOpub - black spade suit --> <!ENTITY clubs "♣"> <!-- U+2663 ISOpub - black club suit = shamrock --> <!ENTITY hearts "♥"> <!-- U+2665 ISOpub - black heart suit = valentine --> <!ENTITY diams "♦"> <!-- U+2666 ISOpub - black diamond suit --> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19991206/ade7393b/installing.htm From sb at metis.no Mon Dec 6 14:34:04 1999 From: sb at metis.no (Steinar Bang) Date: Mon Jun 7 17:18:23 2004 Subject: SAX/C++: First interface draft In-Reply-To: David Megginson's message of "06 Dec 1999 08:43:05 -0500" References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whwvqsmo4v.fsf@viffer.oslo.metis.no> <m3zovonq1i.fsf@localhost.localdomain> Message-ID: <whd7skjfzk.fsf@viffer.oslo.metis.no> >>>>> David Megginson <david@megginson.com>: > Steinar Bang <sb@metis.no> writes: >> And Objectspace Standards<ToolKit> is not compatible with the Standard >> C++ Library iostreams of MSVC++. > I think that this goes beyond the scope of SAX -- we have to be able > to assume at least a basic level of ANSI-C++ conformance, or else > we'll end up rewriting the whole standard library. I'm willing not > to beat up on the hairier features (like templates), but we have to > be able to count on the basics. 
Then you have to define what this basic level of ANSI-C++ conformance
consists of: templates (and at what level), namespaces, standard
library components inside namespace std, and which parts of the
standard C++ library can be assumed to exist.

We decided to go for basic conformance with standard C++ when we
started the project I'm working on, in the summer of 1995. We have to
use Standards<ToolKit> precisely because the standard C++ things we
use (such as STL and basic_string<>) that are delivered with MSVC++
are broken.

Our "basic conformance" consists of using STL (extensively) and
string. We've stayed away from namespaces and from exotic iostream
features (codecvt<>, templated streams with wide streams). We've used
templates in our own code without much incident (the compilers are
Sunpro C++, gcc/egcs and MSVC++), but we've moved away from that
because they caused code bloat.

A minor annoyance with the C++ mapping of CORBA IDL is that it used an
exotic and low-priority (among implementers) feature like namespaces,
and disregarded useful standard C++ things like vector<> (for
sequence<>) and std::basic_string<>. I would hope to avoid the same
reasoning for SAX.

From sb at metis.no Mon Dec 6 14:38:20 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++: UTF-8 v UTF-16
In-Reply-To: Lars Marius Garshol's message of "06 Dec 1999 14:59:29 +0100"
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> <008b01bf3d84$b037d650$c5010180@p197> <3847B8F3.D81B8286@jclark.com> <whyabcqijv.fsf@viffer.oslo.metis.no> <m3emd0npa6.fsf@ifi.uio.no>
Message-ID: <wh9038jfsd.fsf@viffer.oslo.metis.no>

>>>>> Lars Marius Garshol <larsga@garshol.priv.no>:

> * James Clark
>>
>> Unfortunately wchar_t isn't guaranteed to be UTF-16. Some platforms
>> make it 32-bits.

> gcc 2.95 on Linux does, at least. I don't know what it does on other
> platforms.

Ugh... that would be one of my target compilers eventually(*). :-/

I saw some discussion on the libstdc++-v3 mailing list about having to
make wchar_t 32 bits wide so that it can hold UCS-4. I didn't know
that this ended up being the case.

(*) I'm currently using 1.2-pre2, because it was the first one that
    worked for me since egcs-1.0.3, and I probably won't upgrade again
    until gcc is reasonably stable. And I'm currently not using
    wstring and wchar_t.

From ray at xmission.com Mon Dec 6 14:40:21 1999
From: ray at xmission.com (Ray Whitmer)
Date: Mon Jun 7 17:18:23 2004
Subject: SAX/C++ vs. SAX2
References: <14408.2610.245842.199581@localhost.localdomain> <m3so1gqu92.fsf@ifi.uio.no> <384B8C32.22805488@nag.co.uk> <m31z90p554.fsf@ifi.uio.no>
Message-ID: <384BCE3F.168A7CB@xmission.com>

Lars Marius Garshol wrote:

> * Vilya Harvey
> |
> | Just a thought: why not take a leaf out of the DOM's book and write
> | the canonical version of the SAX interfaces in a language-neutral
> | format like IDL?
>
> This may sound like a good idea, but it has its drawbacks in that one
> is immediately forced into a lowest common denominator design where it
> is impossible to make use of the features that really make each
> language what they are.

Just to clarify: this would be true if IDL stub generators were being
used with the DOM spec, which is the normal way to use IDL. This is
not how IDL is being used by the DOM specification. It simply forms a
neutral starting point.

[...]

> Nor are language naming conventions respected. (startElement should
> really be startElement (in Java), start_element (in C++, Python, IDL)
> and start-element (in Common Lisp/Scheme) and there may even be more
> variations.

I don't understand your need to promote arbitrary style differences
which have nothing to do with the language, which your example here
seems to demonstrate. I find the statement that startElement should be
start_element in C++ and IDL far from obvious, although it may need to
be true now for legacy reasons.
The mixed casing that Java uses was borrowed from C++ specs, and is
common there.

> As a general reference and statement of intent it might have some
> value, but I really think translation should be done by humans. The
> main advantage feature of IDL, cross-process and cross-language
> interoperability, is not really all that valuable for SAX anyway.

I agree, and this is the philosophy behind the DOM's use of IDL -- let
each binding adapt it as necessary (into a single spec for that
binding, not in as many different ways as desired).

The disadvantage of using IDL is that people will try to use it with
an IDL compiler, and/or neglect to publish a single human-derived
binding for each specific language, so that a variety of mutations
could spring up for a particular language, as has happened in certain
cases with DOM.

Ray Whitmer
ray@xmission.com

From sb at metis.no Mon Dec 6 14:41:03 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:24 2004
Subject: SAX/C++: Changes for C++
In-Reply-To: Lars Marius Garshol's message of "06 Dec 1999 15:06:25 +0100"
References: <14406.59075.218048.437305@localhost.localdomain> <wh7liwthnr.fsf@viffer.oslo.metis.no> <m3d7sknoym.fsf@ifi.uio.no>
Message-ID: <wh4sdwjfon.fsf@viffer.oslo.metis.no>

>>>>> Lars Marius Garshol <larsga@garshol.priv.no>:

> We added support for this as an extension in the Python version of
> SAX, since several of the Python parsers support this (xmllib,
> xmlproc and pyexpat). This was simply done by adding three methods
> on the extended parser interface: reset, feed and close.
> For C++ SAX2 this might be done through a property
> (http://.../push-stream) which returns a PushStream implementation
> with these three methods to allow you to push data into parsers
> which support this.

Either one of these would be fine for me.

> Some means of specifying the URL of the document entity is probably
> also a good idea, for resolution of relative URLs.

...as well as for error message output.

From stele at fxtech.com Mon Dec 6 14:47:41 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:24 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <m3wvqsnppb.fsf@localhost.localdomain>
Message-ID: <384BCC51.936AA275@fxtech.com>

This is what I had in mind. Consider this (contrived) XML data-file, that consists of a Title, Author, and one or more Paragraph elements:

<Document name="mydoc.doc">
  <Title>Sample XML Document</Title>
  <Author>Paul Miller</Author>
  <Paragraph>This is the first paragraph.</Paragraph>
  <Paragraph>This is the second paragraph.</Paragraph>
</Document>

Now, expat and SAX only give you the elements, so you have to keep track of where you are in the document in the element handler yourself. What I have in mind is a nestable set of registered element handlers, implemented as callbacks. The callbacks are static function pointers, since I want a non-intrusive design.

With this example, I assume two primary classes (Document and Paragraph). Although Title and Author are represented as elements here, they are really attributes of the Document object.
Now consider this code to parse it:

    // forward declaration, so ParseDocument can register the handler
    static void sParseDocument(XML::InputStream &in, XML::Element &elem, void *userData);

    void ParseDocument(XML::InputStream &in)
    {
        XML::ElementHandler handlers[] = {
            XML::ElementHandler("Document", sParseDocument),
            XML::ElementHandler::END
        };
        in.Parse(handlers, NULL); // NULL is optional user-data
    }

    static void sParseDocument(XML::InputStream &in, XML::Element &elem, void *userData)
    {
        // query the name attribute
        std::string docName;
        elem.GetAttribute("name", docName);

        // create a new document with this name
        Document *doc = new Document(docName);

        // the element callbacks are static members of Document
        XML::ElementHandler handlers[] = {
            XML::ElementHandler("Title", Document::sParseTitle),
            XML::ElementHandler("Author", Document::sParseAuthor),
            XML::ElementHandler("Paragraph", Document::sParseParagraph),
            XML::ElementHandler::END
        };

        // parse the document elements
        in.Parse(handlers, doc);
    }

    // The three callbacks below are declared static inside class Document;
    // the keyword is not repeated on the out-of-class definitions.
    void Document::sParseTitle(XML::InputStream &in, XML::Element &elem, void *userData)
    {
        Document *doc = (Document *)userData;
        doc->SetTitle(elem.GetData());
    }

    void Document::sParseAuthor(XML::InputStream &in, XML::Element &elem, void *userData)
    {
        Document *doc = (Document *)userData;
        doc->SetAuthor(elem.GetData(), elem.GetAttribute());
    }

    void Document::sParseParagraph(XML::InputStream &in, XML::Element &elem, void *userData)
    {
        Document *doc = (Document *)userData;
        Paragraph *para = new Paragraph;
        para->Parse(in, elem);
        doc->AddParagraph(para);
    }

    void Paragraph::Parse(XML::InputStream &in, XML::Element &elem)
    {
        SetText(elem.GetData());
    }

The major idea here is you register everything up-front, and element-specific callbacks get called to deal with specific elements. You can start up parsing inside an element, so you can nest parsing at the object level.

Comments?

--
Paul Miller - stele@fxtech.com

xml-dev: A list for W3C XML Developers.
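For readers who want to see the dispatch idea run, here is a minimal sketch of nested handler tables. It is not the XML::InputStream API proposed above (which was not yet implemented): the Element struct, HandlerTable, ParseChildren and ParseSample names are all hypothetical, and a hand-built element tree stands in for a real parser such as expat.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed element; a real library would fill
// this in from expat or a SAX event stream.
struct Element {
    std::string name;
    std::string data;
    std::vector<Element> children;
};

// Handlers are plain function pointers plus opaque user data.
typedef void (*Handler)(const Element &elem, void *userData);
typedef std::map<std::string, Handler> HandlerTable;

// Dispatch each child of `parent` to the handler registered under its name.
// Children with no registered handler are skipped.
void ParseChildren(const Element &parent, const HandlerTable &handlers, void *userData) {
    for (const Element &child : parent.children) {
        HandlerTable::const_iterator it = handlers.find(child.name);
        if (it != handlers.end()) {
            it->second(child, userData);
        }
    }
}

struct Document {
    std::string title;
    std::string author;
    std::vector<std::string> paragraphs;
};

// Recover the Document passed through the opaque user-data pointer.
Document &AsDocument(void *userData) {
    assert(userData != nullptr);
    return *static_cast<Document *>(userData);
}

void OnTitle(const Element &e, void *ud) { AsDocument(ud).title = e.data; }
void OnAuthor(const Element &e, void *ud) { AsDocument(ud).author = e.data; }
void OnParagraph(const Element &e, void *ud) { AsDocument(ud).paragraphs.push_back(e.data); }

// Handler for the Document element: registers a nested table for its children.
void OnDocument(const Element &e, void *ud) {
    HandlerTable nested;
    nested["Title"] = OnTitle;
    nested["Author"] = OnAuthor;
    nested["Paragraph"] = OnParagraph;
    ParseChildren(e, nested, ud);
}

Element MakeElement(const std::string &name, const std::string &data) {
    Element e;
    e.name = name;
    e.data = data;
    return e;
}

// Build the sample document by hand and run the top-level dispatch.
Document ParseSample() {
    Element doc = MakeElement("Document", "");
    doc.children.push_back(MakeElement("Title", "Sample XML Document"));
    doc.children.push_back(MakeElement("Author", "Paul Miller"));
    doc.children.push_back(MakeElement("Paragraph", "This is the first paragraph."));
    doc.children.push_back(MakeElement("Paragraph", "This is the second paragraph."));
    doc.children.push_back(MakeElement("Comment", "no handler registered, so skipped"));

    Element root = MakeElement("#root", "");
    root.children.push_back(doc);

    HandlerTable top;
    top["Document"] = OnDocument;

    Document result;
    ParseChildren(root, top, &result);
    return result;
}
```

Because each handler builds its own table before recursing, the same element name can mean different things at different nesting levels, which is the property the proposal is after.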
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From john.aldridge at informatix.co.uk Mon Dec 6 15:01:44 1999 From: john.aldridge at informatix.co.uk (John Aldridge) Date: Mon Jun 7 17:18:24 2004 Subject: SAX/C++: First interface draft In-Reply-To: References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> Message-ID: <3.0.6.32.19991206150103.009a1c10@mailhost> At 15:33 06/12/99 +0100, Steinar Bang wrote: >Our "basic conformance" consists of using STL (extensively) and >string. We've stayed away from namespaces, and exotic iostream >features (code_cvt<>, templated streams with wide streams). We've >used templates in our own code, without much incident (the compilers >are Sunpro C++, gcc/egcs and MSVC++), but we've moved away from that >because they caused code bloat. We're using MSVC 6 here, and basic_string<> seems fine. We use templates extensively (both the STL and our own), and they too give little trouble _except_ when it comes to exporting template instantiations across DLL boundaries, which takes considerable care (but can usually be managed). Namespaces are fine too. Member templates only half work, and are probably worth avoiding; and I've no detailed experience with iostreams, so cannot comment on how safe it is to dig into the murky corners there. I've got a reasonable amount of experience with DEC C++, and it's also in good shape in these regards. I've no recent Unix/g++ experience, though. I think the days of having to avoid large chunks of the C++ standard are largely over, thank heavens. -- Cheers, John xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eoin_lane at esatclear.ie Mon Dec 6 15:02:08 1999 From: eoin_lane at esatclear.ie (Eoin Lane) Date: Mon Jun 7 17:18:24 2004 Subject: PSGML-1.2.1 problems Message-ID: <384BCFC8.25F7721E@esatclear.ie> I'm trying to write a xml doc with emacs configured to use psgml-1.2.1 but am having some problems. I have checked that psgml works with a simple dtd. However when I use the dtd (document-v10.dtd) below I get the following error. ~/character.ent line 2 col 12 entity common.att ~/document-v10.dtd line 218 col 29 entity DOCUMENT ~/installing.xml line 3 col 51 Name expected; at: :lang I wonder could anyone tell me what I am doing wrong. I know the dtd is correct because I checked it with IBM 4j parser and it validated. it would be of great benefit to me if I could use the dtd in emacs so any help would be greatly appreciated. Eoin. -- Dr. Eoin Lane InConn Technologies Ltd. 17 Washington St. Cork. Tel. (021) 271855 Fax (021) 272419 http://www.inconn.ie mailto:eoinlane@esatclear.ie -------------- next part -------------- %charEntity; -------------- next part -------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19991206/06c15776/installing.htm From tshw at capitalmarketscompany.com Mon Dec 6 15:08:37 1999 From: tshw at capitalmarketscompany.com (Shaw Tim) Date: Mon Jun 7 17:18:24 2004 Subject: simple XML for C++ application data-file I/O Message-ID: FWIW, I ended up having different (sub)DocHandlers for the different nesting levels and implementing a Handler stack to push/pop them according to the tags they handled. 
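A handler stack of this kind might look roughly like the sketch below. It is only an illustration under stated assumptions: SubHandler, HandlerStack, Recorder and RunDemo are hypothetical names, the events are fed in by hand rather than by a real SAX parser, and handlers are looked up by tag name alone.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical sub-handler interface: one object handles one kind of subtree.
struct SubHandler {
    virtual ~SubHandler() {}
    virtual void start(const std::string &tag) = 0;
    virtual void end(const std::string &tag) = 0;
};

// Routes start/end events to whichever sub-handler is on top of the stack.
class HandlerStack {
public:
    // Associate a tag name with the handler that owns its subtree.
    void registerHandler(const std::string &tag, SubHandler *handler) {
        registry_[tag] = handler;
    }

    void startElement(const std::string &tag) {
        ++depth_;
        std::map<std::string, SubHandler *>::const_iterator it = registry_.find(tag);
        if (it != registry_.end()) {
            // Entering a registered subtree: push its handler and remember the depth.
            stack_.push_back(Entry{it->second, depth_});
        }
        if (!stack_.empty()) {
            stack_.back().handler->start(tag);
        }
    }

    void endElement(const std::string &tag) {
        if (!stack_.empty()) {
            stack_.back().handler->end(tag);
            // Leaving the element that pushed this handler: pop it.
            if (stack_.back().depth == depth_) {
                stack_.pop_back();
            }
        }
        --depth_;
    }

private:
    struct Entry {
        SubHandler *handler;
        int depth;
    };
    std::map<std::string, SubHandler *> registry_;
    std::vector<Entry> stack_;
    int depth_ = 0;
};

// Test handler that records which handler saw which event.
struct Recorder : SubHandler {
    std::string label;
    std::vector<std::string> *log;
    Recorder(const std::string &l, std::vector<std::string> *lg) : label(l), log(lg) {}
    void start(const std::string &tag) override { log->push_back(label + ":start:" + tag); }
    void end(const std::string &tag) override { log->push_back(label + ":end:" + tag); }
};

// Feed a small hand-written event sequence through the stack.
std::vector<std::string> RunDemo() {
    std::vector<std::string> log;
    Recorder orderHandler("order", &log);
    Recorder itemHandler("item", &log);

    HandlerStack stack;
    stack.registerHandler("order", &orderHandler);
    stack.registerHandler("item", &itemHandler);

    stack.startElement("order");
    stack.startElement("item");
    stack.startElement("name");   // unregistered: goes to the item handler
    stack.endElement("name");
    stack.endElement("item");
    stack.endElement("order");
    return log;
}
```

Looking handlers up by bare tag name means the same tag used in two different contexts resolves to the same handler; the sketch makes no attempt to solve that.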
At least this way you can handle sub-trees fairly simply, and reduce the bulk of code required for situations where you have many tags identifying different sub-trees (and hence semantics). A (minor) problem I had with this was that I looked up the Handlers based on the tag-name - so there's a problem when the same tag is used in different 'contexts'. It would be useful to associate a Handler with a given tag at the parser initialisation level, using some XPath notation to identify the appropriate tag(s). tim > -----Original Message----- > From: Paul Miller [mailto:stele@fxtech.com] > Sent: 06 December 1999 15:11 > To: xml-dev > Subject: Re: simple XML for C++ application data-file I/O > > > > We tried to keep SAX 1.0 as simple as possible -- how would you > > simplify the following further? > > > > public void startElement (String name, AttributeList atts) > > { > > // do something!! > > } > > Here is where I have the problem. This leaves an awful lot up to the > application, still, including handling the proper nesting. I > would like > to make the actual parsing of elements more "automatic", so when a > certain element is hit, it calls a function with my object > pointer where > I can pick up the parsing from there, then drop back out to the > enclosing XML scope and keep going. > > Perhaps what I want to do should be built on SAX instead of expat, > though. > > -- > Paul Miller - stele@fxtech.com > > xml-dev: A list for W3C XML Developers. 
To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and > on CD-ROM/ISBN 981-02-3594-1 > To unsubscribe, mailto:majordomo@ic.ac.uk the following message; > unsubscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the > following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > ********************************************************************* The information in this email is confidential and is intended solely for the addressee(s). Access to this email by anyone else is unauthorised. If you are not an intended recipient, you must not read, use or disseminate the information contained in the email. Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of The Capital Markets Company. http://www.capitalmarketscompany.com *********************************************************************** xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at garshol.priv.no Mon Dec 6 15:09:39 1999 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Mon Jun 7 17:18:24 2004 Subject: simple XML for C++ application data-file I/O In-Reply-To: <384BCC51.936AA275@fxtech.com> References: <384B04DA.DCD6BAED@fxtech.com> <384BCC51.936AA275@fxtech.com> Message-ID: * Paul Miller | | Comments? It looks good, and in fact this was exactly the sort of thing SAX was designed to allow you to do as a layer above SAX. In Java we already have SAXON and MDSAX which both do this kind of thing. 
Python already has one interface of this sort and ezsax is likely to become another.

--Lars M.

From uche.ogbuji at fourthought.com Mon Dec 6 15:10:42 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:18:24 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: Your message of "Mon, 06 Dec 1999 10:13:06 GMT." <384B8C32.22805488@nag.co.uk>
Message-ID: <199912061510.IAA03966@localhost.localdomain>

> Just a thought: why not take a leaf out of the DOM's book and write the
> canonical version of the SAX interfaces in a language-neutral format like
> IDL? That way, bindings to a number of languages (including, but not
> limited to, C++ and Java) can be trivially derived by using the
> appropriate IDL-to-whatever converter.

Shh! That's unwelcome talk around here.

I advocated using IDL for the official SAX definition a while back, but no-one seemed to deem it worth considering. Of course, we've fallen into exactly the sort of trap that language-specific interface definition causes: people translating to another language all do it differently, and the whole set of discussions must be repeated for language Y.

The Python/XML group recently hashed out the details of a Python/DOM binding. Because there is a developed Python/CORBA binding, we knew exactly how to model several key components of the interface. Note that this does not involve taking up _any_ of CORBA's baggage except for interface definition, for which IDL does a brilliant job.
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at docuverse.com Mon Dec 6 15:11:06 1999 From: donpark at docuverse.com (Don Park) Date: Mon Jun 7 17:18:24 2004 Subject: A processing instruction for robots In-Reply-To: Message-ID: <000b01bf3ffc$2fc659e0$099918d1@docuverse1> >* Don Park >| >| Walter, >| Could you elaborate your decision to use PI rather than element(s)? > >I'm not Walter, but to me this has the obvious advantage that it can >be used completely orthogonally to the document contents and the >software used to process the document for non-indexing purposes. IMHO, this line of thinking (aka 'sacred content') forces us to use PI or special attributes for extension of document instances. Poor use of the letter 'X' in XML. Best, Don Park - mailto:donpark@docuverse.com Docuverse - http://www.docuverse.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From uche.ogbuji at fourthought.com Mon Dec 6 15:19:42 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:18:24 2004 Subject: SAX/C++ vs. 
SAX2 In-Reply-To: Your message of "06 Dec 1999 14:31:35 +0100." Message-ID: <199912061519.IAA03998@localhost.localdomain> > | Just a thought: why not take a leaf out of the DOM's book and write > | the canonical version of the SAX interfaces in a language-neutral > | format like IDL? > > This may sound like a good idea, but it has its drawbacks in that one > is immediately forced into a lowest common denominator design where it > is impossible to make use of the features that really make each > language what they are. > > Also, IDL does not have convenient ways of mapping to C++ streams, > Java InputStream, Python dictionary-like objects and file-like objects > etc etc > > Another problem is that exceptions are first-class objects in SAX > (which is exploited by the Java and Python mappings), but not in IDL. > > Nor are language naming conventions respected. (startElement should > really be startElement (in Java), start_element (in C++, Python, IDL) > and start-element (in Common Lisp/Scheme) and there may even be more > variations. > > As a general reference and statement of intent it might have some > value, but I really think translation should be done by humans. The > main advantage feature of IDL, cross-process and cross-language > interoperability, is not really all that valuable for SAX anyway. All these problems you bring up are already being addressed by most language groups in the process of developing a CORBA binding. Do you really see such evil in the C++, Java and python bindings for native construction from IDL? I should repeat that not all aspects of CORBA bindings are useful. For instance, the Java binding for actual distributed components requires an ugly explosion of packages to cope with CORBA's semantics (largely that language's own fault for not supporting multiple implementation sharing). But I don't think these problems plague the simple task of mapping IDL to native signatures. 
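As a rough illustration of mapping an IDL-style declaration to native signatures by hand, consider the sketch below. The IDL fragment in the comment is hypothetical and not taken from any published SAX or DOM specification, and the C++ names (SAXException, DocumentHandler, DepthTracker) are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical IDL-style starting point (not from any published spec):
//
//   exception SAXException { string message; };
//   interface DocumentHandler {
//     void startElement(in string name) raises (SAXException);
//     void endElement(in string name) raises (SAXException);
//   };

// Hand-written mapping: the exception joins the native C++ hierarchy
// instead of being a bare data record.
class SAXException : public std::runtime_error {
public:
    explicit SAXException(const std::string &message) : std::runtime_error(message) {}
};

// Hand-written mapping: "in string" becomes const std::string &.
class DocumentHandler {
public:
    virtual ~DocumentHandler() {}
    virtual void startElement(const std::string &name) = 0;
    virtual void endElement(const std::string &name) = 0;
};

// Example implementation: tracks nesting depth and rejects mismatched end tags.
class DepthTracker : public DocumentHandler {
public:
    void startElement(const std::string &name) override {
        open_.push_back(name);
        if (open_.size() > maxDepth_) {
            maxDepth_ = open_.size();
        }
    }

    void endElement(const std::string &name) override {
        if (open_.empty() || open_.back() != name) {
            throw SAXException("unexpected end tag: " + name);
        }
        open_.pop_back();
    }

    std::size_t maxDepth() const { return maxDepth_; }

private:
    std::vector<std::string> open_;
    std::size_t maxDepth_ = 0;
};
```

The point of the sketch is only that the neutral declaration fixes the operations and their arguments, while the binding author still chooses native string types, naming and exception classes.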
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stele at fxtech.com Mon Dec 6 15:31:57 1999 From: stele at fxtech.com (Paul Miller) Date: Mon Jun 7 17:18:24 2004 Subject: simple XML for C++ application data-file I/O References: <384B04DA.DCD6BAED@fxtech.com> <384BCC51.936AA275@fxtech.com> Message-ID: <384BD6A5.24A60C72@fxtech.com> > The major idea here is you register everything up-front, and > element-specific callbacks get called to deal with specific elements. > You can start up parsing inside an element, so you can nest parsing at > the object level. I should point out that I'm interested in a C++ implementation only. It seems the Java people already have anything XML they could ever want. :-) If there are others interested in what I proposed I'm prepared to whip up an implementation this week. -- Paul Miller - stele@fxtech.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From sb at metis.no Mon Dec 6 15:44:39 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:24 2004
Subject: SAX/C++: First interface draft
In-Reply-To: John Aldridge's message of "Mon, 06 Dec 1999 15:01:03 +0000"
References: <3.0.6.32.19991206150103.009a1c10@mailhost>
Message-ID:

>>>>> John Aldridge:

> We're using MSVC 6 here, and basic_string<> seems fine.

It's not. See e.g.
http://msdn.microsoft.com/visualc/stl/faq.htm#Q4

There are patches for this and other problems and bugs in the Standard Library at
http://www.dinkumware.com/vc_fixes.html
but these fixes won't help with templates that are explicitly instantiated in the C++ runtime DLL.

I spent two weeks before last Christmas trying to lose Standards<ToolKit> when using MSVC++, and I got to the stage where I was able to compile the program and run it a little bit before it crashed, before we decided to cut our losses and went back to Standards<ToolKit>. This is a program that runs without incident on Sunpro 4.2+Standards<ToolKit>, gcc/egcs on Linux, and MSVC++ with Standards<ToolKit>.

Complaints about this state of the Standard C++ library are met with responses along the lines of "MSVC++ is not a standard C++ compiler. It's a Windows compiler".

Quite amazing, really.

However, MS has indicated that MSVC++ 7 may come out with a fixed version of the Standard C++ Library (but I'm not holding my breath waiting for this).

> We use templates extensively (both the STL and our own), and they
> too give little trouble _except_ when it comes to exporting template
> instantiations across DLL boundaries, which takes considerable care
> (but can usually be managed).

It's OK if the instantiated classes don't have any static members. Then you run into having to do this:
http://msdn.microsoft.com/visualc/stl/faq.htm#Q5

> Namespaces are fine too.

Yes. That wasn't my problem. My problem was that std::iostreams are incompatible with Standards<ToolKit> (a failing of Standards<ToolKit>, I agree).

I could make a stab at replacing Standards<ToolKit> with stuff from SGI:
http://www.stlport.org/doc/README.VC++.html
But then it's a question of replacing stuff that works with stuff that maybe works.

[snip!]

> I think the days of having to avoid large chunks of the C++ standard
> are largely over, thank heavens.

In half a year to a year, I expect I'll agree with you.

From vilya at nag.co.uk Mon Dec 6 15:48:32 1999
From: vilya at nag.co.uk (Vilya Harvey)
Date: Mon Jun 7 17:18:24 2004
Subject: SAX/C++ vs. SAX2
References: <14408.2610.245842.199581@localhost.localdomain> <384B8C32.22805488@nag.co.uk>
Message-ID: <384BDACE.44C13FED@nag.co.uk>

Lars Marius Garshol wrote:
>
> * Vilya Harvey
> |
> | Just a thought: why not take a leaf out of the DOM's book and write
> | the canonical version of the SAX interfaces in a language-neutral
> | format like IDL?
>
> This may sound like a good idea, but it has its drawbacks in that one
> is immediately forced into a lowest common denominator design where it
> is impossible to make use of the features that really make each
> language what they are.

Ray Whitmer basically said what I wanted to say in his response to your post (thanks Ray!) so I won't repeat that; just consider me in agreement with the points he made.
:-) > Also, IDL does not have convenient ways of mapping to C++ streams, > Java InputStream, Python dictionary-like objects and file-like objects > etc etc In theory though, you would simply define the functionality you required from a stream (to use your example) in an interface then make use of the appropriate "native" stream class in your implementation. That's not terribly inconvenient, and it needn't be inefficient either in a reasonable implementation. > Another problem is that exceptions are first-class objects in SAX > (which is exploited by the Java and Python mappings), but not in IDL. The only problem I see is that IDL doesn't allow exceptions to have inheritance. That would mean some slight changes to the API (although the implementations could still inherit from one another), but nothing really serious. Other than that, IDL only allows exceptions to declare member data (no methods); I don't see that as a real limitation though, since I have yet to see a *useful* example of an exception object with any methods other than getters/setters (which IDL member data gets mapped to) and printStackTrace(). > Nor are language naming conventions respected. (startElement should > really be startElement (in Java), start_element (in C++, Python, IDL) > and start-element (in Common Lisp/Scheme) and there may even be more > variations. Not everyone using a particular language follows the same naming conventions anyway, so I really don't think that should be a factor. As an aside, I disagree with you about the C++ name: I think it should be startElement not start_element. :-) Also as an aside, I haven't seen any IDL to Lisp or Scheme converters - does such a tool exist? > As a general reference and statement of intent it might have some > value, but I really think translation should be done by humans. I agree with you about this. > The main advantage feature of IDL, cross-process and cross-language > interoperability, is not really all that valuable for SAX anyway. 
I would argue that since SAX appears to be intended as a cross-language API, the main advantage of IDL would be its language neutrality. It would mean that the API would not be developed with the capabilities of one particular language in mind, as has happened with SAX 1.0. Of course whether or not that would be a real advantage is debatable but it would help avoid situations such as we currently have, where several incompatible C++ bindings have sprung up. Surely that's a good thing? Vil. (Not speaking for my employer.) -- Vilya Harvey Wilkinson House Mob: +44 961 106 505 Computational Mathematics Group Jordan Hill Road Wk: +44 1865 511 245 NAG Limited Oxford UK OX2 8DR Fax: +44 1865 311 205 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From czinkos at mail.matav.hu Mon Dec 6 15:51:35 1999 From: czinkos at mail.matav.hu (Zsolt Czinkos) Date: Mon Jun 7 17:18:24 2004 Subject: simple XML for C++ application data-file I/O Message-ID: <384BDF42.9BCA2C23@mail.matav.hu> Paul Miller wrote: > The major idea here is you register everything up-front, and > element-specific callbacks get called to deal with specific elements. Hello, With SAXON JAVA API you can define your own element-specific handlers. (Last version I had a look at was 4.5.) Best, Zsolt Czinkos xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Dec 6 15:55:26 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:18:24 2004 Subject: simple XML for C++ application data-file I/O In-Reply-To: Paul Miller's message of "Mon, 06 Dec 1999 09:10:42 -0500" References: <384B04DA.DCD6BAED@fxtech.com> <384BC3E2.E526B322@fxtech.com> Message-ID: Paul Miller writes: > Here is where I have the problem. This leaves an awful lot up to the > application, still, including handling the proper nesting. I would like > to make the actual parsing of elements more "automatic", so when a > certain element is hit, it calls a function with my object pointer where > I can pick up the parsing from there, then drop back out to the > enclosing XML scope and keep going. If you're using Java, then there are already some higher-level toolkits for this sort of thing -- you might want to take a look at SAXON, which is built on top of SAX. > Perhaps what I want to do should be built on SAX instead of expat, > though. That will make sense once we have a common SAX C++ interface for expat, libxml, rxp, xml4c++, and any other C-/C++-based XML parsers. We're not there yet, though. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stele at fxtech.com Mon Dec 6 16:07:35 1999 From: stele at fxtech.com (Paul Miller) Date: Mon Jun 7 17:18:24 2004 Subject: SAX/C++: First interface draft Message-ID: <384BDEFC.FD92DBE0@fxtech.com> > It's not. See eg. > http://msdn.microsoft.com/visualc/stl/faq.htm#Q4 I believe this problem is due to Microsoft's basic_string being a reference-counting implementation. I've seen problems with this myself. > Complaints about this state of the Standard C++ library, are met with > responses on the line of "MSVC++ is not a standard C++ compiler. It's > a Windows compiler". > Quite amazing, really. Indeed, but it *is* possible to write portable code with MSVC++. It just depends on how much Microsoft stuff you drag into your build. One thing you might try is SGI's implementation of the STL (which includes their own version of std::string). I've been using this for years with much success. Download it at http://www.sgi.com/Technology/STL. Alex Stepanov works on it at SGI, so you know it's good. > Yes. That wasn't my problem. My problem was that std::iostreams are > incompatible with Standards (a failing of Standards, I personally don't use MS's iostreams. They are about 6-10 times slower than good ole' stdio, so I wrote my own basic IO streams classes that simply wrap around a FILE *. Much better. -Paul -- Paul Miller - stele@fxtech.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bhall at merrillhall.com Mon Dec 6 16:10:36 1999 From: bhall at merrillhall.com (Ben Hall) Date: Mon Jun 7 17:18:24 2004 Subject: PSGML-1.2.1 problems In-Reply-To: <384BCFC8.25F7721E@esatclear.ie> Message-ID: It appears that you define the role attribute in common.att, xlink-simple.att, common-idreq.att and call more than one of these in some elements. --Ben ==================================== benjamin hall merrill-hall new media, inc bhall@merrillhall.com 404.827.9883 ==================================== > -----Original Message----- > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > Eoin Lane > Sent: Monday, December 06, 1999 10:01 AM > To: xml-dev@ic.ac.uk > Subject: PSGML-1.2.1 problems > > > I'm trying to write a xml doc with emacs configured to use psgml-1.2.1 > but am having some problems. I have checked that psgml works with a > simple dtd. However when I use the dtd (document-v10.dtd) below I get > the following error. > > ~/character.ent line 2 col 12 entity common.att > ~/document-v10.dtd line 218 col 29 entity DOCUMENT > ~/installing.xml line 3 col 51 > Name expected; at: :lang > > I wonder could anyone tell me what I am doing wrong. I know the dtd is > correct because I checked it with IBM 4j parser and it validated. it > would be of great benefit to me if I could use the dtd in emacs so any > help would be greatly appreciated. > > Eoin. > > > -- > > Dr. Eoin Lane > InConn Technologies Ltd. > 17 Washington St. > Cork. > Tel. (021) 271855 Fax (021) 272419 > http://www.inconn.ie > mailto:eoinlane@esatclear.ie > > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From sb at metis.no Mon Dec 6 16:21:28 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:24 2004
Subject: simple XML for C++ application data-file I/O
In-Reply-To: Paul Miller's message of "Mon, 06 Dec 1999 10:30:45 -0500"
References: <384B04DA.DCD6BAED@fxtech.com> <384BCC51.936AA275@fxtech.com> <384BD6A5.24A60C72@fxtech.com>
Message-ID:

>>>>> Paul Miller:

> If there are others interested in what I proposed I'm prepared to
> whip up an implementation this week.

I am interested in something like this, but for me it would have to be something that would take its input as a SAX DocumentHandler.

From jmg at trivida.com Mon Dec 6 16:36:10 1999
From: jmg at trivida.com (Jeff Greif)
Date: Mon Jun 7 17:18:24 2004
Subject: SAX/C++: First interface draft
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <01e701bf4007$da1a2b00$a20010ac@trivida.com>

I think this problem usually can be managed by choice of include path, assuming you have source code for the SAX library.
If the include directory for the ObjectSpace headers is found before the include directory for the MSVC++ standard library when you compile both the SAX library and application code, the problem with broken parts of the MSVC++ library version of iostream should be avoided. But this approach does not work if you want to use someone else's binary of the library.

Jeff

----- Original Message -----
From: Steinar Bang
To:
Sent: Monday, December 06, 1999 1:09 AM
Subject: Re: SAX/C++: First interface draft

> I have a practical problem with using std::istream on the MSVC++
> platform. Since the Standard C++ Library as delivered with MSVC++ 5
> and 6 is broken, we're using Standards from ObjectSpace to
> provide us with the parts of the Standard C++ Library we're using.
>
> And ObjectSpace Standards is not compatible with the Standard
> C++ Library iostreams of MSVC++.

From sb at metis.no Mon Dec 6 16:49:41 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:24 2004
Subject: SAX/C++: First interface draft
In-Reply-To: "Jeff Greif"'s message of "Mon, 6 Dec 1999 08:35:05 -0800"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <01e701bf4007$da1a2b00$a20010ac@trivida.com>
Message-ID:

>>>>> "Jeff Greif" :

> I think this problem usually can be managed by choice of include
> path, assuming you have source code for the SAX library.
> If the include directory for the ObjectSpace headers is found before the
> include directory for the MSVC++ standard library when you compile both
> the SAX library and application code, the problem with broken parts
> of the MSVC++ library version of iostream should be avoided. But
> this approach does not work if you want to use someone else's binary
> of the library.

No, this is not the problem. The problem is that Standards does not work with the standard C++ library iostream implementation of MSVC++. This implementation is in the std:: namespace. Instead it works with the old iostreams which are _not_ in the std:: namespace.

I.e., if I use std::istream, I get the incompatible iostreams.

The problem can _maybe_ be solved by ditching Standards in favour of the STLport version of SGI STL. But suggesting to my coworkers that I spend time on this, when the current Standards setup is working, would be met with incomprehension.

From andrewl at microsoft.com Mon Dec 6 18:09:12 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:18:24 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <33D189919E89D311814C00805F1991F7F4A98C@RED-MSG-08>

Dan Brickley wrote:
> I believe it will be possible to annotate XML schemas with information
> for mapping into (generic or domain specific) application datamodels
> such as RDF. I don't think it is right to expect the hard-pressed XML
> Schema group to define all these mappings within that working group.

I agree. There are probably many ways to express mappings.
One candidate is shown at the end of the "Schemas NG" paper. See http://www.lindamann.com/xml/XML%20Schemas%20NG%20Guide%20HTML.htm, and look for the section titled "Mapping to Other Data Models."

From Daniel.Brickley at bristol.ac.uk Mon Dec 6 19:41:45 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun 7 17:18:25 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: <33D189919E89D311814C00805F1991F7F4A98C@RED-MSG-08>
Message-ID:

On Mon, 6 Dec 1999, Andrew Layman wrote:

> Dan Brickley wrote:
> > I believe it will be possible to annotate XML schemas with information
> > for mapping into (generic or domain specific) application datamodels
> > such as RDF. I don't think it is right to expect the hard-pressed XML
> > Schema group to define all these mappings within that working group.
>
> I agree. There are probably many ways to express mappings. One candidate is
> shown at the end of the "Schemas NG" paper. See
> http://www.lindamann.com/xml/XML%20Schemas%20NG%20Guide%20HTML.htm, and look
> for the section titled "Mapping to Other Data Models."

This is interesting work, though it's unclear quite how it fits in with the Canonical Format / Serializing Graphs paper. The 'Mapping to Other Data Models' section of 'Schemas NG' shows one strategy for annotating schemas to support directed labelled graph interchange in XML. It would be good to see these two strategies drawn together in a single document describing objects'n'properties DLG serialization strategies for XML applications.
By drawn together I mean having a common documented model for the DLG representation rather than informal prose. It is clear by now that the RDF 1.0 Syntax doesn't cut it as the One True Graph Serialization for all XML applications. I don't think anybody expected otherwise, but we now have general consensus [eg. 1] that a more broadly usable DLG exchange syntax is needed by RDF apps. We have two proposals already floated on the RDF Interest Group for alternate DLG-interchange syntaxes [2, 3] and their aims seem to be basically the same as [4,5]: DLG interchange in XML. It is also clear that a lot of (RDF-agnostic) XML data interchange apps want to ship directed labelled graphs around using non-stilted XML syntaxes. I've argued elsewhere [7] that these graphs will often want to use URIs for edge types, node identifiers and node types in all but tightly-coupled closed environments. My hope is that XML-DEV and the RDF Interest Group[6] will come up with implementation-led proposals for XML DLG-interchange that both complement the XML Schema work (for mapping-based proposals) and fit with colloquial (ie. mainstream) XML conventions (for serialisation syntaxes). There's a bunch of interest in an improved syntax for RDF graph serialization, and growing interest in more general XML DLG interchange strategies layered on top of XML + XML Schemas. I have a hard time thinking of these as different problems, hence my wish that the DLG model mentioned in the schemas NG and canonical papers be documented a bit more formally to aid comparison with similar proposals for a better RDF syntax... 
Dan

Refs:
[1] http://www.w3.org/TR/1999/NOTE-schema-arch-19991007 (s3.8)
[2] http://lists.w3.org/Archives/Public/www-rdf-interest/1999Nov/0066.html
[3] http://lists.w3.org/Archives/Public/www-rdf-interest/1999Nov/0100.html
[4] http://www.biztalk.org/Resources/canonical.asp
[5] http://www.lindamann.com/xml/XML%20Schemas%20NG%20Guide%20HTML.htm#_ftn4
[6] http://www.w3.org/RDF/Interest/
[7] http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Dec-1999/0121.html

From clefebvre at advance-groupe.com Mon Dec 6 19:58:30 1999
From: clefebvre at advance-groupe.com (Christophe Lefebvre)
Date: Mon Jun 7 17:18:25 2004
Subject: No subject
References: <14408.2610.245842.199581@localhost.localdomain> <384B8C32.22805488@nag.co.uk>
Message-ID: <384C14CE.57E8A74C@advance-groupe.com>

unsubscribe

From andrewl at microsoft.com Mon Dec 6 20:10:17 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:18:25 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <33D189919E89D311814C00805F1991F7F4A9A5@RED-MSG-08>

Thanks. As a recap: There are, broadly, two approaches to serializing a graph in XML.
One is to invent a meta-grammar, a set of canonicalization rules. That is what RDF syntax did, what the attribute-centric and element-centric canonical format papers do, and what SOAP section eight does. I think of this as "tunnelling the graph through XML." The other is to allow XML documents to follow any pattern described in a schema, and to augment the schema with a set of mapping rules.

There appears to be significant value to each approach. (In particular, however, I disagree with the sometimes-asserted claim that graphs capture the semantics of a communication while grammars do not. Graphs are just another grammar. This makes me reluctant to deprecate grammars.)

I agree that formal approaches to mapping would be helpful. I look forward to reading your papers.

From jefftr at bellsouth.net Mon Dec 6 20:29:10 1999
From: jefftr at bellsouth.net (Jeff Russell)
Date: Mon Jun 7 17:18:25 2004
Subject: GUI XML doc authoring tools
Message-ID: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net>

Anybody know of any Windows GUI (or Linux, as a last resort) XML document authoring tools? Something like SoftQuad's XMeTaL, but that doesn't require a DTD.

Jeff Russell
jefftr@bellsouth.net
From mda at discerning.com Mon Dec 6 20:45:53 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun 7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To:
Message-ID: <3886141114.944483978@MDAXKE>

A danger with adopting the convention that 1 of the 0 or more output parameters is a return value is that it may interfere with a later convention on error handling. I haven't seen that discussed yet in your design principles.

Unlike java or perl, exceptions in C++ are a bit of a land mine, and could also risk destroying any simple interop with a straight C library, either above or below.

Not to mention the fact that there is no standard for cross-language exception raising.

choices seem to be:
- return an error code
- return a boolean success/failure
- use C++ exceptions
- call an error handler and return 0 (which may not get run if the error handler aborts)
- some combination of the above, configurable by the programmer

btw, i'd like to register an objection to reference args. they make code reading a bit of pain because you cannot tell from the call whether a copy constructor is going to be used or not -- you always have to go hunt up the .h. with a pointer arg, it is always clear.

and in regard to the character type question, that is a bit awkward because a key goal for many programmers will be to use the "native" string type used by the parser, which may be just linked in binary -- not recompiled. of course, if we all just use expat, that is solved -- we have to have a SAX/C++ type which directly points to expat's strings.
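To make the call-site point about reference and pointer arguments concrete, here is a minimal sketch. Everything in it is an assumption for illustration: InputSource is a one-field stand-in struct, not the class from the SAX/C++ draft, and fillByReference/fillByPointer are made-up names. It also assumes a C++11 compiler (for nullptr).

```cpp
#include <string>

// Hypothetical stand-in; NOT the InputSource class from the SAX/C++ draft.
struct InputSource {
    std::string systemId;
};

// Reference parameter: the call site reads fillByReference(src), so the
// reader has to check the prototype to learn that src may be modified.
void fillByReference(InputSource& src) {
    src.systemId = "by-reference";
}

// Pointer parameter: the call site reads fillByPointer(&src), which makes
// the possible modification visible at the call. A null pointer can also
// stand for "no argument supplied".
bool fillByPointer(InputSource* src) {
    if (src == nullptr) {
        return false;  // nothing to fill in
    }
    src->systemId = "by-pointer";
    return true;
}
```

A caller would write fillByReference(src) in one case and fillByPointer(&src) in the other; only the second shows at the call that src may change, and only the second can express "no argument" by passing a null pointer.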
-mda

--On Friday, December 03, 1999 8:58 AM -0500 David Megginson wrote:

> James Clark writes:
>
>> That's problematic for EntityResolver::resolveEntity; that requires that
>> ownership of an InputSource be transferred to the caller from the
>> callee.
>>
>> This could be avoided by doing:
>>
>>   virtual const InputSource *
>>   resolveEntity(const char *publicId,
>>                 const char *systemId);
>>
>> instead of:
>>
>>   virtual void
>>   resolveEntity(const char *publicId,
>>                 const char *systemId,
>>                 InputSource &inputSource);
>
> (I'll assume that James accidentally reversed the two). The second
> one is a very good idea -- the only modification I'd make is to add a
> bool return value, so that the parser knows whether the resolver
> actually wants to override:
>
>   virtual bool
>   resolveEntity(const char *publicId,
>                 const char *systemId,
>                 InputSource &inputSource);
>
>
> All the best,
>
>
> David
>
> --
> David Megginson david@megginson.com
> http://www.megginson.com/
From stele at fxtech.com Mon Dec 6 21:07:43 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
References: <3886141114.944483978@MDAXKE>
Message-ID: <384C255C.6419E17@fxtech.com>

> - use C++ exceptions

I vote for C++ exceptions. That is why they are there.

> btw, i'd like to register an objection to reference args. they make
> code reading a bit of pain because you cannot tell from the call whether
> a copy constructor is going to be used or not -- you always have to
> go hunt up the .h. with a pointer arg, it is always clear.

If you always pass by reference, this isn't a problem. In C++, there is almost never a compelling reason to pass objects by value.

Always using references is nice because you can tell by looking at the prototype whether the argument is optional (if it's a pointer, it's optional). If you always use pointers you have to read the documentation to find out.

--
Paul Miller - stele@fxtech.com
From dhunter at Mobility.com Mon Dec 6 22:25:56 1999
From: dhunter at Mobility.com (Hunter, David)
Date: Mon Jun 7 17:18:25 2004
Subject: A question on nomenclature
Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC0170@cc20exch2.mobility.com>

A simple question. What is that?

My choices so far:
-an "application of XML", or possibly just "application", although this would cause confusion with "application" as defined in the spec.
-a "vocabulary" (the one I personally use, although I may change after this thread...)
-a "grammar"

Keep in mind I'm talking about the "structure" there, not the "instance" of that "structure". (I want to describe the "class", not the "object".) I have a feeling that there isn't a real consensus anywhere, and that different people are using different names. (Are there any others? Do people use "format", or something along those lines? Or "class"?)

It's not something that I would ever have to worry about when using XML in my applications, but if I were to, oh, I don't know, write a book about XML, I'd want to create as little confusion as possible, so would I be safe in calling the structure I created a "vocabulary"? Do things get hairier if we get into formats documented in DTDs/Schemas, and documents with no DTD or Schema, or does the nomenclature stay the same?

Any thoughts or opinions would be appreciated. Any documentation that I've missed which states emphatically "this is what you would call it" would be even more appreciated, but I don't think it's out there...

David Hunter
MobileQ
david.hunter@mobileq.com
http://www.MobileQ.com
From mda at discerning.com Mon Dec 6 22:19:11 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun 7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: <384C255C.6419E17@fxtech.com>
Message-ID: <3891840459.944489678@MDAXKE>

>> - use C++ exceptions
>
> I vote for C++ exceptions. That is why they are there.

Someone should dictate whether the exception objects are raised, or pointers to them. Regardless, it is impossible for mere mortals to use them without having leaks when they occur below constructors and destructors. But I guess anyone using C/C++ already knows they are taking such risks.

Don't get me wrong; i like exceptions in programming languages that support them well.

I'm a little confused by the intent of the draft header, where there is a SAXParseException class which is an argument to a handler. Seems like if it is a native C++ exception, then the caller takes care of catching it, not registering a handler.

I also wonder whether a handler (error handler or any other, like document) is supposed to be able to call back into the Parser and tell it to clean up.

I also might note that the current exception class appears to have no member data indicating which parser or inputsource object is in use, which would be an issue with a multi-threaded implementation, or even a single-threaded one with multiple top-level instances.

>
>> btw, i'd like to register an objection to reference args. they make
>> code reading a bit of pain because you cannot tell from the call whether
>> a copy constructor is going to be used or not -- you always have to
>> go hunt up the .h.
>> with a pointer arg, it is always clear.
>
> If you always pass by reference, this isn't a problem. In C++, there is
> almost never a compelling reason to pass objects by value.

Agreed. I guess it comes down to how much you trust other programmers. If you trust them, then using pointerhood to encode optionality might be useful. I guess I'm just too often forced to deal with C++ aficionados who love nothing more than hiding several automatic class methods and casts in every argument value.

-mda

From mlepage at antimeta.com Mon Dec 6 23:01:16 1999
From: mlepage at antimeta.com (mlepage@antimeta.com)
Date: Mon Jun 7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: <3891840459.944489678@MDAXKE>; from mda@discerning.com on Mon, Dec 06, 1999 at 02:14:38PM -0800
References: <384C255C.6419E17@fxtech.com> <3891840459.944489678@MDAXKE>
Message-ID: <19991206180059.A24592@antimeta.com>

On Mon, Dec 06, 1999 at 02:14:38PM -0800, Mark D. Anderson wrote:
>
> >> - use C++ exceptions
> >
> > I vote for C++ exceptions. That is why they are there.
>
> Someone should dictate whether the exception objects are raised,
> or pointers to them. Regardless, it is impossible for mere mortals
> to use them without having leaks when they occur below constructors
> and destructors. But I guess anyone using C/C++ already knows they
> are taking such risks.
>
> Don't get me wrong; i like exceptions in programming languages that
> support them well.

In C++, you throw exceptions by value and catch them by reference (see Meyers for details).
So the exceptions themselves don't leak.

Since fully constructed objects are destructed during stack unwinding, there are no leaks there.

If you are doing things using pointers, etc., where allocated resources are not automatically freed (i.e. the pointer is freed but not the pointee), then yes you risk memory leaks. However, you should be using auto_ptr and other helpers to avoid that problem. This technique is discussed at length in Stroustrup and Meyers. So assuming you are using the helpers made available for you, properly, there are no memory leaks there.

Sutter's new book Exceptional C++ is just out, and details even more regarding exception safety, I presume.

--
Marc Lepage (aka SEGV)
http://www.antimeta.com/
RTS game programming info, Minion open source game, etc.

From mda at discerning.com Tue Dec 7 00:08:11 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun 7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: <19991206180059.A24592@antimeta.com>
Message-ID: <3898269654.944496107@MDAXKE>

I'm familiar with the solutions; it is just that i'd rather not have to trust that every other C++ programmer has memorized Meyers. Nor would I want to impose smart pointers, which are imho right up there with vector and string classes in ways-to-learn-to-hate-c++.

if in fact SAX decides to support C++ exceptions rather than an error handler, it would probably help to just give some examples that clarify correct usage of the exception classes, for the non-cognoscenti.
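A minimal sketch of the throw-by-value, catch-by-reference usage discussed in this thread. The names are assumptions for illustration only: ParseError is a hypothetical exception type, not the SAXParseException from the draft header, and Buffer, parseChunk and tryParse are made up. Where the thread mentions auto_ptr, the sketch uses std::unique_ptr, its C++11 replacement, so that it compiles with current compilers.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical exception type; NOT the class from the SAX/C++ draft.
class ParseError : public std::runtime_error {
public:
    ParseError(const std::string& message, int line)
        : std::runtime_error(message), line_(line) {}
    int line() const { return line_; }
private:
    int line_;
};

// Counts live Buffer objects so that a leak would be observable.
struct Buffer {
    static int live;
    Buffer() { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Hypothetical parse step: owns a heap Buffer through a smart pointer.
// If it throws, stack unwinding destroys buf, which deletes the Buffer.
void parseChunk(bool fail) {
    std::unique_ptr<Buffer> buf(new Buffer);
    if (fail) {
        throw ParseError("unexpected end of input", 3);  // thrown by value
    }
}

// Returns the line number carried by the exception, or 0 on success.
int tryParse(bool fail) {
    try {
        parseChunk(fail);
        return 0;
    } catch (const ParseError& e) {  // caught by reference: no slicing
        return e.line();
    }
}
```

After tryParse(true) returns, Buffer::live is back to 0: the owned Buffer is released during unwinding without any explicit cleanup code in the catch block.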
-mda

--On Monday, December 06, 1999 6:00 PM -0500 mlepage@antimeta.com wrote:

> On Mon, Dec 06, 1999 at 02:14:38PM -0800, Mark D. Anderson wrote:
>>
>> >> - use C++ exceptions
>> >
>> > I vote for C++ exceptions. That is why they are there.
>>
>> Someone should dictate whether the exception objects are raised,
>> or pointers to them. Regardless, it is impossible for mere mortals
>> to use them without having leaks when they occur below constructors
>> and destructors. But I guess anyone using C/C++ already knows they
>> are taking such risks.
>>
>> Don't get me wrong; i like exceptions in programming languages that
>> support them well.
>
> In C++, you throw exceptions by value and catch them by reference (see Meyers for details). So the exceptions themselves don't leak.
>
> Since fully constructed objects are destructed during stack unwinding, there are no leaks there.
>
> If you are doing things using pointers, etc., where allocated resources are not automatically freed (i.e. the pointer is freed but not the pointee), then yes you risk memory leaks. However, you should be using auto_ptr and other helpers to avoid that problem. This technique is discussed at length in Stroustrup and Meyers. So assuming you are using the helpers made available for you, properly, there are no memory leaks there.
>
> Sutter's new book Exceptional C++ is just out, and details even more regarding exception safety, I presume.
From jefftr at bellsouth.net Tue Dec 7 01:45:17 1999
From: jefftr at bellsouth.net (Jeff Russell)
Date: Mon Jun 7 17:18:25 2004
Subject: A question on nomenclature
In-Reply-To: <805C62F55FFAD1118D0800805FBB428D02BC0170@cc20exch2.mobility.com>
Message-ID: <000a01bf4054$57347770$90fc4dd8@bhm.bellsouth.net>

Different people are describing it different ways. An "application of XML" would be generic enough. "Application" would be one step higher, and so also technically correct. "Format" would describe a particular document or specific DTD/schema. Grammar or syntax is what the XML spec describes. A vocabulary might be the specific "proprietary" set of tags used in a given document, DTD, or schema. "Class" is a technical word from XSL and CSS.

Jeff Russell

|-----Original Message-----
|From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
|Hunter, David
|Sent: Monday, December 06, 1999 4:26 PM
|To: 'XML-dev'
|Subject: A question on nomenclature
|
|
|
|
|
|
|
|
|A simple question. What is that?
|
|My choices so far:
|-an "application of XML", or possibly just "application", although this
|would cause confusion with "application" as defined in the spec.
|-a "vocabulary" (the one I personally use, although I may change after this
|thread...)
|-a "grammar"
|
|Keep in mind I'm talking about the "structure" there, not the "instance" of
|that "structure". (I want to describe the "class", not the "object".) I
|have a feeling that there isn't a real consensus anywhere, and that
|different people are using different names. (Are there any others? Do
|people use "format", or something along those lines?
|Or "class"?)
|
|It's not something that I would ever have to worry about when using XML in
|my applications, but if I were to, oh, I don't know, write a book about XML,
|I'd want to create as little confusion as possible, so would I be safe in
|calling the structure I created a "vocabulary"? Do things get hairier if we
|get into formats documented in DTDs/Schemas, and documents with no DTD or
|Schema, or does the nomenclature stay the same?
|
|Any thoughts or opinions would be appreciated. Any documentation that I've
|missed which states emphatically "this is what you would call it" would be
|even more appreciated, but I don't think it's out there...
|
|David Hunter
|MobileQ
|david.hunter@mobileq.com
|http://www.MobileQ.com

From jtauber at jtauber.com Tue Dec 7 05:03:14 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:18:25 2004
Subject: A question on nomenclature
References: <805C62F55FFAD1118D0800805FBB428D02BC0170@cc20exch2.mobility.com>
Message-ID: <016501bf4070$415d1f80$0300000a@cygnus.uwa.edu.au>

>
>
>
>
>
>
> A simple question. What is that?
>
> My choices so far:
> -an "application of XML", or possibly just "application", although this
> would cause confusion with "application" as defined in the spec.
> -a "vocabulary" (the one I personally use, although I may change after this
> thread...)
> -a "grammar"

It's a "schema". It happens to be in a particular schema language that is kind of "schema-by-example" but it is a schema nevertheless.

Schemas are grammars so it's a "grammar" too.

I tend to use the term "vocabulary" to mean a set of element type (and their attribute) names not necessarily with defined content specifications. However, many people use "vocabulary" to mean the same as "schema" and "grammar".

My AUD0.02

James Tauber

From sb at metis.no Tue Dec 7 07:30:01 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: Paul Miller's message of "Mon, 06 Dec 1999 16:06:36 -0500"
References: <3886141114.944483978@MDAXKE> <384C255C.6419E17@fxtech.com>
Message-ID:

>>>>> Paul Miller :

>> - use C++ exceptions

> I vote for C++ exceptions. That is why they are there.

Personally I think that C++ exceptions should only be used to signal critical situations for the future execution of the program, not as a normal matter of program flow control.

However, I think the ErrorHandler is a good idea, and have no problems with exceptions being thrown for the cases where one hasn't defined and registered an ErrorHandler.
From sb at metis.no Tue Dec 7 07:27:43 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: "Mark D. Anderson"'s message of "Mon, 06 Dec 1999 12:39:38 -0800"
References: <3886141114.944483978@MDAXKE>
Message-ID:

>>>>> "Mark D. Anderson" :

> Unlike java or perl, exceptions in C++ are a bit of a land mine,
[snip!]

See Items 9 through 15, and in particular Item 15 "Understand the costs of exception handling", in Scott Meyers' "More Effective C++" http://www.awl.com/cseng/titles/0-201-63371-X/ for more detail on this.

> Not to mention the fact that there is no standard for cross-language
> exception raising.

> choices seem to be:
> - return an error code
> - return a boolean success/failure
> - use C++ exceptions
> - call an error handler and return 0 (which may not get run if the
> error handler aborts)
> - some combination of the above, configurable by the programmer

Personally I'm partial to allocating the "return value" in the caller, giving a reference argument to this value and returning a status code, rather than returning the value itself, e.g.

    bool getValue(int index, string& value);

rather than

    const string& getValue(int index);

The syntax is more clumsy, but the memory management is easier (I'm also partial to allocating objects on the stack in the caller, rather than doing new).
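A minimal sketch of the status-code style with a caller-allocated result, as in the getValue prototype above. ValueList is a hypothetical container invented for this illustration; it is not part of any SAX/C++ draft, and leaving value untouched on failure is a choice made for the sketch, not something the thread specifies.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for illustration; NOT part of any SAX/C++ draft.
class ValueList {
public:
    void add(const std::string& v) { values_.push_back(v); }

    // Status-code style: the caller owns `value`, and the return value
    // only reports whether `index` was valid. On failure `value` is left
    // untouched (an assumption of this sketch).
    bool getValue(int index, std::string& value) const {
        if (index < 0 ||
            static_cast<std::size_t>(index) >= values_.size()) {
            return false;
        }
        value = values_[static_cast<std::size_t>(index)];
        return true;
    }

private:
    std::vector<std::string> values_;
};
```

The caller declares a std::string on its own stack and passes it in, so no ownership has to be transferred and an out-of-range index is reported through the return value instead of through a dangling reference or an exception.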
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at garshol.priv.no Tue Dec 7 08:23:12 1999 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Mon Jun 7 17:18:25 2004 Subject: GUI XML doc authoring tools In-Reply-To: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net> References: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net> Message-ID: * Jeff Russell | | Anybody know of any Windows GUI (or Linux, as a last resort) XML | document authoring tools? Something like SoftQuad's XMeTaL, but that | doesn't require a DTD. You can find a list of free XML editors for all platforms at --Lars M. From sb at metis.no Tue Dec 7 08:41:50 1999 From: sb at metis.no (Steinar Bang) Date: Mon Jun 7 17:18:25 2004 Subject: SAX and <!DOCTYPE> Message-ID: Is there something in SAX that gives the information in the <!DOCTYPE> declaration to the application? I've scratched my head over DTDHandler http://www.megginson.com/SAX/javadoc/org.xml.sax.DTDHandler.html but I haven't found anything that looks like it there. I've looked at the source code for XMLNorm/XMLWriter, and it doesn't look like it is anywhere there. The top level element is used as the name. 
But when looking at the original announcement for XMLNorm: http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Jul-1999/0346.html there's a link to an XML document that has a DOCTYPE with both a PUBLIC identifier and a SYSTEM identifier http://home.sprynet.com/~dmeggins/texts/darkness/darkness.xml Or maybe the above document is _before_ processing with XMLNorm...? Hm... searches on the net also gave me some discussions from January 1998, which seem to indicate that a <!DOCTYPE> declaration isn't good enough as a document type identifier, but there didn't seem to be any conclusions (at least I didn't find them). What I need the document type for is to set the appropriate DocumentHandler in the parser. I'm assuming that this is something others would like to do. If there is no DOCTYPE information available, how is it done? From larsga at garshol.priv.no Tue Dec 7 09:46:48 1999 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Mon Jun 7 17:18:25 2004 Subject: SAX and <!DOCTYPE> In-Reply-To: References: Message-ID: * Steinar Bang | | Is there something in SAX that gives the information in the | <!DOCTYPE> declaration to the application? Not in 1.0, as this was considered lexical rather than logical information. (It's optional in the infoset WD.) | I've scratched my head over DTDHandler | http://www.megginson.com/SAX/javadoc/org.xml.sax.DTDHandler.html | but I haven't found anything that looks like it there. DTDHandler serves a very narrow purpose: to pass on to the application exactly what the XML rec requires processors to pass on. 
SAX 2.0, however, does have this in the LexicalHandler.startDTD callback. | What I need the document type for, is to set the appropriate | DocumentHandler in the parser. I'm assuming that this is something | others would like to do. If there is no DOCTYPE information | available, how is it done? It depends on the situation. The XSA client, which needs to accept both XSA and OSD documents but can't tell them apart before parsing begins, uses a DispatchingDocHandler, which has a hash of DocumentHandlers keyed on the name of the document element. In this very restricted case that worked just fine. In other cases one might perhaps key on the namespace of the document element, and with SAX 2 one could use the public identifier of the DOCTYPE declaration. --Lars M. From ht at cogsci.ed.ac.uk Tue Dec 7 13:37:22 1999 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 17:18:25 2004 Subject: GUI XML doc authoring tools In-Reply-To: "Jeff Russell"'s message of "Mon, 6 Dec 1999 14:26:48 -0600" References: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net> Message-ID: "Jeff Russell" writes: > Anybody know of any Windows GUI (or Linux, as a last resort) XML document > authoring tools? Something like SoftQuad's XMeTaL, but that doesn't require a > DTD. I like XED [1], but that's not too surprising: I wrote it :-). ht [1] http://www.ltg.ed.ac.uk/~ht/xed.html -- Henry S. 
Thompson, HCRC Language Technology Group, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440 Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk URL: http://www.ltg.ed.ac.uk/~ht/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stele at fxtech.com Tue Dec 7 16:38:39 1999 From: stele at fxtech.com (Paul Miller) Date: Mon Jun 7 17:18:25 2004 Subject: nestable C/C++ XML parser? Message-ID: <384D3843.F3F7AFB9@fxtech.com> I'm trying to develop a tag-based front-end to expat and having no luck. I'd like to be able to parse an XML document in nestable chunks, by calling into a nestable parser. In other words, I'd like to start parsing, then branch to a function to handle a specific element, parsing in there until that element is closed, then fall back out of the function to continue parsing the rest of the document. Something like this: ParseDocument (call HandleFoo when Foo element is found) HandleFoo() ParseFoo // do something with Foo stuff here FinishParseDocument Has anyone seen such a beast? -- Paul Miller - stele@fxtech.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at garshol.priv.no Tue Dec 7 17:11:17 1999 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Mon Jun 7 17:18:25 2004 Subject: nestable C/C++ XML parser? In-Reply-To: <384D3843.F3F7AFB9@fxtech.com> References: <384D3843.F3F7AFB9@fxtech.com> Message-ID: * Paul Miller | | In other words, I'd like to start parsing, then branch to a function | to handle a specific element, parsing in there until that element is | closed, then fall back out of the function to continue parsing the | rest of the document. More people than you have been asking for this, but this is quite simply not the way XML is meant to work. XML is a standardized syntax, and because of that it makes no sense to let application developers do part of the parsing, since they are likely to get parts of it wrong and since the syntax is standardized there is no reason not to let the parser handle it for you. (You would in any case only duplicate its standard-decreed way of parsing.) The only application I see for this sort of thing is to be able to work around XML syntax rules, but once you do that your document is no longer an XML document and you shouldn't pretend that it is, not even to yourself. (Imagine what happens when an XML repository, XML editor, XML browser or an XSLT engine tries to work with your "XML" document.) In other words, when you find yourself doing this you should very likely explain why to experienced XML developers and then ask them how one usually handles this sort of thing _within_ XML, or else abandon any pretense of using XML entirely. --Lars M. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From schen at falconwing.com Tue Dec 7 17:19:32 1999 From: schen at falconwing.com (Sean Chen) Date: Mon Jun 7 17:18:25 2004 Subject: GUI XML doc authoring tools In-Reply-To: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net> Message-ID: Hi Jeff, everyone, On Mon, 6 Dec 1999, Jeff Russell wrote: > Anybody know of any Windows GUI (or Linux, as a last resort) XML document > authoring tools? Something like SoftQuad's XMeTaL, but that doesn't require a > DTD. You can try my Java-based Athame XML editor, which is currently in the early stages of development. Its main feature is XSLT support using James Clark's XT. I've used it to write a couple hundred pages of courseware but it's rough around the edges. http://falconwing.com/~schen/ It comes bundled with the DocBk XML and XSLT stylesheets. . . . Sean. From Toby.Speight at streapadair.freeserve.co.uk Tue Dec 7 17:25:35 1999 From: Toby.Speight at streapadair.freeserve.co.uk (Toby Speight) Date: Mon Jun 7 17:18:25 2004 Subject: nestable C/C++ XML parser? 
In-Reply-To: Lars Marius Garshol's message of "07 Dec 1999 18:11:08 +0100" References: <384D3843.F3F7AFB9@fxtech.com> Message-ID: Lars> Lars Marius Garshol 0> In article , Lars wrote: Lars> The only application I see for this sort of thing is to be able Lars> to work around XML syntax rules, I see a demand for parsing a document with SAX, but using some start-tags to switch to building DOM (or DOM-like) objects, returning to stream-oriented processing afterwards. Perhaps you have a large "set" or "list", and you know that the members of that collection can be processed independently - why waste memory on a complete DOM for that? Lars> but once you do that your document is no longer an XML document Lars> and you shouldn't pretend that it is, not even to yourself. This bit I agree with. -- From greynolds at datalogics.com Tue Dec 7 17:26:48 1999 From: greynolds at datalogics.com (Reynolds, Gregg) Date: Mon Jun 7 17:18:25 2004 Subject: A question on nomenclature Message-ID: <51ED3F5356D8D011A0B1006097C3073401B1702F@martinique> > -----Original Message----- > From: Hunter, David [mailto:dhunter@Mobility.com] > Sent: Monday, December 06, 1999 4:26 PM > > > > > > > > A simple question. What is that? > It's a string. If you view it from the perspective of formal grammar, then it's a sentence in any language whose grammar defines it as such; an infinite number of such grammars are definable using XML DTDs. But it is also a sentence in a language whose grammar stipulates that all sentences sandwich "dd" between any two strings. 
Plus an infinite number of other languages, including the one whose only legal sentence is just that string. > My choices so far: > -an "application of XML", or possibly just "application", > although this > would cause confusion with "application" as defined in the spec. Right. "XML Application" is marketing weaselspeak. There are no XML applications, only languages (grammars) defined using XML. (Ever hear of an "SQL application"?) > -a "vocabulary" (the one I personally use, although I may > change after this > thread...) Makes a certain intuitive sense, but I'd say vocab is better left to mean the words instead of the sentences - i.e., it's tied up with the concepts being modeled, in this case various kinds of names. > -a "grammar" > Nope. Grammars is rules. What you've written doesn't express any rules; you've got to have a metalanguage to have a grammar, too. > Keep in mind I'm talking about the "structure" there, not the > "instance" of > that "structure". (I want to describe the "class", not the > "object".) I Not sure what you mean. I take it you're after the structural "interpretation", as it were, of the instance. > I'd want to create as little confusion as possible, so would > I be safe in > calling the structure I created a "vocabulary"? Do things I think you'd run into trouble eventually, since one generally uses tagnames with a recognizable meaning in ordinary discourse. So you'd end up with "register confusion": uncertainty about when "vocab" means formal grammatical structures, and when it means the semantic realities being modeled by those structures. > > Any thoughts or opinions would be appreciated. Any > documentation that I've > missed which states emphatically "this is what you would call > it" would be > even more appreciated, but I don't think it's out there... 
Assuming you're interested in Truth and Clarity, I'd look in the section on formal languages and mathematical logic, and avoid industry-generated stuff, which tends to be rather solipsistic. Stoy's classic "Denotational Semantics" (you can get it through Amazon etc.) is very helpful in clarifying the relationship between syntax and "meaning". Also try Spivey's ZRM (http://spivey.oriel.ox.ac.uk/~mike/zrm/). Neither of these directly deals with XML, but XML is a specific case of a more general phenomenon; reading those two works in particular was a huge help for me at least in understanding the foundations of XMLdom. Caveat: when you hear the word "architecture", reach for your revolver. -gregg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rhelton at rhythms.net Tue Dec 7 17:43:48 1999 From: rhelton at rhythms.net (rhelton@rhythms.net) Date: Mon Jun 7 17:18:25 2004 Subject: No subject Message-ID: <916BA3451A99D2118FCC0090272ABD2F031073C7@CAXIXI> unsubscribe --Rich Helton-- Rhythms EAI Architecture ext 2913 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rhelton at rhythms.net Tue Dec 7 17:46:00 1999 From: rhelton at rhythms.net (rhelton@rhythms.net) Date: Mon Jun 7 17:18:25 2004 Subject: No subject Message-ID: <916BA3451A99D2118FCC0090272ABD2F031073C9@CAXIXI> unsubscribe --Rich Helton-- Rhythms EAI Architecture ext 2913 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at garshol.priv.no Tue Dec 7 17:53:26 1999 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Mon Jun 7 17:18:25 2004 Subject: nestable C/C++ XML parser? In-Reply-To: References: <384D3843.F3F7AFB9@fxtech.com> Message-ID: * Lars Marius Garshol | | The only application I see for this sort of thing is to be able to | work around XML syntax rules, * Toby Speight | | I see a demand for parsing a document with SAX, but using some | start-tags to switch to building DOM (or DOM-like) objects, returning | to stream-oriented processing afterwards. Sure, I too see a need for this, and I've even implemented it. However, this is something completely different from doing parsing on behalf of the parser. Parsing is turning a stream of bytes (or characters) into something higher-level, but this is not what you are talking about. 
As far as I understood him, the original poster wanted to do the parsing (that is, the reading and interpretation of bytes/chars) on behalf of expat. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wunder at infoseek.com Tue Dec 7 17:55:12 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:18:25 2004 Subject: A processing instruction for robots In-Reply-To: <001201bf3e1f$074601c0$099918d1@docuverse1> References: <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com> Message-ID: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com> At 10:12 PM 12/3/99 -0800, Don Park wrote: >Walter, > >Could you elaborate your decision to use PI rather than >element(s)? Lars did a pretty good job, but I'll elaborate anyway. This is information for a specific kind of XML processor (an indexing robot), but it is not specific to the document type. So we need a mechanism that applies to any XML document and can be automatically ignored by non-robot processors. A PI is an exact fit. Even the name is right -- it is an instruction to the robot about how to process it. The alternative, adding an element to every DTD in the universe, with the corresponding breakage to every processor that reads those DTDs, is just too awful to contemplate. wunder -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/ http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. 
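To illustrate the point that a PI "can be automatically ignored by non-robot processors", here is a minimal sketch using Python's standard xml.sax module (picked only because it is short to run; the same callback exists in the Java SAX interfaces). It assumes the PI takes the form <?robots index="..." follow="..."?>, which is the form matched by the regex posted in this thread; RobotHandler and DOC are made-up names for the example, not part of any proposal.

```python
import xml.sax

class RobotHandler(xml.sax.ContentHandler):
    """Hypothetical robot-side handler: remembers the robots PI if present."""

    def __init__(self):
        super().__init__()
        self.robots = None

    def processingInstruction(self, target, data):
        # Only react to our own target; other PIs are none of our business.
        if target == "robots":
            self.robots = data

# Assumed PI syntax; the document type itself is irrelevant.
DOC = b'<?xml version="1.0"?><?robots index="yes" follow="no"?><doc>text</doc>'

robot = RobotHandler()
xml.sax.parseString(DOC, robot)

# A processor that knows nothing about robots: the default handler's
# processingInstruction() does nothing, so the PI is skipped silently.
plain = xml.sax.ContentHandler()
xml.sax.parseString(DOC, plain)

print(robot.robots)
```

The robot sees the PI data as one string and the other processor needs no change at all, which is the property an extra element in every DTD would not give you.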
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wunder at infoseek.com Tue Dec 7 18:10:49 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:18:25 2004 Subject: A processing instruction for robots In-Reply-To: References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> Message-ID: <3.0.5.32.19991207101003.00bfc100@corp.infoseek.com> At 10:25 AM 12/6/99 +0100, Lars Marius Garshol wrote: > >First thought: this is fine for very simple uses, but for more complex >uses something along the lines of the robots.txt file would be very >nice. How about a variant PI that can point to a robots.rdf resource? Two reasons, one based on keeping it very simple for authors, and one on keeping it simple for robot implementors. In our experience, the simple form covers almost all needs. We have 1000+ customers, and only three or four of them use our selective indexing support. So, I think of the robots meta tag as a proven solution that doesn't need major improvement. Secondly, fetching two or more entities for one document makes the robot code much more complex. If the robots.rdf file gets a 404, what happens? What about a 401 or a timeout? The robot may need separate last-modified dates and revisit times for each entity. And after it is implemented and tested, how do you explain all that to customers who just want search results? wunder -- Walter R. 
Underwood Senior Staff Engineer Infoseek Software GO Network, part of The Walt Disney Company wunder@infoseek.com http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From vilya at nag.co.uk Tue Dec 7 18:17:00 1999 From: vilya at nag.co.uk (Vilya Harvey) Date: Mon Jun 7 17:18:25 2004 Subject: nestable C/C++ XML parser? References: <384D3843.F3F7AFB9@fxtech.com> Message-ID: <384D4F32.F1CC0BC0@nag.co.uk> Lars Marius Garshol wrote: > > * Lars Marius Garshol > | > | The only application I see for this sort of thing is to be able to > | work around XML syntax rules, > > * Toby Speight > | > | I see a demand for parsing a document with SAX, but using some > | start-tags to switch to building DOM (or DOM-like) objects, returning > | to stream-oriented processing afterwards. > > Sure, I too see a need for this, and I've even implemented it. > However, this is something completely different from doing parsing on > behalf of the parser. Parsing is turning a stream of bytes (or > characters) into something higher-level, but this is not what you are > talking about. Not exactly right. Parsing deals with a sequence of *tokens*; in the programming world these tokens are usually the result of lexical analysis of a sequence of characters, but they don't *have* to be. The tokens in question could be XML entities, for example... > As far as I understood him, the original poster wanted to do the > parsing (that is, the reading and interpretation of bytes/chars) on > behalf of expat. 
I think there has been some miscommunication due to the fact that there are really two distinct levels of parsing that can take place with XML. There is the parsing which turns a sequence of characters in some encoding into a particular XML entity or sequence of entities; and then there is the parsing which interprets a sequence of XML tokens to derive some application- or domain-specific meaning. I suspect it may have been the second type of parsing that the original poster was referring to. Bye, Vil. (Not speaking for my employer.) -- Vilya Harvey Wilkinson House Mob: +44 961 106 505 Computational Mathematics Group Jordan Hill Road Wk: +44 1865 511 245 NAG Limited Oxford UK OX2 8DR Fax: +44 1865 311 205 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wunder at infoseek.com Tue Dec 7 18:15:53 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:18:25 2004 Subject: A processing instruction for robots In-Reply-To: References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> Message-ID: <3.0.5.32.19991207101517.00b528f0@corp.infoseek.com> At 10:25 AM 12/6/99 +0100, Lars Marius Garshol wrote: > >Second thought: "and the index attribute must be first". This is nice >for implementors, but is likely to clash with the expectations of >users and the cost of more generality is very low for implementors. I'm open to changing this, but I thought I would start with the most strict version. The advantage of the strict version is that it doesn't need to be parsed. 
The Desperate Perl Hacker can do four regex compares for the four variants and get back to work. Maybe folks who've worked with authors on SGML systems have some relevant experience. Is this too strict for folks that aren't tamed by computers? wunder -- Walter R. Underwood Senior Staff Engineer Infoseek Software GO Network, part of The Walt Disney Company wunder@infoseek.com http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 From stele at fxtech.com Tue Dec 7 18:19:21 1999 From: stele at fxtech.com (Paul Miller) Date: Mon Jun 7 17:18:26 2004 Subject: nestable C/C++ XML parser? References: <384D3843.F3F7AFB9@fxtech.com> Message-ID: <384D4FE7.C81F2CCE@fxtech.com> > Sure, I too see a need for this, and I've even implemented it. > However, this is something completely different from doing parsing on > behalf of the parser. Parsing is turning a stream of bytes (or > characters) into something higher-level, but this is not what you are > talking about. > > As far as I understood him, the original poster wanted to do the > parsing (that is, the reading and interpretation of bytes/chars) on > behalf of expat. This is more or less correct. I want to use XML as an application data file format. Why? Two primary reasons: 1. I don't need/want to invent a new syntax - I like XML just fine and it handles object-oriented nesting of data quite nicely 2. 
I can publish a DTD and make it easier for my end-users to use my application data in their own applications (I work in special effects applications, and certain high-end customers like to use my data in their own custom tools) without doing a lot of hand-holding. Whether this constitutes a "good enough" reason to use XML I don't know. The primary use of XML seems to be web-oriented e-commerce stuff, which I don't give a hoot about (I'll leave that stuff to the web experts). Given my needs, I know the data in the XML file, and I know what to do with it once I get to it. But I *do not* want to go with the huge complexity of DOM. I've indicated in a previous thread the kind of API I'd like for accessing the data. I was hoping expat would let me do nested parsing, but it doesn't. Frankly, for this kind of application file format stuff, validation and namespaces probably aren't really necessary, but I want to use the XML syntax mostly because it's well defined. This means I'll probably have to implement my own restartable low-level "parser" which just deals with the syntax and the basics. I was hoping to layer on expat, for no other reason than to gain free validation once expat gets it, but considering my needs this probably isn't necessary, and it's just a bit more work for me. From this (and other discussions) it looks like this type of XML parser for application data would be generally useful (in the C/C++ community), so I'll be sure to make my efforts available. -- Paul Miller - stele@fxtech.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dhunter at Mobility.com Tue Dec 7 18:22:29 1999 From: dhunter at Mobility.com (Hunter, David) Date: Mon Jun 7 17:18:26 2004 Subject: A question on nomenclature Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC017D@cc20exch2.mobility.com> From: Reynolds, Gregg [mailto:greynolds@datalogics.com] Sent: Tuesday, December 07, 1999 12:26 PM > > Keep in mind I'm talking about the "structure" there, not the > > "instance" of > > that "structure". (I want to describe the "class", not the > > "object".) > > Not sure what you mean. I take it you're after the structural > "interpretation", as it were, of the instance. Sorry, I'll try to state my case better. :-) I mean that I want to describe a "class" of XML documents, in which the root element will be , and will have three child elements, , , and . I'm describing that class by writing this: (James Tauber described this as a "schema-by-example") but what I really want is the name that I would call that "class" of XML documents. I may end up with an "instance" of that class, which happens to look exactly like that thing above, because it has no information for , , or , but I don't care about that. I'm still leaning toward "vocabulary", because that still seems to describe it best, but I'm still open too. (I think "schema" is probably correct for what I'm trying to do as well, but that would confuse readers with XML Schemas, which are just one type of "schema description language"...) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ebohlman at netcom.com Tue Dec 7 18:45:55 1999 From: ebohlman at netcom.com (Eric Bohlman) Date: Mon Jun 7 17:18:26 2004 Subject: nestable C/C++ XML parser? In-Reply-To: Message-ID: On 7 Dec 1999, Lars Marius Garshol wrote: > Sure, I too see a need for this, and I've even implemented it. > However, this is something completely different from doing parsing on > behalf of the parser. Parsing is turning a stream of bytes (or > characters) into something higher-level, but this is not what you are > talking about. > > As far as I understood him, the original poster wanted to do the > parsing (that is, the reading and interpretation of bytes/chars) on > behalf of expat. I've got a hunch that what he really wanted to do was "pull" the higher-level somethings rather than have them "pushed" at him, i.e. call a function to get the next something rather than have the parser make a callback, presumably because he needs to maintain some state and he'd like to do it via flow-of-control rather than setting and testing state variables. If that's the case, he'd be better off building a wrapper that would feed the input incrementally to expat and buffer up events, with the whole thing driven by a "get next token" function that would return something similar to a line of ESIS data. xml-dev: A list for W3C XML Developers. 
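The wrapper described above (feed input to expat incrementally, buffer the events, drive it all from a "get next token" call) might look roughly like the sketch below. It uses Python's standard binding to expat purely to keep it short; a C or C++ version would follow the same shape around expat's own incremental parse call. The function name pull_events and the event tuple layout are assumptions made for this example, and the tuples are only loosely ESIS-like.

```python
import xml.parsers.expat
from collections import deque

def pull_events(chunks):
    """Yield (kind, value, attrs) tuples one at a time.

    `chunks` is any iterable of strings holding consecutive pieces of the
    document. Callbacks only append to a queue; the caller pulls from it.
    """
    pending = deque()
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = (
        lambda name, attrs: pending.append(("start", name, attrs)))
    parser.EndElementHandler = (
        lambda name: pending.append(("end", name, None)))
    parser.CharacterDataHandler = (
        lambda text: pending.append(("text", text, None)))

    for chunk in chunks:
        parser.Parse(chunk, False)      # more input may follow
        while pending:
            yield pending.popleft()
    parser.Parse("", True)              # signal end of input
    while pending:
        yield pending.popleft()

events = pull_events(["<list><item>a</item>", "<item>b</item></list>"])
first = next(events)                    # the "get next token" call
rest = list(events)
```

Because the caller asks for each event, state can live in ordinary flow of control (loops, nested function calls) instead of flags tested inside callbacks. Note that expat may split character data across several "text" events, so a real consumer would concatenate them.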
From stele at fxtech.com Tue Dec 7 18:56:42 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:26 2004
Subject: nestable C/C++ XML parser?
References:
Message-ID: <384D58A7.2AA6493F@fxtech.com>

> I've got a hunch that what he really wanted to do was "pull" the
> higher-level somethings rather than have them "pushed" at him, i.e. call a
> function to get the next something rather than have the parser make a
> callback, presumably because he needs to maintain some state and he'd like
> to do it via flow-of-control rather than setting and testing state
> variables.

No, I did want things pushed at me (via callbacks), but I want the opportunity to do some object-specific processing "inside" one of the callbacks, after the next set of *nested* elements were processed. This requires a nestable parser, where I can pick up the parsing inside a different scope. One of the examples I presented yesterday sort of illustrates what I want to accomplish.

--
Paul Miller - stele@fxtech.com

From rhanson at blast.net Tue Dec 7 19:03:24 1999
From: rhanson at blast.net (Robert Hanson)
Date: Mon Jun 7 17:18:26 2004
Subject: A processing instruction for robots
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com><3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991207101517.00b528f0@corp.infoseek.com>
Message-ID: <013b01bf40e4$c25d7890$0cb919ce@INTERNETDEPT>

> I'm open to changing this, but I thought I would start
> with the most strict version. The advantage of the strict
> version is that it doesn't need to be parsed.
> The Desparate Perl Hacker can do four regex compares
> for the four variants and get back to work.

I guess that depends on how "desparate" they are. It is relatively easy to do no matter what order the attributes are in. I would suggest not specifying an order unless you can think up a better reason for keeping it.

Below is sample code with the output... (notice the 8 examples with varying attribute orders and values... but only 2 regexes).

----------------------------------
# Tested Perl code
my @examples = (
    '',
    '',
    '',
    '',
    '',
    '',
    '',
    ''
);

foreach my $ex1 ( @examples ) {
    if ( $ex1 =~ /<\?robots((?:\s+(?:index|follow)="(?:yes|no)"){2})\s*\?>/ ) {
        my %tmp;
        while ( $ex1 =~ /\s+(index|follow)="(yes|no)"/g ) {
            $tmp{$1} = $2;
        }
        print "Follow: $tmp{follow} Index: $tmp{index}\n";
    }
}

==== OUTPUT ====
Follow: yes Index: yes
Follow: yes Index: no
Follow: no Index: yes
Follow: no Index: no
Follow: yes Index: yes
Follow: no Index: yes
Follow: yes Index: no
Follow: no Index: no
===============

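For readers following the C++ threads in this archive, the same two-regex, order-independent check can be sketched with `std::regex`. This is an illustrative port, not part of the original message: the function name `parseRobotsPI` is made up, and the sample inputs in the usage note are an assumption inferred from the shape of the Perl pattern. Like the Perl version, the first pattern only checks that exactly two `index`/`follow` attributes are present, so it would also accept the same attribute twice.

```cpp
#include <map>
#include <regex>
#include <string>

// Returns true and fills `out` with the attribute values if `pi` contains a
// robots processing instruction with two index/follow attributes in any order.
bool parseRobotsPI(const std::string &pi, std::map<std::string, std::string> &out) {
    // Same patterns as the Perl code; the custom raw-string delimiter is
    // needed because the patterns themselves contain the sequence )".
    static const std::regex whole(
        R"re(<\?robots((?:\s+(?:index|follow)="(?:yes|no)"){2})\s*\?>)re");
    static const std::regex attr(R"re(\s+(index|follow)="(yes|no)")re");

    if (!std::regex_search(pi, whole)) return false;

    out.clear();
    std::sregex_iterator end;
    for (std::sregex_iterator it(pi.begin(), pi.end(), attr); it != end; ++it)
        out[(*it)[1].str()] = (*it)[2].str();  // attribute name -> yes/no
    return true;
}
```

A call such as `parseRobotsPI("<?robots follow=\"no\" index=\"yes\"?>", m)` would fill `m["follow"]` and `m["index"]` regardless of which attribute comes first.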
From roddey at us.ibm.com Tue Dec 7 20:42:31 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:18:26 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <87256840.0071AFFD.00@d53mta03h.boulder.ibm.com>

Here's my take on it... Note that what I'm saying here reflects the necessities of supporting really bad C++ implementations, not my personal feelings. If it were up to me, I'd say use every modern service of C++ and those who don't have a compliant C++ implementation can have a good reason to get one. But, by an unfortunate decision, I was not made the ruler of the world... Go figure!

1) I don't mind that we just start off with SAX2 I guess. It makes sense this late in the game perhaps to just concentrate on SAX2.

2) We would prefer that all data come out of the SAX interfaces as raw wchar_t strings. This is the most flexible mechanism and does not lock people into using any particular implementation of a string object. It also has the highest potential performance for those folks who never need to put it into anything more formal than a raw array.

3) We agree with the basic desire to avoid object ownership issues, but wouldn't worry about them if they are well documented. Object ownership is just a fundamental issue in C++ and if you don't understand it you probably are going to blow your own foot off no matter what.

4) We would be concerned about some of the SAX2 stuff wrt setting features (I think it's features) via an abstracted object interface because it's a little bit sticky. It can be done, but the point still arises of where does the desirability of being the same as the Java interface end and the desirability of having a very natural interface for your own language begin? I.e. just don't make it so Java'esque that it requires a lot of trickery to make work on C++. Don't require some common base class.

5) If you wanted to templatize the interface over the character type, we wouldn't mind particularly. But, considering that any implementation of the interface would *always* use the same instantiation, why bother? Just typedef the character type and let each implementation drive it. It's not likely that a particular build of a particular implementation would need to change this on the fly, right?

6) The issue of handler ownership is something we punted on. As far as we are concerned, handlers installed on the SAXParser belong to the caller because in most cases one object implements a number of handlers.

7) The names of methods of the handlers need to be non-ambiguous to avoid problems. So DocType handlers should use DocTypeCharacters() or DTDCharacters() or whatever, and Document handlers should use DocCharacters() or some such thing. It's just not worth the paranoia of how implementations would deal with multiple mixed in interfaces having the same named methods. If the processing should be common, the class implementing both handlers can delegate to a private method.

8) I disagree with the contention that unsigned shouldn't be used in interfaces. If the thing being modeled is unsigned, use unsigned because you are modelling the type desired. I would personally typedef (by logical usage) all of the fundamental types used by the interfaces and let the implementation drive them.

9) APIs such as getType() or getValue() should return a "const wchar_t*" so that the caller uses the returned value directly. The overhead of copying the return (and having to clean it up) would probably be unacceptable (actually, wchar_t would be some defined type that is driven by the implementation.) Yes this involves ownership issues, but as I said, this is fundamental to C++, so people should probably just 'get over it' :-)

10) I believe that it's better to have the interfaces remain pure virtual and provide a HandlerBase. This lets people who want to be sure that they've overridden everything be told so by the compiler, and it allows selective overriding by using HandlerBase where desired.

11) The class names (since we can't afford to use C++ namespaces) should be expanded to include a SAX prefix to avoid clashes. So SAXParser and SAXLocator and SAXAttributeList and so on.

12) We added reset() methods to all the handlers. The reason being that, on the start of a new parse operation, each handler might need to reset its internal state. We assume that the handlers might be completely unknown to the code that kicks off the parse event and we didn't want them to have to assume that the order of events wouldn't change over time (i.e. we didn't want them to just pick what they think will be the first event and reset from that.)

That's all I can think of at the moment. I haven't had enough time to look at SAX2 closely so I don't know what there might be problematic to us in the C++ world. But, I still think that it's good enough to just pick up at SAX2 as long as SAX2 can be reconciled with the needs of the C++ world.

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@us.ibm.com

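Several of the points in Dean Roddey's message (5, 7, 9, 10, 11 and 12) describe an interface shape that is easier to see in code. The following is a minimal sketch of that shape and nothing more: the names `XMLCh`, `SAXDocumentHandler`, `SAXHandlerBase`, `ElementCounter` and `fakeParse` are assumptions made for illustration and are not taken from any shipped SAX C++ binding.

```cpp
#include <cstddef>

// Point 5/9: a typedef'd character type that each implementation drives.
typedef wchar_t XMLCh;

// Point 10/11: a pure virtual interface with a SAX prefix. Anyone deriving
// directly from it is told by the compiler about every method left out.
class SAXDocumentHandler {
public:
    virtual ~SAXDocumentHandler() {}
    virtual void reset() = 0;  // point 12: called before each new parse
    virtual void startElement(const XMLCh *name) = 0;
    virtual void endElement(const XMLCh *name) = 0;
    // Point 7: an unambiguous name, so a class mixing in several handler
    // interfaces never has two inherited methods called characters().
    virtual void docCharacters(const XMLCh *chars, std::size_t length) = 0;
};

// Point 10: a do-nothing base for callers who want selective overriding.
class SAXHandlerBase : public SAXDocumentHandler {
public:
    void reset() override {}
    void startElement(const XMLCh *) override {}
    void endElement(const XMLCh *) override {}
    void docCharacters(const XMLCh *, std::size_t) override {}
};

// Example client: overrides only what it needs.
class ElementCounter : public SAXHandlerBase {
public:
    ElementCounter() : count_(0) {}
    void reset() override { count_ = 0; }
    void startElement(const XMLCh *) override { ++count_; }
    unsigned int count() const { return count_; }

private:
    unsigned int count_;
};

// Stand-in for a parser run: reset first, then deliver a fixed event stream
// for <doc><item>hi</item></doc>. The strings stay owned by the "parser".
void fakeParse(SAXDocumentHandler &handler) {
    handler.reset();
    handler.startElement(L"doc");
    handler.startElement(L"item");
    handler.docCharacters(L"hi", 2);
    handler.endElement(L"item");
    handler.endElement(L"doc");
}
```

Because the code that starts the parse calls `reset()` itself, a handler does not have to guess which event will arrive first in order to clear stale state, which is the motivation given in point 12.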
From stele at fxtech.com Tue Dec 7 20:45:29 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:26 2004
Subject: nestable C/C++ XML parser?
References: <384D3843.F3F7AFB9@fxtech.com> <82jqbu$pns$1@eve.enteract.com>
Message-ID: <384D7222.23FAAF41@fxtech.com>

> Ignore my response on xml-dev: I incorrectly guessed what you wanted to
> do. Let me try again. This time it looks like what you want to do is
> process Foo and its contents with a different set of handlers than the
> rest of the document. If that's the case, have your "standard"
> StartElement handler set new handlers when it encounters a Foo and have
> the new EndElement handler set the handlers back to the "standard" ones
> when it encounters the end of a Foo. If necessary, maintain a stack of
> pointers to handlers.

This is a good idea on the surface, and it is where I started in my implementation before I hit a snag. It requires too much housekeeping, and too many functions if you want to do something special when the element is finished (such as add the just-parsed object to a list). It would be nicer to be able to treat parsing of an element as an atomic operation, so you can write code like this:

void Document::ParseDocument(XML_Input &in)
{
    XML_ElementHandler handlers[] = {
        { "Object", ParseObject },
        { NULL }
    };
    in.Parse(handlers, this);
}

void Document::ParseObject(XML_Element &element, void *userData)
{
    Document *doc = (Document *)userData;
    Object *obj = new Object;
    obj->Parse(element);
    doc->AddObject(obj);
}

void Object::Parse(XML_Element &element)
{
    XML_ElementHandler handlers[] = {
        ... object-specific element handlers ...
    };
    // parse just the object subtree to the closing token
    element.Parse(handlers, this);
}

You see in ParseObject() that I can do everything I need to create a new object, parse it, and do something with it after I've parsed it. I can only do this if the parser lets me parse just a subtree and then stop (i.e. it returns control back to me when it finds the closing token).

--
Paul Miller - stele@fxtech.com

From donpark at docuverse.com Tue Dec 7 20:51:42 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:26 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: <384D58A7.2AA6493F@fxtech.com>
Message-ID: <000501bf40f4$f6348c20$099918d1@docuverse1>

There is clearly a need for this although IMHO the demand for it will not be large until complexity of XML data grows significantly.

An event-based API like SAX is reactive, as opposed to active APIs like Java's StringTokenizer. I found that reactive systems dealing with complex data/event streams tend to get bogged down with state management, which increases maintenance cost significantly. Extensibility of XML will work against you unless you know what you are doing. Active/reactive designs, reactive at high level and active at low level, are more suited to handling complex XML data.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

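The handler-stack suggestion quoted in Paul Miller's message, combined with his wish to "do something special when the element is finished", can be approximated on top of an ordinary push parser. The sketch below is one possible arrangement under stated assumptions, not the API anyone in the thread proposed: `Scope`, `Dispatcher` and `demo()` are hypothetical names, and the events are fed by hand instead of coming from a real parser. Each pushed scope remembers the depth at which it was opened, so the dispatcher can pop it and run a completion callback when the matching end tag arrives.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A set of handlers that stays active for the lifetime of one element.
struct Scope {
    std::function<void(const std::string &)> onStart;  // child start tags
    std::function<void()> onDone;                      // element fully parsed
    int depth = 0;                                     // filled in by push()
};

class Dispatcher {
public:
    // Call from inside an onStart callback to give the element that has just
    // been opened its own handlers. Also used once for the root scope.
    void push(Scope s) {
        s.depth = depth_;
        stack_.push_back(std::move(s));
    }

    void startElement(const std::string &name) {
        ++depth_;
        if (stack_.empty()) return;
        // Copy the callback first: it may call push(), which can reallocate
        // the vector while the callback is still running.
        std::function<void(const std::string &)> cb = stack_.back().onStart;
        if (cb) cb(name);
    }

    void endElement() {
        if (!stack_.empty() && stack_.back().depth == depth_) {
            Scope done = std::move(stack_.back());
            stack_.pop_back();  // restore the enclosing handlers
            if (done.onDone) done.onDone();
        }
        --depth_;
    }

private:
    std::vector<Scope> stack_;
    int depth_ = 0;
};

// Hand-fed run over <Document><Object><Point/></Object><Object/></Document>.
std::vector<std::string> demo() {
    std::vector<std::string> log;
    Dispatcher d;

    Scope root;
    root.onStart = [&](const std::string &name) {
        if (name != "Object") return;
        Scope object;
        object.onStart = [&](const std::string &child) { log.push_back("child:" + child); };
        object.onDone = [&]() { log.push_back("object done"); };
        d.push(object);
    };
    d.push(root);  // depth 0, so it is never popped

    d.startElement("Document");
    d.startElement("Object");
    d.startElement("Point");
    d.endElement();
    d.endElement();
    d.startElement("Object");
    d.endElement();
    d.endElement();
    return log;
}
```

The `onDone` callback is where something like `doc->AddObject(obj)` would go. This keeps the housekeeping in one place, though it still is not the single blocking call per element that the message asks for.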
From richard at cogsci.ed.ac.uk Tue Dec 7 22:49:39 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:18:26 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: Lars Marius Garshol's message of 07 Dec 1999 18:53:32 +0100
Message-ID: <7390.199912072249@doyle.cogsci.ed.ac.uk>

> | I see a demand for parsing a document with SAX, but using some
> | start-tags to switch to building DOM (or DOM-like) objects, returning
> | to stream-oriented processing afterwards.

As Lars pointed out, it seems like the original poster wanted something else. But if this *is* what you want, LT XML provides it - after reading a start tag you can call a function that "fills in" the tree starting there.

-- Richard

From rossb at wrq.com Tue Dec 7 23:08:38 1999
From: rossb at wrq.com (Ross Bleakney)
Date: Mon Jun 7 17:18:26 2004
Subject: Appending to an XML document
Message-ID: <1654BC972546D31189DA00508B318AC82CB832@charmander.wrq.com>

My apologies if you have already read this in the XSL list.

I (and apparently several other people) have a need to append an element onto an existing XML file. I would like to avoid reading in the whole document and then writing it back out again. My original plan was to open the file, find the end of it, back up a bit to find the last tag, write the new element and then rewrite the closing tag. I am looking for a generic solution.

I know of no XML API that allows for modifying a document. They make it easy to create new documents out of old ones, but they don't allow you to modify an existing file. Doing so would mean the possibility of optimization that would greatly reduce disk I/O. For example, if you had XML like this:

...
...

It would be really nice to write code like this:

ModifyXML modXML = new ModifyXML("MyDoc.XML");
Element event = modXML.createElement("Event");
event.appendChild(modXML.createTextNode("A big event happened"));
modXML.appendChild("Events", event);
modXML.update();

An implementor of this interface could take advantage of the fact that is the main tag and perform the same sort of work I suggested (backing up from the end and then writing). The routines for this interface would be very limited since this would only be used when you want to modify a document and you know that using SAX (or DOM) is inefficient. Thus there would be no reason to have an "insertBefore". The API could be limited to appending and deleting. Is there something like this already?

Thanks,
Ross

From anupama at quintessent.net Tue Dec 7 23:10:36 1999
From: anupama at quintessent.net (anupama@quintessent.net)
Date: Mon Jun 7 17:18:26 2004
Subject: GUI XML doc authoring tools
Message-ID:

There was a listing on XML-Industry news (http://www.oasis-open.org/cover/xmlNews.html) about an alpha release of an editor.
http://architag.com/xray/

I haven't tried it, but it might be close to what you are looking for.

"Jeff Russell" @ic.ac.uk on 12/06/99 12:26:48 PM
Please respond to "Jeff Russell"
Sent by: owner-xml-dev@ic.ac.uk
To: "Xml-Dev@Ic. Ac. Uk"
cc:
Subject: GUI XML doc authoring tools

Anybody know of any Windows GUI (or Linux, as a last resort) XML document authoring tools? Something like SoftQuad's XMeTaL, but that doesn't require a DTD.

Jeff Russell
jefftr@bellsouth.net

From macherius at darmstadt.gmd.de Tue Dec 7 23:38:20 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun 7 17:18:26 2004
Subject: Appending to an XML document
In-Reply-To: <1654BC972546D31189DA00508B318AC82CB832@charmander.wrq.com>
Message-ID: <199912072337.AAA10151@sonne.darmstadt.gmd.de>

Ross,

currently I'm busy designing an XML based log format myself. In contrast to "classic line based logging", appending indeed is prohibitively costly in XML. Thus I decided not to log into a wellformed XML document, but to stick with a sequence of type doc-fragments, just being well-formed per event. Of course one cannot parse the result immediately, but at the time of log analysis (or whatever you do with your event data), it's trivial to pre- and append the necessary tags to enclose the doc-fragments.

XML was just not designed to fit the demands of concatenation. But I found structuring single events in a "semi-structured" (read: well-formed) way valuable enough to choose XML. The "missing enclosing tag" is not really a serious problem if you delay its insertion until REALLY necessary.

++im

Ross Bleakney wrote at 7 Dec 99, 15:08:

> I know of no XML API that allows for modifying a document. They make it easy
> to create new documents out of old ones, but they don't allow you to modify
> an existing file. Doing so would mean the possibility of optimization that
> would greatly reduce disk I/O. For example, if you had XML like this:
>
>
> ...
> ...
>
>
> It would be really nice to write code like this:
>
> ModifyXML modXML = new ModifyXML("MyDoc.XML");
> Element event = modXML.createElement("Event");
> event.appendChild(modXML.createTextNode("A big event happened"));
> modXML.appendChild("Events", event);
> modXML.update();
>

--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)

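The approach in Ingo Macherius's message (append one well-formed fragment per event, add the enclosing tags only when the log is read) can be sketched in a few lines. This is an illustration under assumptions rather than his actual format: the element names `Event` and `Events` are borrowed from Ross Bleakney's example, and `escapeText`, `appendEvent` and `wrapForParsing` are hypothetical helper names.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Escape the characters that would break well-formedness in element content.
std::string escapeText(const std::string &text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Writing side: each event is a complete fragment on its own line. Nothing
// already written is ever touched, so `log` can be any append-only stream,
// for example a std::ofstream opened with std::ios::app.
void appendEvent(std::ostream &log, const std::string &message) {
    log << "<Event>" << escapeText(message) << "</Event>\n";
}

// Reading side: prepend and append the enclosing tags just before handing
// the text to an XML parser.
std::string wrapForParsing(const std::string &fragments,
                           const std::string &root = "Events") {
    return "<" + root + ">\n" + fragments + "</" + root + ">\n";
}
```

The cost of an append is then independent of the size of the log, which is the property the "back up from the end and rewrite the closing tag" plan was trying to get.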
From beth at planet7tech.com Tue Dec 7 23:46:09 1999
From: beth at planet7tech.com (Beth Penland)
Date: Mon Jun 7 17:18:26 2004
Subject: ANNOUNCE: XML Advisory Council
Message-ID:

Planet 7 Technologies is looking for experienced XML developers to participate in our P7 Advisory Council. We are currently developing software to fundamentally improve the way eCommerce networks use XML information, allowing for the real-time distribution of XML across existing networks and applications. Please contact beth@planet7tech.com if you are interested.

Beth Penland
Planet 7 Technologies
2787 152nd Avenue NE Building 7
Redmond, WA 98052

From donpark at docuverse.com Wed Dec 8 00:26:06 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:26 2004
Subject: Appending to an XML document
In-Reply-To: <199912072337.AAA10151@sonne.darmstadt.gmd.de>
Message-ID: <000101bf4112$e677f060$099918d1@docuverse1>

I ran into this as well when working on XLF (eXtensible Log Format). At that time, I was not into SML, so the solution was to store entries in an external parsed entity and have a wrapper XML document that just defined the entity inside the document element.

Looking back at what I did with a new perspective/attitude, namely SML, I now have a different solution.
All you need to do is redefine your idea of an XML document and refrain from using certain features of XML. If you do not use any part of the 'prolog' and 'Misc' production rules in the XML 1.0 spec, and if you detach the notion of an XML document from that of a file, you can send or store multiple XML documents in a single stream or file. Appending is now just a matter of appending an XML document to the end of a file or a stream.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From stele at fxtech.com Wed Dec 8 00:37:46 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:26 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
Message-ID: <384DA88A.BFC736C3@fxtech.com>

Thanks to all who have given feedback on my desires for a relatively atypical parsing idiom for XML. Some of my interest is based on a proprietary parser I wrote a few years ago, that I've used for everything since. It's tag-based and object-oriented, and each block of a document can be parsed as a complete unit. When used to parse object-oriented data, it lets each object easily handle its own parsing. Now I'd like to apply the same concepts to an XML parser, used primarily when object-oriented program data is stored as XML syntax.

I believe the best way to describe what I want to do (and why) is to show a concrete example. Suppose I have a program that generates images composed of layers with multiple objects in each layer. Each layer has a size associated with it as well.
The classes I have are:

Document (contains one or more layers)
Layer (contains one or more objects and a Size)
Object (some type of object)
Size (an object which represents a width and height)
Point (x,y value)
Rect (x1,y1 to x2,y2)
Circle (type/subclass of Object)
Square (type/subclass of Object)

Ideally, each object would be able to write out its data in XML form, and parse its own data (along with a list of attributes if it uses them).

Here is an example xml file:

640x480
320,240
25.0
10,10-40,40
If you think about the object hierarchy associated with this document, you have something like this:

  Document
    contains Layer ("background")
      contains Size (640x480)
      contains Circle (Object)
        contains Point (320,240)
        contains float (25)
      contains Square (Object)
        contains Rect (10,10 - 40,40)

I tend to design APIs from the point of view of the programmer. As the number of classes in my application grows, I want to minimize the amount of extra code I have to write, so I'd like to reduce the parsing to the minimum amount of necessary boilerplate code.

So let's assume that each object has its own Parse() method. This method gets called with an XML::Element object which has the name and attributes for that object. Parsing of the entire object should be an atomic operation.

I use static function pointers as callbacks to avoid having to subclass from any XML-specific classes. User-data is passed along in the parsing so we can cast it back to the necessary type in one of the element handlers. The code is presented in C++ but the parsing operations can easily have a "C" interface. Exceptions are thrown if anything goes wrong, so there are no error codes.

Here is the code needed to open the XML file and find the top-level XML element:

Document *App::LoadDocument(const char *path)
{
    // specify a handler to look for "Document" elements
    XML::ElementHandler handlers[] = {
        XML::ElementHandler("Document", sParseDocument),
        XML::ElementHandler::END
    };

    XML::Input file(path);
    file.Parse(handlers, this);
}

From here on out each object is responsible for parsing itself, based on an XML::Element object that is passed to it. Please examine the code closely to see the intended design and flow.
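Before reading the handlers, it may help to see the dispatch they rely on in isolation. The following is only a minimal sketch under stated assumptions: the XML::Element and XML::ElementHandler types here are hypothetical stand-ins (the post does not show their definitions), the tree is built in memory rather than streamed from a file, and the toy Layer and Document structs only count objects. What it illustrates is the idiom itself: Parse() offers each child element to a table of named static callbacks and returns only when the whole subtree has been consumed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the types used in the post; a real
// implementation would pull elements from an input stream instead.
namespace XML {

class Element;
typedef void (*HandlerFunc)(const Element &elem, void *userData);

struct ElementHandler {
    const char *name;   // element name to match; nullptr ends the table
    HandlerFunc func;   // static callback invoked for each matching child
    void *userData;     // optional per-handler override of the user data
};

class Element {
public:
    explicit Element(const std::string &name) : mName(name) {}

    // Append a child and return *this so calls can be chained.
    Element &Add(const Element &child) { mChildren.push_back(child); return *this; }

    // Offer every direct child to the first handler whose name matches;
    // children with no matching handler are skipped.
    void Parse(const ElementHandler *handlers, void *userData) const {
        for (std::size_t i = 0; i < mChildren.size(); ++i) {
            for (const ElementHandler *h = handlers; h->name != nullptr; ++h) {
                if (mChildren[i].mName == h->name) {
                    h->func(mChildren[i], h->userData ? h->userData : userData);
                    break;
                }
            }
        }
    }

private:
    std::string mName;
    std::vector<Element> mChildren;
};

} // namespace XML

// Toy application classes (assumptions for illustration only).
struct Layer { int objects = 0; };
struct Document { std::vector<Layer> layers; };

static void sParseObject(const XML::Element &, void *userData) {
    static_cast<Layer *>(userData)->objects++;
}

static void sParseLayer(const XML::Element &elem, void *userData) {
    Document *doc = static_cast<Document *>(userData);
    Layer layer;
    const XML::ElementHandler handlers[] = {
        { "Object", sParseObject, nullptr },
        { nullptr, nullptr, nullptr }
    };
    elem.Parse(handlers, &layer);  // returns once the Layer subtree is done
    doc->layers.push_back(layer);  // so the finished layer can be added here
}

// Builds Document > Layer > (Size, Object, Object) and runs the handlers.
int CountObjectsInFirstLayer() {
    XML::Element layer("Layer");
    layer.Add(XML::Element("Size"))
         .Add(XML::Element("Object"))
         .Add(XML::Element("Object"));
    XML::Element root("Document");
    root.Add(layer);

    Document doc;
    const XML::ElementHandler handlers[] = {
        { "Layer", sParseLayer, nullptr },
        { nullptr, nullptr, nullptr }
    };
    root.Parse(handlers, &doc);
    return doc.layers.empty() ? 0 : doc.layers[0].objects;
}
```

Calling CountObjectsInFirstLayer() walks the two-level handler chain; the Size child is ignored because the Layer-level table has no entry for it. The point of the design is visible in sParseLayer: the code after elem.Parse() runs with the fully parsed layer in hand, which is the "atomic" behaviour the post asks for.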
// when a Document element is found, it is passed to the sParseDocument handler
void App::sParseDocument(const XML::Element &elem, void *userData)
{
    // userData is the App * from the file.Parse() call above
    App *app = (App *)userData;

    // we found a document element, so make one using the attributes
    Document *doc = new Document(elem.GetAttribute("name"));

    // now parse the document
    doc->Parse(elem);

    // if we get here without a thrown exception, the Document parsed
    // okay and we can add it
    app->AddDocument(doc);
}

void Document::Parse(const XML::Element &elem)
{
    // specify handlers to look for "Layer" elements
    XML::ElementHandler handlers[] = {
        XML::ElementHandler("Layer", sParseLayer),
        XML::ElementHandler::END
    };
    elem.Parse(handlers, this);

    // if we needed to do something special, like validating the
    // document, we could do it right here
}

void Document::sParseLayer(const XML::Element &elem, void *userData)
{
    // again, userData is the Document * passed in elem.Parse() above
    Document *doc = (Document *)userData;

    // make a new layer
    Layer *layer = new Layer(elem.GetAttribute("name"));

    // parse the layer
    layer->Parse(elem);
    doc->AddLayer(layer);
}

void Layer::Parse(const XML::Element &elem)
{
    // specify handlers to look for "Size" and "Object" elements
    // note that for the Size element we call the Size object's static
    // parse function directly, and we're passing the address of our
    // contained Size member as its user-data, so we do not need to
    // provide an additional static Size handler to forward to the Size
    // object's member Parse() method
    XML::ElementHandler handlers[] = {
        XML::ElementHandler("Size", Size::sParse, &mSize),
        XML::ElementHandler("Object", sParseObject),
        XML::ElementHandler::END
    };
    elem.Parse(handlers, this);
}

void Size::sParse(const XML::Element &elem, void *userData)
{
    Size *size = (Size *)userData;

    // size has no attributes, just data, so read it directly
    // note that elem.ReadData() reads character data up to the
    // ending element tag and returns the size found
    char tmp[40];
    // leave room for the terminating '\0'
    size_t len = elem.ReadData(tmp, sizeof(tmp) - 1);
    tmp[len] = '\0';
    sscanf(tmp, "%dx%d", &size->width, &size->height);
}

void Layer::sParseObject(const XML::Element &elem, void *userData)
{
    // again, userData is the Layer * passed in elem.Parse() above
    Layer *layer = (Layer *)userData;

    // make a new object from the object type
    std::string type = elem.GetAttribute("type");

    // I would normally use a factory here but this illustrates the
    // point better
    Object *obj = NULL;
    if (type == "circle")
        obj = new Circle();
    else if (type == "square")
        obj = new Square();
    else
        // don't dereference NULL on an unknown type (needs <stdexcept>)
        throw std::runtime_error("unknown Object type: " + type);

    // now let the object (whatever type it is) parse itself
    obj->Parse(elem);
    layer->AddObject(obj);
}

So I hope this gets the idea across. I'd be interested in feedback.

--
Paul Miller - stele@fxtech.com

From roddey at us.ibm.com Wed Dec 8 00:52:15 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:18:26 2004
Subject: SGML the next big thing?
Message-ID: <87256841.0004BF66.00@d53mta03h.boulder.ibm.com>

>This & thing is so far outside the way most other computer languages
>work that standard off-the-shelf parser generators roll on their
>backs and wave their paws in the air and admit defeat.

Personally, I think & should be limited to 'leaf' nodes only. This would keep it sane, and allow it to be implemented as a special case content model (as we already do for Mixed anyway.)
This would provide a lot of usefulness without forcing everyone to throw out the very fast and compact DFA type representations in wide use now, or implement another (much more complex one) in addition to. People should accept the fact that XML is not going to solve all problems and still stay light enough to remain what it was intended to be. Schema will already be very bloated with all the other stuff in it now. I guesstimate that a Schema implementation will probably be at least twice the size of the existing parser code in most implementations, if not more so. Anyone think differently? ---------------------------------------- Dean Roddey Software Weenie IBM Center for Java Technology - Silicon Valley roddey@us.ibm.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Wed Dec 8 01:27:31 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:18:26 2004 Subject: A question on nomenclature References: <805C62F55FFAD1118D0800805FBB428D02BC017D@cc20exch2.mobility.com> Message-ID: <00fb01bf411b$0db5bd80$0300000a@cygnus.uwa.edu.au> > > > > > > > (James Tauber described this as a "schema-by-example") but what I really > want is the name that I would call that "class" of XML documents. In linguistic terms, you have a "grammar" defining a "language" which is really just a set of "utterances". In XML, a "grammar" is generally called a "schema" and an utterance is called an "instance". So what you are asking, if I understand correctly, is what is the term corresponding to "language". The term most consistent with the XML 1.0 REC would probably be "document type". 
So you would say you have a "schema" defining a "document type" which is really just a set of "instances". > but I don't care about that. I'm still leaning toward "vocabulary", because > that still seems to describe it best, but I'm still open too. (I think > "schema" is probably correct for what I'm trying to do as well, but that > would confuse readers with XML Schemas, which are just one type of "schema > description language"...) 1. Yes, people get confused between a schema and a schema language and use "schema" to mean both. 2. There is a distinction between a schema and the set of valid documents for that schema (ie a "document type"). It is the distinction between a grammar and the language it defines. So you could use the term "schema" for the *definition* of the set of valid documents (whether its a DTD, a W3C XML Schema or a schema-by-example), but the actual set of valid documents is best called something else (like "document type"). Hope this helps James Tauber xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From steve at rsv.ricoh.com Wed Dec 8 01:31:27 1999 From: steve at rsv.ricoh.com (Stephen R. Savitzky) Date: Mon Jun 7 17:18:26 2004 Subject: RFC: "even simpler" C++ XML parser for object hierarchies In-Reply-To: Paul Miller's message of Tue, 07 Dec 1999 19:38:34 -0500 References: <384DA88A.BFC736C3@fxtech.com> Message-ID: This is basically a traditional top-down, recursive-descent parser. Unfortunately, it's completely different from the way most XML parsers I've seen work, although I believe there's a lexical layer underneath expat that can be made to work this way. 
But there's a better way of looking at the situation, namely that what you really want to do is make a top-down traversal of the document's parse tree. In other words, at any given position in the tree, you want to do pseudocode like

// process an element.
void Foo::process(const XML::Element &elem)
{
    // do the setup
    for (XML::Node *node = elem.getFirstChild();
         node != NULL;
         node = node->getNextSibling()) {
        processChild(node);  // dispatch on node's type & tag
    }
    // do the cleanup
}

This works as-is if the result of your parse is a DOM tree or some equivalent parse-tree representation of the document, but trees take memory. So the next step is to use a parser that looks like a tree traverser:

// process an element.
void Foo::process(TreeTraverser &it)
{
    // do the setup, using it.getAttrList(), etc. on the current node
    if (it.hasChildren()) {
        for (it.toFirstChild(); !it.atEnd(); it.toNextSibling()) {
            processChild(it);  // dispatch on new current node's type & tag
        }
        it.toParent();  // go back up the tree
    }
    // do the cleanup
}

Note that if your parser has this interface, you may never have to actually build the whole tree. Similarly, you can output to a ``tree constructor'' that merely appends characters to a string.

We've built a document-processing system (currently in Java) using this kind of interface; you can find it at .

--
Stephen R. Savitzky Platform for Information Applications: Chief Software Scientist, Ricoh Silicon Valley, Inc. Calif. Research Center voice: 650.496.5710 front desk: 650.496.5700 fax: 650.854.8740 home: URL: http://theStarport.org/people/steve/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murata.makoto at fujixerox.co.jp Wed Dec 8 04:55:05 1999 From: murata.makoto at fujixerox.co.jp (MURATA Makoto) Date: Mon Jun 7 17:18:26 2004 Subject: SML and I18N Message-ID: <199912080457.AA03578@archlute.fujixerox.co.jp> Rick Jelliffe wrote: >Where I am, here in Taiwan, the main question people ask is "how do I >represent a document in Big5 in XML?". So moving to ASCII or even to >only UTF-8 will make SML into a US-only or Western-only language. The >simplifications proposed so far seem a gigantic step backwards away from >a "World Wide Web" and back 20 (or even 5?) years to a world where rich >white countries developed technology which created a technological poverty >in non-Western countries. I am totally against weakening I18N of XML 1.0. Even 1% trim down is absolutely completely unacceptable to me. Legacy encodings, natural language markup, the xml:lang attribute, encoding declarations, the charset parameter, and numeric character references must be preserved. If SML omits any of them, SML is not for the World Wide Web. Makoto Fuji Xerox Information Systems Tel: +81-44-812-7230 Fax: +81-44-812-7231 E-mail: murata.makoto@fujixerox.co.jp xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sb at metis.no Wed Dec 8 08:12:51 1999 From: sb at metis.no (Steinar Bang) Date: Mon Jun 7 17:18:26 2004 Subject: Request for Discussion: SAX 1.0 in C++ In-Reply-To: roddey@us.ibm.com's message of "Tue, 7 Dec 1999 13:40:43 -0700" References: <87256840.0071AFFD.00@d53mta03h.boulder.ibm.com> Message-ID: >>>>> roddey@us.ibm.com: This statement: > ... If it were up to me, I'd say use every modern service of C++ and > those who don't have compliant C++ implementation can have a good reason to > get one. [...] conflicts with this statement: > 2) We would prefer that all data come out of the SAX interfaces as > raw wchar_t strings. This is the most flexible mechanism and does > not lock people into using any particular implementation of a string > object. It also has the highest potential performance for those > folks who never need to put it into anything more formal than a raw > array. std::basic_string<> _is_ a modern service of C++, and a pretty good one from an API point of view. Personally I say: use std::basic_string<> and death to all other string representations in C++. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mikew at o3.co.uk Wed Dec 8 09:37:41 1999 From: mikew at o3.co.uk (Mike Williams) Date: Mon Jun 7 17:18:26 2004 Subject: nestable C/C++ XML parser? In-Reply-To: Paul Miller's message of "Tue, 07 Dec 1999 13:57:43 -0500" References: <384D58A7.2AA6493F@fxtech.com> Message-ID: >>> On Tue, 07 Dec 1999 13:57:43 -0500, >>> "Paul" == Paul Miller wrote: Paul> No, I did want things pushed at me (via callbacks), but I want the Paul> opportunity to do some object-specific processing "inside" one of the Paul> callbacks, after the next set of *nested* elements were processed. This Paul> requires a nestable parser, where I can pick up the parsing inside a Paul> different scope. What about using nestable *handlers*. Say you're parsing something like this: ... xxx ... When your main handler sees the tag, create a new "FooHandler" object. Your main handler would then need to delegate all events to the FooHandler, until the corresponding is seen. Not that this is particularly easy to implement. In fact, I started to implement something similar (in Java), but got fed up ... I've reverted to using a DOM as input, for the time being. The main complication is that the delegating handler has to maintain a context-stack while it's delegating, in order to match the correct end-tag. One way around this might be to get the FooHandler to notify the main handler when it's finished. Just an idea ... -- Mike Williams xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From nicmila at vscht.cz Wed Dec 8 10:33:06 1999 From: nicmila at vscht.cz (Miloslav Nic) Date: Mon Jun 7 17:18:26 2004 Subject: New tutorial at Zvon Message-ID: <384E33BA.23FB283C@vscht.cz> WML (Wireless Markup Language) - language for mobile devices (WAP) is getting recently some attention. This language is based on XML. At Zvon you will find a new tutorial: http://zvon.vscht.cz/HTMLonly/WMLTutorial/Examples/Example1/index.html which demonstrates some features of this language on several examples. The tutorial contains a simple emulator of PDA device (in HTML, so do not worry, there is nothing to download apart from actual pages.). -- *************************************************************** Dr. Miloslav Nic e-mail: nicmila@vscht.cz Department of Organic Chemistry TEL: +420 2 2435 5012 ICT Prague (VSCHT Praha) +420 2 2435 4118 FAX: +420 2 2435 4288 **************************************************************** Support free information exchange: http://zvon.vscht.cz **************************************************************** xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Andy.Bradbury at syntegra.bt.co.uk Wed Dec 8 11:00:45 1999 From: Andy.Bradbury at syntegra.bt.co.uk (Andy.Bradbury@syntegra.bt.co.uk) Date: Mon Jun 7 17:18:26 2004 Subject: Check this Message-ID: <65AF45D5E535D2118AFB0008C7FA2318035A9AFF@FL-EXCHANGE-03> Warning The following e-mail was received on the 'junior' XML list. The attachment - LINKS.VBS (11K) - contained a VBScript virus: VBS_FREELINK. Regards Andy B. -----Original Message----- From: Conrad Meier [mailto:conradm@SOFTWAREFUTURES.COM] Sent: 08 December 1999 09:00 To: XML-L@LISTSERV.HEANET.IE Subject: Check this Have fun with these links. Bye. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From d.d.barnes at ic.ac.uk Wed Dec 8 11:06:33 1999 From: d.d.barnes at ic.ac.uk (ic\ddsb/d.d.barnes) Date: Mon Jun 7 17:18:26 2004 Subject: using xt in a browser Message-ID: <384E3B7A.1D62697F@ic.ac.uk> Hello, I am sorry to ask this (again - long story), but can anyone tell me/ show me an example of how you can use xt to transform xml into html from within a piece of javascript/java in a page? I apologise for my painful ignorance . . . and thanks in advance, David xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tpassin at idsonline.com Wed Dec 8 13:24:23 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun 7 17:18:26 2004
Subject: nestable C/C++ XML parser?
References: <384D3843.F3F7AFB9@fxtech.com>
Message-ID: <004a01bf4180$2b6ac4a0$c22a08d1@tomshp>

----- Original Message -----
From: Paul Miller

> I'm trying to develop a tag-based front-end to expat and having no luck.
> I'd like to be able to parse an XML document in nestable chunks, by
> calling into a nestable parser. In other words, I'd like to start
> parsing, then branch to a function to handle a specific element, parsing
> in there until that element is closed, then fall back out of the
> function to continue parsing the rest of the document.
>

I take it that you want to be able to ignore part of the document, and only process the pieces you are interested in. Is that right? Then each piece would be valid XML if it were enclosed in a root element.

You don't need to literally do what you have suggested, that is, "parse in there...". You do need to handle the elements of different pieces differently. Three approaches come to mind.

1) Preprocess to extract just the pieces you want, wrap them in root elements so they are complete documents, then run expat (or whatever) separately on them using SAX. The preprocess should be fast and easy, and perhaps could be done using regular expressions, or SAX. Alternatively, if the xml is relatively simple, don't wrap the fragments, and process them using regular expressions instead. (Search the archives of this group for the last few months to find a reference to "shallow parsing using regular expressions").

2) You really are talking about a state machine, I think. That is, if you have reached the right piece of the document, you go to a different manner of handling the elements (they will still parse the same, it's just the handling that would be different). So you could explicitly maintain a state variable and have the SAX (or whatever) callbacks behave differently according to the state. This would be conceptually simple but might be a pain to implement depending on how many different element handlers you will use.

3) Again as a state machine, you could use a function pointer to specify the callbacks, and when you change state you change the function pointers to point to different handlers. I don't know whether you would have to modify expat to do this or not, but changes should be minor if needed.

Regards,

Tom Passin

From tpassin at idsonline.com Wed Dec 8 13:47:34 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun 7 17:18:27 2004
Subject: nestable C/C++ XML parser?
References: <384D3843.F3F7AFB9@fxtech.com> <82jqbu$pns$1@eve.enteract.com> <384D7222.23FAAF41@fxtech.com>
Message-ID: <007001bf4183$6388e3a0$c22a08d1@tomshp>

----- Original Message -----
From: Paul Miller

> ...It [...]
> would be nicer to be able to treat parsing of an element as an atomic
> operation, so you can write code like this:
>
> Document::ParseDocument(XML_Input &in)
> {
>     XML_ElementHandler handlers[] = {
>         { "Object", ParseObject },
>         { NULL }
>     };
>     in.Parse(handlers, this);
> }
>
> Document::ParseObject(XML_Element &element, void *userData)
> {
>     Document *doc = (Document *)userData;
>     Object *obj = new Object;
>     obj->Parse(element);
>     doc->AddObject(obj);
> }
> ...
> You see in ParseObject() that I can do everything I need to create a new
> object, parse it, and do something with it after I've parsed it. I can
> only do this if the parser lets me parse just a subtree and then stop
> (ie. it returns control back to me when it finds the token).
>
> --
> Paul Miller - stele@fxtech.com

You can see the difficulty - if you send a fragment to a parser it's not a valid xml document (so the parser can't work with it). You could start building a subtree when you get to the point of interest, using DOM calls, but you keep saying you don't want to deal with DOM.

Where you are being unclear is when you say "parse just a subtree". It is unclear whether you think you need to get (or build) an actual tree structure, or whether the expression is just a shorthand for indicating a place in the document. It is also unclear because how do you know that you are at the right starting place in the document? I assume that you have been parsing from the start of the document to get to the point of interest. Then you say you want to start parsing at that point. See why it's confusing?

If you just want to know the names of the elements in the fragment, just keep a state variable. I know you said it's too much machinery, but maybe there is a way it wouldn't be.

Alternatively, there are other tree builders that are simpler than DOM. Look at Sean McGrath's xml tree code in (I think) "XML By Example" for one example. Of course it depends on the complexity of what you are doing.
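To make the state-variable suggestion concrete, here is a minimal sketch, assuming a SAX-style handler whose startElement and endElement methods are called by whatever parser is in use. The class name ObjectNameCollector, the method names, and the Point and Radius element names are all hypothetical; no real parser is wired in, so the demo function simply calls the handler by hand in document order.

```cpp
#include <string>
#include <vector>

// Hypothetical SAX-style handler. A real parser (expat, a SAX driver, ...)
// would invoke startElement/endElement; the state variable decides whether
// an event belongs to the fragment of interest.
class ObjectNameCollector {
public:
    ObjectNameCollector() : mState(OUTSIDE), mDepth(0) {}

    void startElement(const std::string &name) {
        if (mState == OUTSIDE && name == "Object") {
            mState = INSIDE;         // reached the fragment of interest
            mDepth = 1;
        } else if (mState == INSIDE) {
            ++mDepth;                // track nesting to find the matching end tag
            mNames.push_back(name);  // record only elements inside the fragment
        }
    }

    void endElement(const std::string & /*name*/) {
        if (mState == INSIDE && --mDepth == 0)
            mState = OUTSIDE;        // matching end tag closes the fragment
    }

    const std::vector<std::string> &names() const { return mNames; }

private:
    enum State { OUTSIDE, INSIDE };
    State mState;
    int mDepth;
    std::vector<std::string> mNames;
};

// Feeds the events a parser would produce for a Layer element holding a
// Size element and one Object element with Point and Radius children.
std::vector<std::string> DemoNames() {
    ObjectNameCollector h;
    h.startElement("Layer");
    h.startElement("Size");   h.endElement("Size");
    h.startElement("Object");
    h.startElement("Point");  h.endElement("Point");
    h.startElement("Radius"); h.endElement("Radius");
    h.endElement("Object");
    h.endElement("Layer");
    return h.names();
}
```

The depth counter is the only extra machinery: it lets the handler recognise the end tag that closes the fragment without keeping a full stack. Whether this stays small in practice depends, as noted above, on how many different fragments need their own handling.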
All in all, I still think that a preprocessing pass to extract the fragments you want to look at, as I mentioned in my previous post, is the way to go. Tom Passin xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sb at metis.no Wed Dec 8 14:19:26 1999 From: sb at metis.no (Steinar Bang) Date: Mon Jun 7 17:18:27 2004 Subject: nestable C/C++ XML parser? In-Reply-To: Paul Miller's message of "Tue, 07 Dec 1999 13:20:23 -0500" References: <384D3843.F3F7AFB9@fxtech.com> <384D4FE7.C81F2CCE@fxtech.com> Message-ID: >>>>> Paul Miller : > ... I want to use XML as an application data file format. Why? Two > primary reasons: > 1. I don't need/want to invent a new syntax - I like XML just fine and > it handles object-oriented nesting of data quite nicely > 2. I can publish a DTD and make it easier for my end-users to use my > application data in their own applications [snip!] I have a similar situation, but went for a very different solution: 1. wrapped a SAXoid interface around expat 2. wrote a callback class with virtual functions for all elements in the DTD 3. wrote a DocumentHandler that contains a pointer to an instance of the callback class, and a table of tag names and pointers to member functions of the callback class. 
This class also does some rudimentary element content checking, but this will be dropped when a validating parser is available Then I have two implementations of the callback class: - a simple one for debugging of the expat/sax chain, that just prints out what it receives - a complicated one that unpacks attributes, keeps context between SAX events on a stack, and builds data structures in the system The gain here is that since I'm relying on SAX (and plan to track the standard that David Megginson and James Clark et al. settle on) I will in the future have a choice of parsers, and can use one that supports namespaces and/or validation. It also lets me have the same basic infrastructure for all XML based formats (I currently have two: our native format and SVG). The biggest and clumsiest code here, is the recognition and decoding of element attributes in the callback class. Good guidelines for efficiency and simplicity are highly desired. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jesmith at kaon.com Wed Dec 8 14:38:04 1999 From: jesmith at kaon.com (Joshua E. Smith) Date: Mon Jun 7 17:18:27 2004 Subject: RFC: "even simpler" C++ XML parser for object hierarchies In-Reply-To: <384DA88A.BFC736C3@fxtech.com> Message-ID: <3.0.1.32.19991208093643.00697d64@tiac.net> >Now I'd like to apply the same concepts to an XML parser, used primarily >when object-oriented program data is stored as XML syntax. Is holding the document in memory not an option? 
Because if you can hold it all in memory, it's simple enough to build a tree representation of the thing, then walk the tree with your handlers, then throw away the tree. If that isn't an option, then I agree that you have a combination of: 1) A nice interface paradigm (you could even generate a stub of the program from a DTD!); and, 2) Quite a challenge getting an event-based parser to work with it because of control-flow issues. Here's a nutty idea -- Try threads. Run the parser and expat as two separate threads and cross-synchronize them. Each expat handler would signal the parser thread to go (and then block until it hears back), and the ::Parse method in the parser thread would signal the expat handler to continue (and then block until the end handler signals it back). The two threads never actually overlap, but you get two processing stacks to handle the control flow issues. -Joshua Smith xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From obecker at informatik.hu-berlin.de Wed Dec 8 14:50:50 1999 From: obecker at informatik.hu-berlin.de (Oliver Becker) Date: Mon Jun 7 17:18:27 2004 Subject: VoiceXML Message-ID: <199912081450.PAA02482@mail.informatik.hu-berlin.de> Hi there, I'm exploring the potential of VoiceXML [1] for the development of new voice applications. The latest specification is version 0.9 dated 17 August 1999. Does anybody know further links or resources which might be helpful? Currently I'm aware of the IBM/alphaWorks VoiceXML tool [2]. In addition Motorola announces VoxML [3], but it's not clear to me which relationship exists between VoiceXML and VoxML. 
Motorola is a member of the VoiceXML Forum. I would be happy if anyone has more information on this subject to share with me. Thanks in advance and best regards, Oliver [1] http://www.voicexml.org/ [2] http://www.alphaworks.ibm.com/tech/voicexml/ [3] http://www.voxml.com/voxml.html /-------------------------------------------------------------------\ | ob|do Dipl.Inf. Oliver Becker | | --+-- E-Mail: obecker@informatik.hu-berlin.de | | op|qo WWW: http://www.informatik.hu-berlin.de/~obecker | \-------------------------------------------------------------------/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jesmith at kaon.com Wed Dec 8 15:00:22 1999 From: jesmith at kaon.com (Joshua E. Smith) Date: Mon Jun 7 17:18:27 2004 Subject: SAX/C++: First interface draft In-Reply-To: References: <3.0.6.32.19991206150103.009a1c10@mailhost> Message-ID: <3.0.1.32.19991208095604.0105cb40@tiac.net> >> We're using MSVC 6 here, and basic_string<> seems fine. > >It's not. See eg. > http://msdn.microsoft.com/visualc/stl/faq.htm#Q4 Actually, it is fine. I ran the test program from that faq in the debugger (stepping thru all the template code), and they clearly fixed this problem in the transition from VC5 to VC6. -Joshua Smith xml-dev: A list for W3C XML Developers. 
From stele at fxtech.com Wed Dec 8 14:59:57 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:27 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
References: <3.0.1.32.19991208093643.00697d64@tiac.net>
Message-ID: <384E72AC.8B02D25F@fxtech.com>

> Is holding the document in memory not an option? Because if you can hold
> it all in memory, it's simple enough to build a tree representation of the
> thing, then walk the tree with your handlers, then throw away the tree.

I thought about that at first, and I could do it easily over expat (or SAX). In fact, I could even keep the same API I came up with, because then I'd just be dealing with a DOM-like representation. However, for "large" data-files (several megabytes worth), this could be a problem, because the in-memory representation could be many megabytes larger (consider a 3D model with 10,000 vertices, each one with a pair). I suppose I could minimize the amount of extra memory usage by using hash tables, but I tend to prefer the streaming solution. I was going under the assumption that for this type of use, namespaces and validation probably aren't necessary, so there aren't that many advantages to layering over expat.

> If that isn't an option, then I agree that you have a combination of:
> 1) A nice interface paradigm (you could even generate a stub of the program
> from a DTD!); and,
> 2) Quite a challenge getting an event-based parser to work with it because
> of control-flow issues.

Thanks, and yeah, it would be/is a real effort to match this idiom to expat.

> Here's a nutty idea -- Try threads.
> Run the parser and expat as two

Ouch! :-) You get an 'A' for creativity!

-Paul

--
Paul Miller - stele@fxtech.com

From stele at fxtech.com Wed Dec 8 15:23:44 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:27 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
References: <3.0.1.32.19991208093643.00697d64@tiac.net>
Message-ID: <384E783F.38000C5E@fxtech.com>

> Is holding the document in memory not an option? Because if you can hold
> it all in memory, it's simple enough to build a tree representation of the
> thing, then walk the tree with your handlers, then throw away the tree.

I started thinking about this some more. If I could build a light-weight in-memory representation of the XML file, then I could build an "even simpler" DOM-like interface as an option. So if you didn't want to use the callback-based "discovery" API that I outlined previously, you can use an alternative iterator-based API that lets you avoid the callbacks and just iterate over elements you are interested in.

Here is my previous code, rewritten for a similar iterator-based API. As you can see, it avoids the static callback functions and a lot of the extra boiler-plate code, but the API is very similar. Thoughts?
Document *App::LoadDocument(const char *path)
{
    XML::Input file(path);
    XML::Element elem = file.GetElement("Document");
    Document *doc = new Document(elem.GetAttribute("name"));

    // now iterate over 'Layer' elements
    XML::Element::iterator it;
    for (it = elem.begin("Layer"); it != elem.end(); ++it)
    {
        Layer *layer = new Layer((*it).GetAttribute("name"));
        layer->Parse(*it);
        doc->AddLayer(layer);
    }
    AddDocument(doc);
    return doc;
}

void Layer::Parse(XML::Element &elem)
{
    // look for (required) size element
    mSize.Parse(elem.GetElement("size"));

    // look for object elements
    XML::Element::iterator it;
    for (it = elem.begin("Object"); it != elem.end(); ++it)
    {
        Object *obj = ObjectFactory::Create((*it).GetAttribute("type"));
        obj->Parse(*it);
        AddObject(obj);
    }
}

void Size::Parse(XML::Element &elem)
{
    sscanf(elem.GetData(), "%dx%d", &width, &height);
}

--
Paul Miller - stele@fxtech.com

From greynolds at datalogics.com Wed Dec 8 15:40:27 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun 7 17:18:27 2004
Subject: A question on nomenclature
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B17034@martinique>

> -----Original Message-----
> From: James Tauber [mailto:jtauber@jtauber.com]
> Sent: Tuesday, December 07, 1999 7:21 PM
>
> > (James Tauber described this as a "schema-by-example") but what I really
> > want is the name that I would call that "class" of XML documents.
>
> In linguistic terms, you have a "grammar" defining a "language" which is
> really just a set of "utterances".
>
> In XML, a "grammar" is generally called a "schema" and an utterance is
> called an "instance". So what you are asking, if I understand correctly, is
> what is the term corresponding to "language".
>

I'm not familiar with this definition of "schema", but then I haven't been able to follow the discussion on XML Schema Stuff very closely. "Grammar" to me suggests syntax, although probably it should mean the whole ball of wax - syntax, semantics, lexis, etc. But "schema" to me means (roughly) "typed", and thus a mapping from syntactic structures to values, which is extra-syntactic. In fact I'd argue that XML _syntax_, strictly speaking, determines only which sentences are legal in the language, and doesn't even map (concrete) syntactic structures to abstract ones, which is a kind of semantics. Well, it does, but very informally and with some ambiguities.

> The term most consistent with the XML 1.0 REC would probably be "document
> type".
>
> So you would say you have a "schema" defining a "document type" which is
> really just a set of "instances".
>

I'm confused by David's example - it clearly can only be construed as an instance in XML terminology. One can infer any number of DocTypes (=languages, grammars) from it, but there is nothing in the example to support choosing one such language over any other. Also, based on his post from yesterday, it sounds like he's thinking of a set with only one member.

>
> 1. Yes, people get confused between a schema and a schema language and use
> "schema" to mean both.
>

The whole complex of schema-related terms looks terribly ill-defined to me. Naturally I've got my own little set of definitions, but can you point me to what you would consider the clearest and most authoritative? (Remember I'm often unable to follow xml-dev closely, so please copy me if you respond.)

> 2. There is a distinction between a schema and the set of valid documents
> for that schema (ie a "document type").
> It is the distinction between a
> grammar and the language it defines. So you could use the term "schema" for
> the *definition* of the set of valid documents (whether it's a DTD, a W3C XML
> Schema or a schema-by-example), but the actual set of valid documents is
> best called something else (like "document type").
>

I'd suggest good old ZF set terminology. An expression that explicitly enumerates the members of a set is called an extension expression, and an expression that logically describes the set is called a set comprehension. So "{1, 2, 3}" is an extension expr., and "{ i : Z | 0 < i < 4 }" is a comprehension expression denoting the same set. (I believe there are some other terms in use, such as intension, but these two terms are common, and both are used in Z.)

So the set of all documents that conform to a particular DTD can be considered the extension of the set defined by that DTD, which itself is analogous to a set comprehension expression - call it a Doc. or Lang. comprehension expression.

> Hope this helps

Ditto.

-gregg

From greynolds at datalogics.com Wed Dec 8 16:32:04 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun 7 17:18:27 2004
Subject: A question on nomenclature
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B17038@martinique>

Sorry, just remembered the term

>
> I'd suggest good old ZF set terminology. An expression that explicitly
> enumerates the members of a set is called an extension expression, and an
> expression that logically describes the set is called a set comprehension.
> So "{1, 2, 3}" is an extension expr., and "{ i : Z | 0 < i < 4 }" is a
> comprehension expression denoting the same set. (I believe there are some
> other terms in use, such as intension, but these two terms are common, and
> both are used in Z.)

"Construction" is the other term I should have mentioned.

You might find Z's usage illuminating. In Z, a schema is rigorously defined as a named set of bindings, where a binding is a function (set of ordered pairs, not an algorithm) from names to values. (They're also typed, so each schema has a signature, defined as a function from names to types; the values in the bindings must be of the appropriate type.) There are several ways to express a schema, but basically you can either write a construction expression or an extension. A schema construction expression looks something like:

+--[ FOO ]----
| i : Z
+-------------
| 0 < i < 4
+-------------

meaning the name "FOO" is bound to the set of bindings of the name "i" to integral (because of the type declaration using "Z") values satisfying the predicate 0, <| i == 2 |>, <| i == 4 |> }

"FOO" itself can be used as a type, as in the expression "f : FOO"; dot notation is used to access the "components" of a schema: "f.i".

What does this have to do with XML, you ask? Well, nuttin' right now, but it's possible to use Z's rigorous semantics to define other languages, e.g. XML-languages; some day in the next millennium my pet project of expressing a typed semantics for XML stuff using Z will bear fruit. Maybe. On the other hand, if "schema" is properly construed in terms of semantic mappings, then Z provides a very handy, very carefully defined meta-language for that right now.

-gregg
From Curt.Arnold at hyprotech.com Wed Dec 8 17:10:44 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun 7 17:18:27 2004
Subject: Schema validation of XSLT, SVG, XPath : Part 1 Proposal for lists
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415561@THOR>

A few weeks ago, Tim Berners-Lee strongly suggested that other XML technologies start using XML Schema. I have reviewed several of the other XML technologies and believe with some minor enhancements, XML Schema can do effective validation of these technologies. I've broken up the necessary modifications into several different messages, so that each can be independently considered and reviewed, however I see all of them as necessary, reasonable and easy to implement with minimal additional effort. I would appreciate any comments:

The following note was written after reviewing XSLT but before reviewing SVG. SVG makes much more extensive use of lists and so I believe it adds even more compelling justification for the proposal.

In XSLT, there are numerous uses of space separated lists, two of which cannot be addressed with the DTD compatibility NMTOKENS list type. This message identifies them, proposes an additional element for XML Schema Datatypes that would address delimited lists in a minimally disruptive manner that would be generally useful and then presents schema fragments for the XSLT elements. I believe this is a compelling (even demanding) argument for inclusion of list support in the initial version of XML Schema.

1. List usage in XSLT:

b) add to datatype element and dataQual archetype

.... ...
c) Add a couple of new built-in datatypes (though not essential, but generally useful). (These are also replicated in a following comment on additional datatypes.)

1 1 [^:]*

d) remove special narrative about NMTOKENS and IDREFS and redefine NMTOKENS and IDREFS as:

3. Use of list element in XSLT schema

\* .... ....

4. Processing

The following seems a reasonable processing mechanism for list (when separator="," for clarity)

do
    complete production pattern for basetype
    if ignoreWhitespace is true
        match the following regex [&x0A&0x09&0x0D ]*,[&x0A&0x09&0x0D ]*
    else
        match ,
    end if
loop while there is a match

5. Examples of processing

Example a:

Processing any fragment (including the following):

This, is, only, has, one, item, since, nothing, terminates, the, string, production

will return a one item list since nothing terminates the string production.

Example b:

"[^"]*"

Processing the following fragment will result in two items

"I can have my separator (,) in here since","nothing had terminated my production"

The comma in parentheses is not processed as an item separator since it was encountered in the scope of the production pattern for quoted string.

Example c:

Processing the following fragment:

3.1415926, 2.718, 1.414

Would result in a validation error, since the space between the first comma and second number does not match the float production. If the list element had been , then it would return 3 items.

3.1415926,,1.414

Would also be a validation error, since the null string between the two commas does not match the float production.

6. Accessing lists through a type-aware DOM

I definitely think that trying to define how a type-aware DOM would provide access to list data is outside the scope of the schema work. However, it would not appear that adding generic lists would add any new issues to that work project since they would have to address how to provide access to the compatibility lists of NMTOKENS and IDREFS.
Their solution to that problem could be as easy as saying that there is no native type support for lists and you can only get the entire string back. However you will have been assured that the string meets your production requirements.

7. Additional burden on schema validation code

I believe the additional burden on validation authors would be minimal since the generic list validation code can replace any IDREFS or NMTOKENS validation code. I would appreciate any comments from the Xerces or other schema parser initiative team on their assessment of the additional development burden.

From runnable at hotmail.com Wed Dec 8 17:20:56 1999
From: runnable at hotmail.com (Bent Rasmussen)
Date: Mon Jun 7 17:18:27 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
Message-ID: <19991208172022.4168.qmail@hotmail.com>

>Thoughts?

yes - one

> Layer *layer = new Layer((*it).GetAttribute("name"));
> layer->Parse(*it);
> doc->AddLayer(layer);

Why do you call a parse method outside of Layer? The parse method might be there but it seems to me that giving the constructor the whole DOM node will reduce complexity and since it is implied that the object should use the information during construction to build its internal state - it might as well just start off by parsing the node during the actual construction.
If you had a method that returned the object state (e.g. a wrapper object with a DOM node containing state information) you could then easily throw in a history mechanism for your program (by letting the document object holding the shapes catch and reset states of objects in a sequential manner).

I'm a rookie (only know about Java) but I think it makes sense, and hope it does since I intend to redesign my own Java-based drawing application this way; using XML for serialization syntax and feeding/outputting it directly to/from the objects using it.

From Curt.Arnold at hyprotech.com Wed Dec 8 17:20:17 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun 7 17:18:27 2004
Subject: Schema validation of XSLT, SVG, XPath : Part 1 Proposal for lists
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415563@THOR>

A few weeks ago, Tim Berners-Lee strongly suggested that other XML technologies start using XML Schema. I have reviewed several of the other XML technologies and believe with some minor enhancements, XML Schema can do effective validation of these technologies. I've broken up the necessary modifications into several different messages, so that each can be independently considered and reviewed, however I see all of them as necessary, reasonable and easy to implement with minimal additional effort. I would appreciate any comments.

The following note was written after reviewing XSLT but before reviewing SVG.
SVG makes much more extensive use of lists and so I believe it adds even more compelling justification for the proposal. The datatype draft explicitly defers addressing compound types to the next revision of Schema, however I believe that lists are so essential to validating these significant XML technologies and so generally useful that they should be addressed in the initial recommendation.

In XSLT, there are numerous uses of space separated lists, two of which cannot be addressed with the DTD compatibility NMTOKENS list type. This message identifies them, proposes an additional element for XML Schema Datatypes that would address delimited lists in a minimally disruptive manner that would be generally useful and then presents schema fragments for the XSLT elements. I believe this is a compelling (even demanding) argument for inclusion of list support in the initial version of XML Schema.

1. List usage in XSLT:

b) add to datatype element and dataQual archetype

.... ...

c) Add a couple of new built-in datatypes (though not essential, but generally useful). (These are also replicated in a following comment on additional datatypes.)

1 1 [^:]*

d) remove special narrative about NMTOKENS and IDREFS and redefine NMTOKENS and IDREFS as:

3. Use of list element in XSLT schema

\* .... ....

4. Processing

The following seems a reasonable processing mechanism for list (when separator="," for clarity)

do
    complete production pattern for basetype
    if ignoreWhitespace is true
        match the following regex [&x0A&0x09&0x0D ]*,[&x0A&0x09&0x0D ]*
    else
        match ,
    end if
loop while there is a match

5. Examples of processing

Example a:

Processing any fragment (including the following):

This, is, only, has, one, item, since, nothing, terminates, the, string, production

will return a one item list since nothing terminates the string production.
Example b:

"[^"]*"

Processing the following fragment will result in two items

"I can have my separator (,) in here since","nothing had terminated my production"

The comma in parentheses is not processed as an item separator since it was encountered in the scope of the production pattern for quoted string.

Example c:

Processing the following fragment:

3.1415926, 2.718, 1.414

Would result in a validation error, since the space between the first comma and second number does not match the float production. If the list element had been , then it would return 3 items.

3.1415926,,1.414

Would also be a validation error, since the null string between the two commas does not match the float production.

6. Accessing lists through a type-aware DOM

I definitely think that trying to define how a type-aware DOM would provide access to list data is outside the scope of the schema work. However, it would not appear that adding generic lists would add any new issues to that work project since they would have to address how to provide access to the compatibility lists of NMTOKENS and IDREFS. Their solution to that problem could be as easy as saying that there is no native type support for lists and you can only get the entire string back. However you will have been assured that the string meets your production requirements.

7. Additional burden on schema validation code

I believe the additional burden on validation authors would be minimal since the generic list validation code can replace any IDREFS or NMTOKENS validation code. I would appreciate any comments from the Xerces or other schema parser initiative team on their assessment of the additional development burden.
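The processing loop in section 4 of the proposal might be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: parseList, its parameters, and the use of std::regex to stand in for the base type's production pattern are hypothetical and come from no XML Schema draft or parser.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from any XML Schema draft or library).
// Follows the loop in section 4: match one item with the base type's
// production, then a separator, and repeat while a separator matches.
// Returns false if the text does not validate as a list.
bool parseList(const std::string& text,
               const std::regex& itemPattern,
               bool ignoreWhitespace,
               std::vector<std::string>& items,
               char separator = ',')
{
    typedef std::string::const_iterator Iter;
    items.clear();

    // Skips the whitespace characters named in the proposal: LF, TAB, CR, space.
    auto skipWhitespace = [&text](Iter p) {
        while (p != text.end() &&
               (*p == '\n' || *p == '\t' || *p == '\r' || *p == ' '))
            ++p;
        return p;
    };

    Iter pos = text.begin();
    for (;;) {
        std::smatch m;
        // match_continuous forces the item to start exactly at pos.
        if (!std::regex_search(pos, text.end(), m, itemPattern,
                               std::regex_constants::match_continuous))
            return false;
        items.push_back(m.str());
        pos = m[0].second;

        Iter sep = ignoreWhitespace ? skipWhitespace(pos) : pos;
        if (sep == text.end() || *sep != separator)
            break;                      // no separator: the list ends here
        ++sep;
        pos = ignoreWhitespace ? skipWhitespace(sep) : sep;
    }
    return pos == text.end();           // leftover text is a validation error
}
```

With a simplified float pattern such as [+-]?[0-9]+(\.[0-9]+)? (an assumption, much narrower than a real float production), "3.1415926, 2.718, 1.414" is rejected when ignoreWhitespace is false and yields three items when it is true, while "3.1415926,,1.414" is rejected either way, which is the behaviour Example c describes.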
From donpark at docuverse.com Wed Dec 8 17:24:07 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:27 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com>
Message-ID: <000001bf41a1$2242b380$099918d1@docuverse1>

>This is information for a specific kind of XML processor
>(an indexing robot), but it is not specific to the document
>type. So we need a mechanism that applies to any XML document
>and can be automatically ignored by non-robot processors.
>A PI is an exact fit. Even the name is right -- it is an
>instruction to the robot about how to process it.
>
>The alternative, adding an element to every DTD in the
>universe, with the corresponding breakage to every processor
>that reads those DTDs, is just too awful to contemplate.

If you intend the indexing PI to be used by document creators, it is not unreasonable to expect them to include it in their DTD.

Frankly, this is one of the reasons I do not like to use DTD. I am in favor of adopting the policy of ignoring foreign elements and attributes for extensibility. Absolute ordering of elements also detracts from extensibility. Relative ordering of elements is fine though.

My ideal solution for this problem is a small set of elements that can be embedded into documents.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com
From stele at fxtech.com Wed Dec 8 17:37:14 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun 7 17:18:27 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
References: <19991208172022.4168.qmail@hotmail.com>
Message-ID: <384E9784.24037450@fxtech.com>

> > Layer *layer = new Layer((*it).GetAttribute("name"));
> > layer->Parse(*it);
> > doc->AddLayer(layer);

> Why do you call a parse method outside of Layer? The parse method might be
> there but it seems to me that giving the constructor the whole DOM node will
> reduce complexity and since it is implied that the object should use the
> information during construction to build its internal state - it might as
> well just start off by parsing the node during the actual construction. If

That's a good point. One advantage to doing it this way is your objects do not *necessarily* need to know anything about the actual parsing mechanism. You can call an object parser that parses out the required attributes and entities on behalf of the object, and then call normal object methods to get it into the state you want. Both ways could be utilized trivially.

--
Paul Miller - stele@fxtech.com
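The two approaches discussed above (parsing inside the constructor versus an external parser that drives ordinary object methods) could be contrasted with a small sketch like this one. Everything in it is hypothetical: Element is a bare stand-in for the XML::Element type from the thread, and SelfParsingLayer, PlainLayer and LoadLayer are names invented for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the XML::Element type discussed in the thread.
struct Element {
    std::map<std::string, std::string> attributes;

    // Returns the attribute value, or an empty string if it is absent.
    std::string GetAttribute(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator it = attributes.find(name);
        return it == attributes.end() ? std::string() : it->second;
    }
};

// Approach 1: the object builds its state from the element during construction.
class SelfParsingLayer {
public:
    explicit SelfParsingLayer(const Element& elem)
        : mName(elem.GetAttribute("name")) {}
    const std::string& Name() const { return mName; }
private:
    std::string mName;
};

// Approach 2: the object knows nothing about XML; it only has normal setters.
class PlainLayer {
public:
    void SetName(const std::string& name) { mName = name; }
    const std::string& Name() const { return mName; }
private:
    std::string mName;
};

// An external "object parser" reads the element on the object's behalf.
PlainLayer LoadLayer(const Element& elem) {
    PlainLayer layer;
    layer.SetName(elem.GetAttribute("name"));
    return layer;
}
```

Given the same element, both routes end in the same object state; the difference is only whether the class itself depends on the parsing API.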
From Curt.Arnold at hyprotech.com Wed Dec 8 18:38:02 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun 7 17:18:27 2004
Subject: Schema validation of XSLT, SVG, XPath: Part 2: Multiple Lexical Representation
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415565@THOR>

Sorry about the duplicate (or near duplicate) Part 1 messages. I'm not really sure how that happened.

This part makes some simple modifications that greatly simplify specifying the lexical representation of data types that have several forms. Here are some productions that would be difficult to enforce without the suggested modifications:

NameTest from XPath:

NameTest ::= '*' | NCName ':' '*' | QName

NCName from XML Namespaces:

[4] NCName ::= (Letter | '_') (NCNameChar)*

The SVG path data datatype (datatype of the d attribute)

Proposal:

After reviewing the current datatypes doc, I'm a little confused about what happened to the previous lexicalRepresentation element. The interpretation of pattern and lexical are not adequately discussed. I'm moving more things around than I thought that I would need to, but here goes.

Here are what I think would be reasonable renderings of the previous production patterns.

\* : \* [^:].*[^:] [^:] [^:]* [Mm] [Mm][^Mm]*

... omitted for other SVG productions ...
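To make the "several lexical forms" idea concrete, here is a rough C++ sketch that accepts a value if it matches any one of the three NameTest alternatives. It is an assumption-laden illustration, not the proposal's schema syntax: isNameTest is a hypothetical name, and NCName is narrowed to ASCII letters, digits, '.', '-' and '_' rather than the full Letter and NCNameChar productions.

```cpp
#include <regex>
#include <string>

// Hypothetical checker for the XPath NameTest production:
//   NameTest ::= '*' | NCName ':' '*' | QName
// NCName is simplified to ASCII here, which is an assumption; the real
// production allows a much wider range of characters.
bool isNameTest(const std::string& s)
{
    static const std::string ncName = "[A-Za-z_][A-Za-z0-9._-]*";
    static const std::regex alternatives[] = {
        std::regex("\\*"),                          // '*'
        std::regex(ncName + ":\\*"),                // NCName ':' '*'
        std::regex("(" + ncName + ":)?" + ncName),  // QName
    };
    // The value is valid if the whole string matches any one alternative.
    for (const std::regex& alt : alternatives) {
        if (std::regex_match(s, alt))
            return true;
    }
    return false;
}
```

Under these assumptions "*", "xsl:*" and "xsl:template" are accepted, while ":*" and "a:b:c" are not.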
From wunder at infoseek.com Wed Dec 8 21:38:01 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:18:27 2004
Subject: A processing instruction for robots
In-Reply-To: <000001bf41a1$2242b380$099918d1@docuverse1>
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com>
Message-ID: <3.0.5.32.19991208133726.00b28610@corp.infoseek.com>

At 09:24 AM 12/8/99 -0800, Don Park wrote:
>
>If you intend the indexing PI to be used by document
>creators, it is not unreasonable to expect them to
>include it in their DTD.
>
>Frankly, this is one of the reasons I do not like to
>use DTD. I am in favor of adopting the policy of
>ignoring foreign elements and attributes for extensibility.
>Absolute ordering of elements also detracts from
>extensibility. Relative ordering of elements is fine
>though.
>
>My ideal solution for this problem is a small set of
>elements that can be embedded into documents.

Adding the robots info to every DTD in the world requires unanimous agreement. Adding a PI requires non-interference with other PIs, a vastly simpler task. Waiting for XML to support mixin vocabularies and for those to be widely used, could take a few years. So the element-based approach just doesn't fit my definition of "ideal". That was the approach I originally thought about, but there were just too many obstacles for it to succeed.

So, though the element-based approach might be more comfortable for authors, the robots PI fits both the letter and the intent of PIs in the XML spec and it does the job.
If it is blessed as a standard, it sure would be easy for an XML editor to add a dialog box to generate it. That would be even easier for authors.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/
http://www.best.com/~wunder/
1-408-543-6946

From donpark at docuverse.com Wed Dec 8 22:27:16 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:27 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991208133726.00b28610@corp.infoseek.com>
Message-ID: <000201bf41cb$7ab65740$099918d1@docuverse1>

>Adding the robots info to every DTD in the world requires
>unanimous agreement. Adding a PI requires non-interference
>with other PIs, a vastly simpler task. Waiting for XML
>to support mixin vocabularies and for those to be widely
>used, could take a few years. So the element-based approach
>just doesn't fit my definition of "ideal". That was the
>approach I originally thought about, but there were just
>too many obstacles for it to succeed.

It is the 'interference' notion that I am against because it causes loss of extensibility. This notion creates an imbalance between document creator and subsequent document processors because any change to the document would have to use arcane XML features like PI to avoid 'interfering' with the original document.

Perhaps a more general placeholder standard for tags such as your indexing tag/PI is what we need. For example:

In the DTD, meta:header content is declared to be ANY so that any meta-info tags such as your index tag can be dropped in.
Such a general proposal would have a far better chance of being adopted than a more specific proposal. Once it is adopted, everyone can use it to drop in tags for their own purpose.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From francis at redrice.com Thu Dec 9 00:29:46 1999
From: francis at redrice.com (Francis Norton)
Date: Mon Jun 7 17:18:27 2004
Subject: XML. SAX. Streaming processing with Groves.
References: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii> <011101bf3d53$60df73a0$e5d88dce@WORKGROUP>
Message-ID: <384ADEEF.74C1E4F1@redrice.com>

XPath seems to have missed the boat for DOM level 2. Is there any chance that XPath will be included in level 3? I can see that it doesn't appear to fit in to the roadmap, but as someone who does commercial program-to-program programming I would find not only the basic functionality but access to the neater data model a real aid to productivity.

Francis.

Michael Champion wrote:
> ...
>
> The DOM WG will be defining the requirements for Level 3 over the next 6
> weeks or so. Standard APIs for loading, saving, parsing, and serializing
> XML text are "must have" items for Level 3, and this issue (that an
> application may want access to the elements of a document before it is fully
> parsed) has come up. For example, a programmer might choose not to continue
> parsing some huge document after the necessary data were found.
>
> Concrete suggestions for actual APIs or pointers to APIs that allow this
> would be appreciated.
>
From Mike.Champion at softwareag-usa.com Thu Dec 9 01:05:16 1999
From: Mike.Champion at softwareag-usa.com (Michael Champion)
Date: Mon Jun 7 17:18:27 2004
Subject: XML. SAX. Streaming processing with Groves.
References: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii> <011101bf3d53$60df73a0$e5d88dce@WORKGROUP> <384ADEEF.74C1E4F1@redrice.com>
Message-ID: <009801bf41e1$2e0ffcf0$5dbdb3c7@WORKGROUP>

----- Original Message -----
From: Francis Norton
To: Michael Champion
Cc:
Sent: Sunday, December 05, 1999 4:53 PM
Subject: Re: XML. SAX. Streaming processing with Groves.

> XPath seems to have missed the boat for DOM level 2. Is there any chance
> that XPath will be included in level 3? I can see that it doesn't appear
> to fit in to the roadmap, but as someone who does commercial
> program-to-program programming I would find not only the basic
> functionality but access to the neater data model a real aid to
> productivity.

I agree ... and will forward this suggestion to the DOM working group.
From sunker at telkom.net Thu Dec 9 02:46:53 1999
From: sunker at telkom.net (sunker@telkom.net)
Date: Mon Jun 7 17:18:27 2004
Subject: XSLT with DOM
Message-ID: <61D3A6AB14FED211856500001C055D9633E042@FS01>

Hi all,

I have a problem transforming a DOM document with XSL. In my XML I included a DTD with entity references to a server-side script such as test.asp (I transform it with GenXMLToHTML(xml, xsl)). When the result is displayed in the web browser, it is blank! But when I include the stylesheet directly in the XML file, it works properly. Why? Is it possible that the XML cannot load one process from within another? Or is the XSL processor the problem? For more info, here is the example:

DOM TRANS XML TO HTML:

function GenXMLToHTML(xmlf, xslf) {
    // Load the source document synchronously, without validation.
    var xmlfile = new ActiveXObject("Microsoft.XMLDOM");
    xmlfile.async = false;
    xmlfile.validateOnParse = false;
    xmlfile.load(Server.MapPath(xmlf));

    // Load the stylesheet the same way.
    var xslfile = new ActiveXObject("Microsoft.XMLDOM");
    xslfile.async = false;
    xslfile.validateOnParse = false;
    xslfile.load(Server.MapPath(xslf));

    // Write the result of the transformation to the response.
    Response.Write(xmlfile.transformNode(xslfile));
}

=======================================
XML FILENAME = TEST.XML

]> &port;

=======================================
XSL FILENAME = TEST.XSL

Account
childNumber(this)-
=======================================
ASP FILENAME = TEST.ASP

<%@ Language=JScript %>
<%
Response.ContentType = "text/xml";
msg = '';
for (var i = 0; i < 100; i++) {
    msg += 'Pxc' + i + '';
}
msg += '';
Response.Write(msg);
%>

Thanks,
Sunker
(this is an XML page generated by GenXMLToHTML: http://www.geocities.com/researchtriangle/campus/7211)

From tbray at textuality.com Thu Dec 9 06:04:18 1999
From: tbray at textuality.com (tbray@textuality.com)
Date: Mon Jun 7 17:18:27 2004
Subject: A processing instruction for robots
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com> <3.0.5.32.19991208133726.00b28610@corp.infoseek.com>
Message-ID: <0ab601bf420b$71f94be0$0500a8c0@ned>

From: Walter Underwood
> Adding the robots info to every DTD in the world requires
> unanimous agreement. Adding a PI requires non-interference
> with other PIs, a vastly simpler task. Waiting for XML
> to support mixin vocabularies and for those to be widely
> used, could take a few years.

Walter is right on both counts, but I'm having trouble getting comfortable with his PI idea. Not violently against it, but two things make me uncomfortable. First of all, PIs basically suck. Having said that, if you gotta use them, this is the kind of thing to use them for.

But my big problem is with the idea that individual resources ought to embed robot-steering information. It just feels like the wrong level of granularity. Either this ought to be done externally in something like robots.txt but smarter, at the webmaster/administrator level, or, with a namespaced vocabulary at the individual element level. Note that the external file and the embedded element-level stuff could have the same namespaced vocabulary.
The PI has the characteristic that it *has* to be in the document and can modify *only* the whole document. Also I question the ability of authors to do the right thing with this kind of a macro-level control.

Also I question the ability of robot authors to do the right thing at the individual document level.

In any case, there really should be a namespace with a bunch of predeclared attributes for this purpose; then for those who want to do fancy things, they can do so in a clean way at the individual element level. For those who *don't* want to wire robot stuff into their document structure, but *do* want individual resource-level control and *don't* want to do it in a centralized way, I guess the PI is a tolerable kludge; but it doesn't seem like much more than that.

Anyhow, is there enough XML on the web to make this interesting? Serious question, I don't know the answer. -T.

From ricko at allette.com.au Thu Dec 9 06:11:05 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:27 2004
Subject: A processing instruction for robots
Message-ID: <007601bf420f$a8914aa0$2cf96d8c@NT.JELLIFFE.COM.AU>

From: Don Park

Not Walter wrote:
>>I'm not Walter, but to me this has the obvious advantage that it can
>>be used completely orthogonally to the document contents and the
>>software used to process the document for non-indexing purposes.
>
>IMHO, this line of thinking (aka 'sacred content')
>forces us to use PI or special attributes for
>extension of document instances. Poor use of
>the letter 'X' in XML.
But thinking (methodology) forces us to have a need; a markup language either supports that need or not. Having PIs does not force anyone to use them.

At www.apache.org, the first Cocoon design uses PIs; the second design does not. The comments are interesting and useful for why, but they also make the same mistake of saying that because they have moved to a system complexity where PIs are not needed, therefore PIs are bad; this is despite them using them in their first system. So PIs, at least, provide an alternative that many designers find natural.

Rick Jelliffe

From ricko at allette.com.au Thu Dec 9 06:40:13 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:27 2004
Subject: A processing instruction for robots
Message-ID: <00cd01bf4213$c9c4e840$2cf96d8c@NT.JELLIFFE.COM.AU>

From: tbray@textuality.com

>Walter is right on both counts, but I'm having trouble getting comfortable
>with his PI idea. Not violently against it, but two things make me
>uncomfortable. First of all, PIs basically suck. Having said that, if you
>gotta use them, this is the kind of thing to use them for.

If PIs suck, then perhaps they suck in the same way that using #defines in C++ does or the SQLJ preprocessor does: it can be a sign of insufficient analysis in the whole system (perhaps for legitimate reasons: the need may have emerged over time) or because of habit or to clearly demarcate different processing inputs to simplify subsequent phases or because of a deficiency in the underlying language.

But this is not to allow that PIs suck in the first place.
Actually, to use the C++ analogy, I think PIs correspond to pragmas more than anything.

Rick Jelliffe

From efren at banesto.es Thu Dec 9 09:33:24 1999
From: efren at banesto.es (Efren)
Date: Mon Jun 7 17:18:27 2004
Subject: Manual XML ?
Message-ID: <384F76E0.265E88C8@banesto.es>

Hello everyone, can anyone tell me where I can find a good XML manual?

Thanks

From rev-bob at gotc.com Thu Dec 9 10:28:32 1999
From: rev-bob at gotc.com (rev-bob@gotc.com)
Date: Mon Jun 7 17:18:27 2004
Subject: A processing instruction for robots
Message-ID: <199912090527787.SM01128@Unknown.>

I suppose this is where I'm supposed to come in.... ;)

> > Adding the robots info to every DTD in the world requires
> > unanimous agreement. Adding a PI requires non-interference
> > with other PIs, a vastly simpler task. Waiting for XML
> > to support mixin vocabularies and for those to be widely
> > used, could take a few years.
>
> Walter is right on both counts, but I'm having trouble getting comfortable
> with his PI idea. Not violently against it, but two things make me
> uncomfortable. First of all, PIs basically suck.
> Having said that, if you
> gotta use them, this is the kind of thing to use them for.

Agreed. This is an instruction to a specific class of processor, hence it's a good fit as a PI from that angle.

> But my big problem is with the idea that individual resources ought to embed
> robot-steering information. It just feels like the wrong level of
> granularity. Either this ought to be done externally in something like
> robots.txt but smarter, at the webmaster/administrator level, or, with a
> namespaced vocabulary at the individual element level.

This has been tried. The problem is that the current robots.txt idea just doesn't work for everybody - robots.txt is supposed to reside in the domain's root [1], and not everybody has that access. (Big examples: Geocities, Angelfire, Tripod, AOL....) Granted, a tweak to that specification that would allow local copies of robots.txt to affect their subdirectory tree would be *most* helpful in that regard, but that just doesn't exist.

[1] - See http://info.webcrawler.com/mak/projects/robots/norobots.html under "The Method" header. The filename is "/robots.txt" - which forces the file into the root.

Because of this overwhelming gap, there's a hack in HTML that uses META to granularize this at a per-document level, and a few bots are good about obeying that syntax.

> The PI has the characteristic that it *has* to be in the document and can modify
> *only* the whole document. Also I question the ability of authors to do the right
> thing with this kind of a macro-level control.

I do it all the time. In fact, I have a default value I can specify in my templates. (Yes, I could use a robots.txt file - the current method is a holdover from before I had a domain name for my site.)

> Also I question the ability of robot authors to do the right thing at the individual
> document level.

That's already a current issue.
Bot authors who are conscientious enough to obey the META hack will have no problem modifying their source to obey the XML PI as well; it's a trivial transformation. (Especially if the PI uses syntax that's as close to the META version as possible!)

> In any case, there really should be a namespace with a bunch of predeclared
> attributes for this purpose; then for those who want to do fancy things,
> they can do so in a clean way at the individual element level.

Fine - swipe the existing values and go from there. The fewer changes made, the better - from all viewpoints. Not only will there be fewer deltas for page authors to learn, but bot authors will be better able to just reuse existing META code to accommodate the PI.

Note that I'm not saying that a local robots file wouldn't be a wonderful idea - just that since you currently have only the choices of "global" and "per document" with HTML, you ought to have *at least* those same choices with XML. A local robots.txt would be tasty gravy indeed.

> Anyhow, is there enough XML on the web to make this interesting? Serious
> question, I don't know the answer. -T.

I have enough X(HT)ML up to be very interested in this matter - and there's only going to be more online as the spec progresses. Why not address the issue *before* there's a huge amount of X(HT)ML online, instead of waiting until a few assorted hacks come up?

Rev. Robert L. Hood | http://rev-bob.gotc.com/
Get Off The Cross! | http://www.gotc.com/

Download NeoPlanet at http://www.neoplanet.com
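A minimal sketch of how a robot might honour such a PI, written in Python purely for illustration. The thread excerpted here does not spell out the PI syntax, so the target name "robots" and the pseudo-attributes index="yes|no" and follow="yes|no" are assumptions modelled loosely on the HTML META values; the class and function names are hypothetical as well.

```python
import re
import xml.sax
from xml.sax.handler import ContentHandler

# Assumed pseudo-attribute syntax inside the PI data: name="value"
PSEUDO_ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


class RobotsPIHandler(ContentHandler):
    """Collects index/follow permissions from a hypothetical <?robots ...?> PI."""

    def __init__(self):
        super().__init__()
        # Assumed defaults, as with META: allowed unless the document says otherwise.
        self.index = True
        self.follow = True

    def processingInstruction(self, target, data):
        if target != "robots":
            return  # PIs addressed to other processors are ignored
        attrs = dict(PSEUDO_ATTR.findall(data))
        if "index" in attrs:
            self.index = attrs["index"] == "yes"
        if "follow" in attrs:
            self.follow = attrs["follow"] == "yes"


def robots_policy(document):
    """Return (may_index, may_follow) for an XML document given as bytes."""
    handler = RobotsPIHandler()
    xml.sax.parseString(document, handler)
    return handler.index, handler.follow
```

A robot would call robots_policy() on the fetched bytes before deciding whether to index the document or queue its links. Because the handler returns early for any other target, PIs meant for other processors do not interfere with it.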
From Andy.Bradbury at syntegra.bt.co.uk Thu Dec 9 11:04:57 1999
From: Andy.Bradbury at syntegra.bt.co.uk (Andy.Bradbury@syntegra.bt.co.uk)
Date: Mon Jun 7 17:18:27 2004
Subject: LINK.VBS
Message-ID: <65AF45D5E535D2118AFB0008C7FA2318035A9B01@FL-EXCHANGE-03>

With regard to the LINK.VBS virus, it seems it is another "send yourself to everyone on the current victim's mailing list"-type virus - which means it *could* come from a source that is normally quite unimpeachable.

The message below is a useful follow-up to the original warning:

------------------------------------------------------------------------

If you were to double click on the attachment, I suspect that the VB Script would access files on your hard drive and mail copies of itself to addresses in your Contacts folder.

I did the wrong thing and got a porno site and a desktop icon.

The virus detector detected the virus and deleted

c:\WINDOWS\TEMP\LINK.VBS
c:\WINDOWS\SYSTEM\RUNDLL.VBS

being the only two contaminated files.

If you got this virus then delete the email, delete the desktop icon and delete the above two files and hope it is not anywhere else.

It came from somewhere I would have normally trusted. So be warned.

Regards
Trevor Croll

------------------------------------------------------------------------

Regards

Andy B.
From h.rzepa at ic.ac.uk Thu Dec 9 11:52:24 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 17:18:27 2004
Subject: LISTADMIN: Archive of XML-DEV, 1997-1999
Message-ID:

Several people have reported that the index of XML-DEV found on the page

http://www.lists.ic.ac.uk/hypermail/xml-dev/

is broken. Whilst we endeavour to get this fixed, please note that another index of the forum is at

http://www.xml-cml.org/search.html

Also, I have created a "sherlock" plug-in for this latter search at

http://www.ch.ic.ac.uk/chemime/chemdig/xmldev.src.hqx

On this latter theme, can anyone remind me whether a "channel" of sites with indexed content might have been created by anyone, i.e. it would be useful to search the dozen or so sites related to XML in parallel using a "channel"? Thus the above plug-in would be one component of such a channel.

The down side is that the above plug-in is not cross platform, i.e. it only works with MacOS. Cross platform suggestions for the above (based on XML, which is an obvious way of doing it) are most welcome.

Henry Rzepa.
+44 171 594 5774 (Office) +44 171 594 5804 (Fax)
Dept. Chemistry, Imperial College, London, SW7 2AY, UK.
http://www.ch.ic.ac.uk/rzepa/
From fujisawa at the.canon.co.jp Thu Dec 9 12:06:43 1999
From: fujisawa at the.canon.co.jp (Jun Fujisawa)
Date: Mon Jun 7 17:18:28 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <14407.1389.659881.147338@localhost.localdomain>
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <3.0.32.19991202141224.0148fc60@pop.intergate.ca>
Message-ID:

At 6:49 PM -0500 99.12.2, David Megginson wrote:

> > At 04:27 PM 12/2/99 -0500, David Megginson wrote:
> > Good idea, one question. Any way to do C at the same time? -Tim
>
> Sure -- is there a strong need for a common C interface, though? We
> already have Expat's C interface, and I don't know of anyone else in
> that space yet.

Gnome libxml and Oracle XML Parser for C do have SAX interfaces in C.

Another interesting work is the Simple API for CSS (SAC). The SAC interface is defined both in Java and C. I think the combination of SAX and SAC might be very attractive (especially in the C binding) when developing XML software in resource-constrained environments, such as XHTML Basic user agents.

-- Jun Fujisawa
From donpark at docuverse.com Thu Dec 9 12:46:56 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <0ab601bf420b$71f94be0$0500a8c0@ned>
Message-ID: <001201bf4243$8f7959c0$099918d1@docuverse1>

Tim Bray wrote:
>But my big problem is with the idea that individual resources
>ought to embed robot-steering information. It just feels like
>the wrong level of granularity. Either this ought to be done
>externally in something like robots.txt but smarter, at the
>webmaster/administrator level, or, with a namespaced vocabulary
>at the individual element level.

You are assuming that these resources exist somewhere where an external resource like robots.txt can coexist relative to the target resources. I would much prefer an arrangement where I have the option of embedding, linking, or sequencing.

By sequencing, I mean that document transmission order specifies the relationships between documents. If only a single one-way communication channel is available, then a meta-info document can be sent just ahead of the target document.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com
From david at megginson.com Thu Dec 9 13:33:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:28 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: roddey@us.ibm.com's message of "Tue, 7 Dec 1999 13:40:43 -0700"
References: <87256840.0071AFFD.00@d53mta03h.boulder.ibm.com>
Message-ID:

roddey@us.ibm.com writes:

> 11) The class names (since we can't afford to use C++ namespaces) should be
> expanded to include a SAX prefix to avoid clashes. So SAXParser and
> SAXLocator and SAXAttributeList and so on.

Is it true that C++ namespaces are still a problem on any platform? I know that they actually do work under Windows, and the newer EGCS/GCC have supported them for a while for all *nix variants (including Linux) -- is it the Mac that doesn't have a proper C++ compiler yet?

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
From sdr at camsoft.com Thu Dec 9 14:02:26 1999
From: sdr at camsoft.com (Stewart Rubenstein)
Date: Mon Jun 7 17:18:28 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To:
Message-ID: <003101bf424e$9ed143a0$a66d70c6@camsoft.com>

David Megginson writes:
> is it the Mac that doesn't have a proper C++ compiler yet?

No. The Mac has a great C++ compiler from Metrowerks (now a subsidiary of Motorola).

-Stew

From Curt.Arnold at hyprotech.com Thu Dec 9 15:56:29 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun 7 17:18:28 2004
Subject: nestable C/C++ parser
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415572@THOR>

I think your original request got lost in a side track. It is very possible to do what you want with expat. The trick is the use of XML_SetUserData and the userData argument.

Basically, the trick is to create a base class that has the methods whose behavior you want to change (typically, StartElement and EndElement). Create derived classes for each different behavior that you want. Call XML_SetUserData with the initial handler object. In your StartElement callback, cast the userData argument to a pointer to your base class and call its startElement virtual method.
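The pattern in this message can be sketched with Python's binding to expat (xml.parsers.expat), chosen here only because it is easy to run; it is not the C API the message refers to. Python has no XML_SetUserData, so the Context object below is an assumed stand-in for the user-data pointer: the two callbacks are registered once and simply forward to whichever handler the context currently holds. All class names and the <meta> element are hypothetical.

```python
import xml.parsers.expat


class Handler:
    """Base class: one overridable method per parser callback."""

    def start_element(self, ctx, name, attrs):
        pass

    def end_element(self, ctx, name):
        pass


class BodyHandler(Handler):
    """Records elements seen outside any <meta> subtree."""

    def start_element(self, ctx, name, attrs):
        if name == "meta":
            # Swap behaviour: the analogue of pointing the user data
            # at a different handler object.
            ctx.handler = MetaHandler(parent=self)
        else:
            ctx.log.append(("body", name))


class MetaHandler(Handler):
    """Records elements inside <meta>, then hands control back."""

    def __init__(self, parent):
        self.parent = parent
        self.depth = 0  # nesting depth below <meta>

    def start_element(self, ctx, name, attrs):
        self.depth += 1
        ctx.log.append(("meta", name))

    def end_element(self, ctx, name):
        if self.depth == 0:
            ctx.handler = self.parent  # this is </meta> itself
        else:
            self.depth -= 1


class Context:
    """Plays the role of expat's user-data pointer."""

    def __init__(self, handler):
        self.handler = handler
        self.log = []


def parse(text):
    ctx = Context(BodyHandler())
    parser = xml.parsers.expat.ParserCreate()
    # Registered once; they only forward to the current handler.
    parser.StartElementHandler = lambda n, a: ctx.handler.start_element(ctx, n, a)
    parser.EndElementHandler = lambda n: ctx.handler.end_element(ctx, n)
    parser.Parse(text, True)
    return ctx.log
```

Assigning a new object to ctx.handler changes the behavior of both callbacks at once without re-registering anything with the parser, which is the point of routing everything through one mutable pointer.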
If you want to change the handler, make another call to XML_SetUserData.

From sb at metis.no Thu Dec 9 16:12:56 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun 7 17:18:28 2004
Subject: SAX and
In-Reply-To: Lars Marius Garshol's message of "07 Dec 1999 10:46:35 +0100"
References:
Message-ID:

>>>>> Lars Marius Garshol :

> It depends on the situation. In the XSA client, which needs to
> accept both XSA and OSD documents, but can't tell them apart before
> parsing begins, uses a DispatchingDocHandler, which has a hash of
> DocumentHandlers keyed on the name of the document element. In this
> very restricted case that worked just fine.
> In other cases one might perhaps key on the namespace of the
> document element, and with SAX 2 one could use the public identifier
> of the DOCTYPE declaration.

I thought of something like this, but I couldn't decide on whether to use a DocumentHandler or use a buffering handler that would just buffer up the text and try matching in the buffered text until it found something to dispatch on, sending all the buffered text as the initial input to the XML parser.

It looks like dispatching from a DocumentHandler is the best idea, but then I need to be able to queue up SAX DocumentHandler events to send to the actual DocumentHandler when I start it.

Hm... maybe a clone() function and a virtual destructor are in order for the C++ AttributeList class...?
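A dispatching handler of the kind discussed above might look like this in Python's xml.sax, used here as a stand-in for the Java and C++ SAX bindings under discussion. It keys on the document element name and queues whatever arrives before that element (startDocument and any prolog processing instructions), then replays the queue into the chosen handler. The root names "xsa" and "softpkg" and the Recorder class are illustrative assumptions, not taken from the XSA client.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class DispatchingHandler(ContentHandler):
    """Picks the real handler from the root element name, replaying queued events."""

    def __init__(self, handlers):
        super().__init__()
        self.handlers = handlers  # root element name -> ContentHandler
        self.target = None        # chosen when the root element arrives
        self.queue = []           # (method name, args) seen before the choice

    def _forward(self, method, *args):
        if self.target is None:
            self.queue.append((method, args))
        else:
            getattr(self.target, method)(*args)

    def startDocument(self):
        self._forward("startDocument")

    def processingInstruction(self, target, data):
        self._forward("processingInstruction", target, data)

    def startElement(self, name, attrs):
        if self.target is None:
            # Raises KeyError for an unknown root element.
            self.target = self.handlers[name]
            for method, args in self.queue:
                getattr(self.target, method)(*args)
            self.queue = []
        self.target.startElement(name, attrs)

    def endElement(self, name):
        self._forward("endElement", name)

    def characters(self, content):
        self._forward("characters", content)

    def endDocument(self):
        self._forward("endDocument")


class Recorder(ContentHandler):
    """Toy downstream handler that records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def processingInstruction(self, target, data):
        self.events.append("pi:" + target)

    def startElement(self, name, attrs):
        self.events.append("start:" + name)

    def endElement(self, name):
        self.events.append("end:" + name)

    def endDocument(self):
        self.events.append("endDocument")
```

Because the choice is made at the first startElement, no attribute list ever sits in the queue in this sketch, so nothing has to be copied. A dispatcher that needed to look deeper into the document before deciding would have to copy attribute lists before queueing them.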
From xml at waena.edu Thu Dec 9 16:28:43 1999
From: xml at waena.edu (xml)
Date: Mon Jun 7 17:18:28 2004
Subject: NDATA,XPointer, XLink Confusion
Message-ID: <000701bf4262$603864c0$1d5afea9@adtech.internet.ibm.com>

Hi all,

This is a newbie-ish question.

My servlet accepts XML files which have in them CDATA of base64 encoded binary data (images, sounds, movies, etc). I take that CDATA, extract it and save it as a separate file in the filesystem, but now I need to add an NDATA statement that points to that file. Perhaps I should be using XPointers or XLinks?

In any case, the element that holds the CDATA, called is of CDATA type. I can change that to NDATA and plug in the reference, but do I have to declare the actual NDATA in the DTD? I can't do this (one DTD for many files, all of which contain different binary data).

If I use XPointers or XLinks, do XML parsers automatically insert the binary data they point to? If so, why would anyone use NDATA to point to external binary files?

Thanks
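The extraction step described in this message can be sketched as follows, in Python rather than a servlet. The element name "payload", the output file naming, and the "href" attribute are all hypothetical (the poster's element name did not survive in the archive). The plain attribute is used only to keep the sketch neutral; it is not an answer to the NDATA versus XLink question.

```python
import base64
import os
import xml.etree.ElementTree as ET


def externalize(xml_text, out_dir, tag="payload"):
    """Save each base64 payload to a file and leave a reference in its place."""
    root = ET.fromstring(xml_text)
    for number, elem in enumerate(root.iter(tag)):
        # CDATA sections arrive as ordinary element text.
        data = base64.b64decode(elem.text or "")
        path = os.path.join(out_dir, "payload%d.bin" % number)
        with open(path, "wb") as out:
            out.write(data)
        elem.text = None            # drop the inline data
        elem.set("href", path)      # hypothetical reference attribute
    return ET.tostring(root, encoding="unicode")
```

The function returns the rewritten document as a string, with each payload element emptied and pointing at the file that now holds its bytes.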
From wunder at infoseek.com Thu Dec 9 17:48:32 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <0ab601bf420b$71f94be0$0500a8c0@ned>
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com> <3.0.5.32.19991208133726.00b28610@corp.infoseek.com>
Message-ID: <3.0.5.32.19991209094748.00aad3b0@corp.infoseek.com>

At 04:54 PM 12/8/99 -0800, tbray@textuality.com wrote:
>
>But my big problem is with the idea that individual resources ought
>to embed robot-steering information. It just feels like the wrong
>level of granularity.

For really picky indexing and searching, it is wrong. Structural markup opens up some really nice possibilities. An indexer might weight the bibliography less and the abstract more, for example.

But that sort of tweakiness changes for each search engine. So I'd implement that as a DTD-specific configuration in each engine, rather than trying to add processor-specific markup to each document. In fact, I already implemented it that way. Use the structure, Luke.

On the plus side, XML tends to be content-rich, without navbars and decoration. This means that you get better quality results without resorting to tweaks. For example, you can actually search for "Home", "Copyright", or "Help" and get relevant results.

>... The PI has the characteristic that it *has* to be in
>the document and can modify *only* the whole document. Also I
>question the ability of authors to do the right thing with this
>kind of a macro-level control.
>Also I question the ability of robot
>authors to do the right thing at the individual document level.

I'm willing to trust the authors and webmasters. There are a lot of professionals out there. As for robot authors, if the robots PI semantics are the same as the HTML robots meta tag semantics, it should be pretty easy to get right. If they are different, all bets are off.

>Anyhow, is there enough XML on the web to make this interesting?
>Serious question, I don't know the answer. -T.

We're seeing XML-backed websites where they want to index the XML, but serve URLs pointing to the formatted HTML.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/
http://www.best.com/~wunder/
1-408-543-6946

From wunder at infoseek.com Thu Dec 9 17:54:46 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <00cd01bf4213$c9c4e840$2cf96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <3.0.5.32.19991209095252.00b32350@corp.infoseek.com>

At 03:05 PM 12/9/99 +0800, Rick Jelliffe wrote:
>
>But this is not to allow that PIs suck in the first place.
>
>Actually, to use the C++, I think PIs correspond to pragmas more than
>anything.

Exactly. Obviously, we need to add "#notation" to the preprocessor.

wunder
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946
From dwshin at nlm.nih.gov Thu Dec 9 18:59:32 1999
From: dwshin at nlm.nih.gov (Dongwook Shin)
Date: Mon Jun 7 17:18:28 2004
Subject: A processing instruction for robots
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com> <3.0.5.32.19991208133726.00b28610@corp.infoseek.com> <3.0.5.32.19991209094748.00aad3b0@corp.infoseek.com>
Message-ID: <384FF7F3.1E32EDB@nlm.nih.gov>

Walter Underwood wrote:

> At 04:54 PM 12/8/99 -0800, tbray@textuality.com wrote:
> >
> >But my big problem is with the idea that individual resources ought
> >to embed robot-steering information. It just feels like the wrong
> >level of granularity.
>
> For really picky indexing and searching, it is wrong.
> Structural markup opens up some really nice possibilities.
> An indexer might weight the bibliography less and the
> abstract more, for example.
>
> But that sort of tweakiness changes for each search engine.
> So I'd implement that as a DTD-specific configuration in
> each engine, rather than trying to add processor-specific
> markup to each document. In fact, I already implemented it
> that way. Use the structure, Luke.
>

If you look at XRS (XML retrieval system), you will find that a user can give one element a bigger weight than another. This kind of weighting is more flexible than weighting done by the indexer.
Check XRS Web demonstration system: http://dlb2.nlm.nih.gov/~dwshin/xrs.html

Dongwook
--
Dongwook Shin
Visiting Scholar
Lister Hill National Center for Biomedical Communications
National Library of Medicine, 8600 Rockville Pike Bethesda 20894, MD
E-mail: dwshin@nlm.nih.gov Tel: (301) 435-3257 FAX: (301) 480-3035
URL: http://dlb2.nlm.nih.gov/~dwshin

From wunder at infoseek.com Thu Dec 9 20:05:26 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <384FF7F3.1E32EDB@nlm.nih.gov>
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com> <3.0.5.32.19991208133726.00b28610@corp.infoseek.com> <3.0.5.32.19991209094748.00aad3b0@corp.infoseek.com>
Message-ID: <3.0.5.32.19991209120409.00b364b0@corp.infoseek.com>

At 01:41 PM 12/9/99 -0500, Dongwook Shin wrote:
>Walter Underwood wrote:
>> Structural markup opens up some really nice possibilities.
>> An indexer might weight the bibliography less and the
>> abstract more, for example.
>
>If you see XRS (XML retrieval system), you can find that a user
>can give a bigger weight to an element than to another. This
>kind of weighting is more flexible than those by indexer.
>Check XRS Web demonstration system:
>http://dlb2.nlm.nih.gov/~dwshin/xrs.html

I think you are suggesting that weighting and selection should be done at query time instead of at index time. That is a design tradeoff for the search engine. But the detailed weighting and selection belong *somewhere* in the search engine rather than in every single document.
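
To make the tradeoff concrete, here is a minimal Python sketch of weights that live in the search engine rather than in the documents. Everything in it is an assumption for illustration: the element names, the weight values, and the score function are hypothetical and are not taken from XRS or from any Infoseek product. The same function covers both designs: the default table plays the role of an index-time configuration, and a caller can pass a different table at query time.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-DTD weight table kept in the engine's configuration.
ELEMENT_WEIGHTS = {"abstract": 3.0, "body": 1.0, "bibliography": 0.25}

def score(xml_text, term, weights=ELEMENT_WEIGHTS):
    """Sum weighted occurrences of term over the top-level sections."""
    root = ET.fromstring(xml_text)
    total = 0.0
    for section in root:
        words = "".join(section.itertext()).lower().split()
        # Elements missing from the table get a neutral weight of 1.0.
        total += weights.get(section.tag, 1.0) * words.count(term.lower())
    return total

DOC = (
    "<article>"
    "<abstract>XML search with structure</abstract>"
    "<body>Indexing XML documents is useful.</body>"
    "<bibliography>XML 1.0 Recommendation</bibliography>"
    "</article>"
)

# Engine-configured weights (the index-time style of design).
default_score = score(DOC, "xml")
# Weights supplied by an expert user with the query (the query-time style).
custom_score = score(DOC, "xml", {"abstract": 1.0, "bibliography": 5.0})
print(default_score, custom_score)
```

Either way the documents carry no indexing hints of their own; only the structure is used.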
I can imagine a system where each document had indexing hints scattered throughout the structure, but I can't imagine anyone having the time or knowledge to do a good job with all that markup. We have enough trouble getting people to replace "Untitled Document" in the <title> element in HTML.

wunder
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946

From costello at mitre.org Thu Dec 9 20:59:29 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun 7 17:18:28 2004
Subject: XSLT Question: Inserting a DOCTYPE decl
Message-ID: <38501883.479847CE@mitre.org>

Hi Folks,

I have a situation where I have many XML documents that do not contain a DOCTYPE declaration, and would like to write a stylesheet that inserts a declaration within the documents. The interesting aspect of this problem is that each XML document contains within it an element which gives the name of the DTD file. So, the declaration should use the value of that element as the name for the DTD file.

For example, here's a sample XML document into which I need to insert a DOCTYPE declaration:

<?xml version="1.0"?>
<Numbers>
  <DoctypeFile>Number.dtd</DoctypeFile>
  <Number>27</Number>
  <Number>34</Number>
  <Number>18</Number>
  <Number>67</Number>
  <Number>99</Number>
  <Number>16</Number>
</Numbers>

Note the DoctypeFile element, which indicates the name of the DTD file.
The stylesheet should insert the declaration, resulting in an XML document such as this:

<?xml version="1.0"?>
<!DOCTYPE Numbers SYSTEM "Number.dtd">
<Numbers>
  <DoctypeFile>Number.dtd</DoctypeFile>
  <Number>27</Number>
  <Number>34</Number>
  <Number>18</Number>
  <Number>67</Number>
  <Number>99</Number>
  <Number>16</Number>
</Numbers>

Here's the stylesheet that I wrote to do this task:

<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <xsl:variable name="doctype">
    <xsl:value-of select="//DoctypeFile"/>
  </xsl:variable>

  <xsl:output method="xml" doctype-system="string($doctype)"/>

  <xsl:template match="*|@*|comment()|processing-instruction()|text()">
    <xsl:copy>
      <xsl:apply-templates select="*|@*|comment()|processing-instruction()|text()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

A pretty simple stylesheet: create a variable which gets the value of the DoctypeFile element, instruct the xsl:output element to output a DOCTYPE declaration using the value of the variable as the name of the DTD file, and then do a copy operation on the input XML document.

Here is the XML file that I get when this example is run through XT (Lotus XSL gives the same results):

<?xml version="1.0"?>
<!DOCTYPE Numbers SYSTEM "string($doctype)">
<Numbers>
  <DoctypeFile>Number.dtd</DoctypeFile>
  <Number>27</Number>
  <Number>34</Number>
  <Number>18</Number>
  <Number>67</Number>
  <Number>99</Number>
  <Number>16</Number>
</Numbers>

Note that the XSL processor did not evaluate the expression that I used in the xsl:output's doctype-system attribute. Instead, it used the expression literally. Thus, here are my questions:

(1) Is this a bug in XT and Lotus XSL?
(2) I suspect it isn't a bug, in which case can someone think of another way to solve this problem?

/Roger
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stele at fxtech.com Thu Dec 9 21:26:45 1999 From: stele at fxtech.com (Paul Miller) Date: Mon Jun 7 17:18:28 2004 Subject: nestable C/C++ parser References: <61DAD58E8F4ED211AC8400A0C9B46873415572@THOR> Message-ID: <38501ED0.B8B4F9FD@fxtech.com> > I think your original request got lost in a side track. If is very possible > to do what you want with expat. The trick is the use of the XML_SetUserData > and the userdata argument. Basically, the trick is to create a base class > that has methods that you want to change the behavior of (typically, > StartElement and EndElement). Create derived classes for each different > behavior that you want. Call XML_SetUserData to the initial handler object. > In your StartElement callback, cast the userdata argument up to a pointer to > your base class and call its startElement virtual method. If you want to > change the handler, make another call to XML_SetUserData. The problem here is it requires 3 callbacks to parse a single element and restore the state. I'd like a cleaner solution. -- Paul Miller - stele@fxtech.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Veeraraghavan.Srinivasan at iac.honeywell.com Thu Dec 9 21:49:10 1999 From: Veeraraghavan.Srinivasan at iac.honeywell.com (Srinivasan, Veeraraghavan (AZ15)) Date: Mon Jun 7 17:18:28 2004 Subject: VBScript error Message-ID: <5D0478DD31B2D21194E90090273C41D3AD176E@az15m06.iac.honeywell.com> Hi all, I know this is off-topic. I am using Microsoft Script control to execute VB and Java scripts. This is to provide programmatic interface to execute scripts in application programs. When I load the following script using AddCode method (method on script control) in Java (VJ++ environment), I get an error that says "Expected end of statement". When I investigated into the cause of the error, I figured out that the line having "CreateObject" is causing the problem. When I removed the line, i did not get any errors/exceptions. Also, I observed that when I use createObject method inside a Subroutine I do not get any errors/exceptions. Does anybody have any idea on what I'm missing or point to appropriate resources (Is there a mailing list for Microsoft Script control?). 
Environment : Windows NT 4.0 SP5, IE 5.0 , VJ++ 6.0 Code snippet: function getNames(folder,subfolders) set fso = CreateObject("Scripting.FileSystemObject") set fld = fso.GetFolder(folder) i = 0 if subfolders then ReDim filenames(fld.subfolders.count-1,1) for each f in fld.subfolders filenames(i,0) = f.name filenames(i,1) = f.datecreated i = i + 1 next else ReDim filenames(fld.files.count-1,1) i = 0 for each f in fld.files filenames(i,0) = f.name filenames(i,1) = f.size i = i + 1 next end if getNames = filenames end function Honeywell Veeraraghavan Srinivasan Senior Principal Engineer Honeywell Hi-Spec Solutions 1280, Kemper Meadow Drive, Cincinnati, OH 45240 Phone: (513) 595-8913 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From AXu at epnet.com Thu Dec 9 21:59:49 1999 From: AXu at epnet.com (Amanda Xu) Date: Mon Jun 7 17:18:28 2004 Subject: A processing instruction for robots Message-ID: <E11wBbG-0005Yl-00@romeo.ic.ac.uk> Do you expect the end-user to understand term weighting techniques as well as the structure of an XML document? Elephant -----Original Message----- From: Walter Underwood [mailto:wunder@infoseek.com] Sent: Thursday, December 09, 1999 3:04 PM To: Dongwook Shin Cc: 'XML developers' list' Subject: Re: A processing instruction for robots At 01:41 PM 12/9/99 -0500, Dongwook Shin wrote: >Walter Underwood wrote: >> Structural markup opens up some really nice possibilities. >> An indexer might weight the bibliography less and the >> abstract more, for example. > >If you see XRS (XML retrieval system), you can find that a user >can give a bigger weight to an element than to another. 
This >kind of weighting is more flexible than those by indexer. >Check XRS Web demonstration system: >http://dlb2.nlm.nih.gov/~dwshin/xrs.html I think you are suggesting that wighting and selection should be done at query time instead of at index time. That is a design tradeoff for the search engine. But the detailed weighting and selection belong *somewhere* in the search engine rather than in every single document. I can imagine a system where each document had indexing hints scattered throughout the structure, but I can't imagine anyone having the time or knowledge to do a good job with all that markup. We have enough trouble getting people to replace "Untitled Document" in the <title> element in HTML. wunder -- Walter R. Underwood Senior Staff Engineer Infoseek Software GO Network, part of The Walt Disney Company wunder@infoseek.com http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From KenNorth at email.msn.com Thu Dec 9 22:11:00 1999 From: KenNorth at email.msn.com (KenNorth) Date: Mon Jun 7 17:18:28 2004 Subject: VBScript error References: <5D0478DD31B2D21194E90090273C41D3AD176E@az15m06.iac.honeywell.com> Message-ID: <000601bf4291$ff60fa60$0b00a8c0@grissom> From: Srinivasan, Veeraraghavan (AZ15) > I know this is off-topic. > Is there a mailing list for Microsoft Script control?. For a discussion of scripting issues, try these newsgroups: microsoft.public.inetexplorer.ie4.scripting microsoft.public.inetexplorer.scripting Server: msnews.microsoft.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dwshin at nlm.nih.gov Thu Dec 9 22:17:12 1999 From: dwshin at nlm.nih.gov (Dongwook Shin) Date: Mon Jun 7 17:18:28 2004 Subject: A processing instruction for robots References: <199912092159.QAA22301@nes.nlm.nih.gov> Message-ID: <38502645.F62127BD@nlm.nih.gov> Amanda Xu wrote: > Do you expect the end-user to understand > term weighting techniques as well as the > structure of an XML document? > > Elephant > It depends. If a user is somewhat aware of the document structure, then he may be able to give the weight. If not, he results in totally depending on the searching strategy of the search system. 
So, the term weighting on the fly is an option that an expert is able to use for better precision. At the same time, it does not make it harder for novice users. Dongwook Dongwook -- Dongwook Shin Visiting Scholar Lister Hill National Center for Biomedical Communications National Library of Medicine, 8600 Rockville Pike Bethesda 20894, MD E-mail: dwshin@nlm.nih.gov Tel: (301) 435-3257 FAX: (301) 480-3035 URL: http://dlb2.nlm.nih.gov/~dwshin xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Thu Dec 9 22:25:43 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:18:28 2004 Subject: XML parser in Javascript for RDF app? (feasible?) In-Reply-To: <5D0478DD31B2D21194E90090273C41D3AD176E@az15m06.iac.honeywell.com> Message-ID: <Pine.GHP.4.21.9912092200150.29718-100000@mail.ilrt.bris.ac.uk> XMLists, Some time ago I remember a posting about there being an XML parser (presumable an SMLish subset?) available that was written in Javascript. Perhaps I hallucinated this! If not, I'd love to know where this work is up to, whether available opensource etc. Context: We have some RDF query / logic demos now in Javascript, that work in many pre-XML Javascript environments. The current implementation uses a simple text representation of RDF data graphs instead of an XML serialisation - I'm thinking it *might* be possible to actually parse serialised graphs from XML in Javascript, using one of the various XML graph serialisation syntaxes (RDF, BizTalk etc etc). Hence the interest in an XML parser in Javascript... 
I'm confident we can show simple RDF query and inference stuff clientside in Javascript, eg. for decision support apps. What I'm worried about is syntax, ie. prospects for parsing data graphs from XML clientside in 100% Javascript. (For the curious, this is based on Jan Grant's cute Javascript/Prolog hack, http://rdf.desire.org/~cmjg/test/prolog.html -- I just glued it together for rdf and made up the examples.) There's an installation running as a part of a discussion doc I put together as background context on RDF's origins... see: js rdf query demo: http://www.w3.org/1999/11/11-WWWProposal/rdfqdemo.html which is part of: http://www.w3.org/1999/11/11-WWWProposal/ thanks for any tips on the XML/js front, cheers, Dan ps. bug reports offlist please! this stuff doesn't run everywhere yet... pps. non-XML data fragment follows. Clearly I'm using the wrong kinds of brackets; suggestions welcomed... curly braces indicate a URI in the current hack..., ie we have: {relation-type-URI} ({objectURI}, {value} ). We can parse this stuff in Javascript. I'd rather use XML instead but am not sure if this is feasible... (bait for the SMLers... ;-) Excerpts from: http://www.w3.org/1999/11/11-WWWProposal/rdfqdemo.html {http://www.w3.org/1999/02/22-rdf-syntax-ns#type} ({http://www.w3.org/History/1989/proposal.html} ,{http://www.w3.org/1999/11/11-WWWProposal/vocab.rdf#Document}). {http://purl.org/dc/elements/1.0/Title} ({http://www.w3.org/History/1989/proposal.html} , "Information Management: A Proposal"). {http://www.w3.org/1999/02/22-rdf-syntax-ns#type} ({http://www.w3.org/People/all#timbl%40w3.org} ,{http://www.w3.org/1999/11/11-WWWProposal/vocab.rdf#Person}). {http://purl.org/dc/elements/1.0/Creator} ({http://www.w3.org/History/1989/proposal.html} ,{http://www.w3.org/People/all#timbl%40w3.org}). {http://purl.org/dc/elements/1.0/Description} ({http://www.w3.org/History/1989/proposal.html} , "This proposal concerns the [...etc] "). xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wunder at infoseek.com Thu Dec 9 22:47:53 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:18:28 2004 Subject: A processing instruction for robots In-Reply-To: <199912092159.NAA18412@postman.infoseek.com> Message-ID: <3.0.5.32.19991209144458.00c168c0@corp.infoseek.com> At 04:51 PM 12/9/99 -0500, Amanda Xu wrote: >Do you expect the end-user to understand >term weighting techniques as well as the >structure of an XML document? > >Elephant Certainly not. We're lucky if search engine users type two-word queries. wunder >-----Original Message----- >From: Walter Underwood [mailto:wunder@infoseek.com] >Sent: Thursday, December 09, 1999 3:04 PM >To: Dongwook Shin >Cc: 'XML developers' list' >Subject: Re: A processing instruction for robots > > >At 01:41 PM 12/9/99 -0500, Dongwook Shin wrote: >>Walter Underwood wrote: >>> Structural markup opens up some really nice possibilities. >>> An indexer might weight the bibliography less and the >>> abstract more, for example. >> >>If you see XRS (XML retrieval system), you can find that a user >>can give a bigger weight to an element than to another. This >>kind of weighting is more flexible than those by indexer. >>Check XRS Web demonstration system: >>http://dlb2.nlm.nih.gov/~dwshin/xrs.html > >I think you are suggesting that wighting and selection >should be done at query time instead of at index time. >That is a design tradeoff for the search engine. But the >detailed weighting and selection belong *somewhere* in >the search engine rather than in every single document. 
> >I can imagine a system where each document had indexing >hints scattered throughout the structure, but I can't >imagine anyone having the time or knowledge to do a good >job with all that markup. We have enough trouble getting >people to replace "Untitled Document" in the <title> element >in HTML. > >wunder >-- >Walter R. Underwood >Senior Staff Engineer >Infoseek Software >GO Network, part of The Walt Disney Company >wunder@infoseek.com >http://software.infoseek.com/cce/ (my product) >http://www.best.com/~wunder/ >1-408-543-6946 > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN >981-02-3594-1 >To unsubscribe, mailto:majordomo@ic.ac.uk the following message; >unsubscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following >message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/ http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From macherius at darmstadt.gmd.de Thu Dec 9 22:53:44 1999 From: macherius at darmstadt.gmd.de (Ingo Macherius) Date: Mon Jun 7 17:18:28 2004 Subject: XML parser in Javascript for RDF app? (feasible?) 
In-Reply-To: <Pine.GHP.4.21.9912092200150.29718-100000@mail.ilrt.bris.ac.uk> References: <5D0478DD31B2D21194E90090273C41D3AD176E@az15m06.iac.honeywell.com> Message-ID: <199912092253.XAA13844@sonne.darmstadt.gmd.de> Dan Brickley <Daniel.Brickley@bristol.ac.uk> wrote at 9 Dec 99, 22:24: > Some time ago I remember a posting about there being an XML parser > (presumable an SMLish subset?) available that was written in > Javascript. Perhaps I hallucinated this! If not, I'd love to know where > this work is up to, whether available opensource etc. http://www.jeremie.com/Dev/XML/index.jer ++im -- Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882 GMD-IPSI German National Research Center for Information Technology mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/ Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Thu Dec 9 23:27:36 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:18:28 2004 Subject: XML parser in Javascript for RDF app? (feasible?) In-Reply-To: <199912092253.XAA13844@sonne.darmstadt.gmd.de> Message-ID: <Pine.GHP.4.21.9912092315530.29718-100000@mail.ilrt.bris.ac.uk> On Thu, 9 Dec 1999, Ingo Macherius wrote: > Dan Brickley <Daniel.Brickley@bristol.ac.uk> wrote at 9 Dec 99, 22:24: > > > Some time ago I remember a posting about there being an XML parser > > (presumable an SMLish subset?) available that was written in > > Javascript. Perhaps I hallucinated this! 
If not, I'd love to know where
> > this work is up to, whether available opensource etc.
>
> http://www.jeremie.com/Dev/XML/index.jer

Aha, thanks. Claims to be fairly complete, and doesn't barf on my sample XML/RDF files. Am now wondering whether the companion XSL engine http://www.jeremie.com/Dev/XSL/ (if updated) would work as a way of transforming (a certain style of) xml serialised graph back into rdfesque queryable structures. I'll have a play around...

cheers,
Dan

From donpark at docuverse.com Thu Dec 9 23:54:39 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991209144458.00c168c0@corp.infoseek.com>
Message-ID: <000301bf42a0$de2301e0$d1940e18@smateo1.sfba.home.com>

>Certainly not. We're lucky if search engine users
>type two-word queries.

Have you tried putting up two input boxes instead of one long one? A bit of GUI trickery can sometimes do wonders.

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From costello at mitre.org Fri Dec 10 12:36:40 1999
From: costello at mitre.org (Roger L.
Costello)
Date: Mon Jun 7 17:18:28 2004
Subject: Answer: XSLT Question: Inserting a DOCTYPE decl
References: <93CB64052F94D211BC5D0010A800133101FDE876@wwmess3.bra01.icl.co.uk>
Message-ID: <3850F405.7B491F7D@mitre.org>

Hi Folks,

The solution to the problem that I posed, writing a stylesheet which inserts a DOCTYPE declaration into the input XML document where the name of the DTD file is the text value of an element in that document, is listed below. Thanks to all those who made suggestions, particularly Michael Kay, who created the solution below.

DoctypeInserter.xsl
------------------------------------------------------------------
<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <xsl:variable name="doctype">
    <xsl:value-of select="//DoctypeFile"/>
  </xsl:variable>

  <xsl:variable name="rootnode">
    <xsl:value-of select="name(*)"/>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:variable name="quote">"</xsl:variable>
    <xsl:value-of disable-output-escaping="yes"
        select="concat('&lt;!DOCTYPE ', $rootnode, ' SYSTEM ', $quote, $doctype, $quote, '>')"/>
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="*|@*|comment()|processing-instruction()|text()">
    <xsl:copy>
      <xsl:apply-templates select="*|@*|comment()|processing-instruction()|text()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
------------------------------------------------------------------

As you can see, the trick was not to use xsl:output at all. Instead, the DOCTYPE declaration must be built "by hand". After the DTD file name is found (and stored in doctype), the name of the root element is found (and stored in rootnode). In the template rule for the document, before any of the XML elements are copied, the DOCTYPE declaration is constructed by concatenating together the various components.
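
For readers working outside XSLT, the same by-hand construction can be sketched in Python. This is only an illustration under stated assumptions, not part of the posted solution: the function name insert_doctype is hypothetical, it relies on the standard library's xml.etree.ElementTree, and unlike the stylesheet it re-serialises the document, so comments and processing instructions are dropped.

```python
import xml.etree.ElementTree as ET

def insert_doctype(xml_text):
    """Return xml_text with a DOCTYPE built from its <DoctypeFile> element.

    Hypothetical helper: the DOCTYPE name is the root element's tag and the
    system identifier is the text of the first DoctypeFile descendant.
    """
    root = ET.fromstring(xml_text)
    dtd = root.findtext(".//DoctypeFile")
    if dtd is None:
        # Nothing to go on; leave the document untouched.
        return xml_text
    doctype = '<!DOCTYPE %s SYSTEM "%s">' % (root.tag, dtd.strip())
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0"?>\n' + doctype + "\n" + body

SAMPLE = (
    '<?xml version="1.0"?>\n'
    "<Numbers>\n"
    "  <DoctypeFile>Number.dtd</DoctypeFile>\n"
    "  <Number>27</Number>\n"
    "</Numbers>"
)

result = insert_doctype(SAMPLE)
print(result)
```

As in the stylesheet, the declaration is assembled by string concatenation and written ahead of the copied root element.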
With this XML document as input:

<?xml version="1.0"?>
<Numbers>
  <DoctypeFile>Number.dtd</DoctypeFile>
  <Number>27</Number>
  <Number>34</Number>
  <Number>18</Number>
  <Number>67</Number>
  <Number>99</Number>
  <Number>16</Number>
</Numbers>

The result of running it through the above stylesheet is:

<?xml version="1.0"?>
<!DOCTYPE Numbers SYSTEM "Number.dtd">
<Numbers>
  <DoctypeFile>Number.dtd</DoctypeFile>
  <Number>27</Number>
  <Number>34</Number>
  <Number>18</Number>
  <Number>67</Number>
  <Number>99</Number>
  <Number>16</Number>
</Numbers>

Which is, of course, exactly what I wanted! Thanks again.

/Roger

From allan.kelly at reuters.com Fri Dec 10 13:38:27 1999
From: allan.kelly at reuters.com (Allan Kelly 59048)
Date: Mon Jun 7 17:18:28 2004
Subject: nestable C/C++ parser
In-Reply-To: <38501ED0.B8B4F9FD@fxtech.com>
Message-ID: <E11wQFb-0001Mk-00@romeo.ic.ac.uk>

I find this debate quite interesting and would like to share my experience; maybe this should be in an "object serialisation" thread, but here goes....

I've got some code (sorry, it's the company's, not mine, so I can't publish it) which started life as a generic container. When we came to serialise the container, XML was the obvious candidate because, as has been said before, why invent a new format?

Anyway, what we've currently got I refer to as XML-like because
- I'm not confident of my DTD writing
- Each serialisation forms a message; the message is really an XML fragment, so it is missing the prolog and epilog
- root elements must have an attribute "name"

I have a pluggable factory class. Each object knows how to stream itself in and out.
The input stream is piped into the factory. As long as the messages/XML fragments are for classes which have been plugged into the factory, everything is fine: the factory produces a container which holds items from the stream and can be accessed using operator[].

This works quite well for passing messages between co-operating processes. The code is actually quite small and efficient. Which makes me wonder why I need to bother with expat, SAX and DOM? The short answer is I don't, because we have a tailored solution to our problem.

Allan

>> I think your original request got lost in a side track. It is very possible
>> to do what you want with expat. The trick is the use of the XML_SetUserData
>> and the userdata argument. Basically, the trick is to create a base class
>> that has methods that you want to change the behavior of (typically,
>> StartElement and EndElement). Create derived classes for each different
>> behavior that you want. Call XML_SetUserData to the initial handler object.
>> In your StartElement callback, cast the userdata argument up to a pointer to
>> your base class and call its startElement virtual method. If you want to
>> change the handler, make another call to XML_SetUserData.
>
>The problem here is it requires 3 callbacks to parse a single element
>and restore the state. I'd like a cleaner solution.
>
>--
>Paul Miller - stele@fxtech.com
>
>xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To unsubscribe, mailto:majordomo@ic.ac.uk the following message; >unsubscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) ----------------------------------------------------------------- Visit our Internet site at http://www.reuters.com Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of Reuters Ltd. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From uche.ogbuji at fourthought.com Fri Dec 10 14:39:29 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:18:28 2004 Subject: Appending to an XML document In-Reply-To: Your message of "Wed, 08 Dec 1999 00:40:54 +0100." <199912072337.AAA10151@sonne.darmstadt.gmd.de> Message-ID: <199912101439.HAA03296@localhost.localdomain> > currently I'm busy designing an XML based log format myself. In > contrast to "classic line based logging", appending indeed is > prohibitively costly in XML. Thus I decided not to log into a > wellformed XML document, but to stick with a sequence of <Event> type > doc-fragments, just being well-formed per event. > Of course one can not parse the result immediately, but at the time > of log analysis (or whatever you do with your event data), it's > trivial to pre- and append the necessary tags to enclose the doc- > fragments. 
>
> XML was just not designed to fit the demands of concatenation. But
> I found the value of structuring single events in a "semi-structured"
> (read: well-formed) way valuable enough to choose XML. The "missing
> enclosing tag" is not really a serious problem if you delay its
> insertion until REALLY necessary.

I don't really see this as a problem with XML. Why must you consider your log a well-formed XML document? If you instead treat it as a well-formed XML external parsed entity, then you are freed of the append problem, while being fully XML compliant. And, of course, you already hit upon this solution yourself, by enclosing your log in a simple wrapper to create an XML document for processing. Despite the recent controversy about EPEs, most XML tools should support them, and so you shouldn't even be too constrained in your XML tool-set.

But again, I don't see a problem with XML here.

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From LWatanab at JetForm.com Fri Dec 10 15:19:26 1999
From: LWatanab at JetForm.com (Larry Watanabe)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
Message-ID: <111CF63B7D2ED211830000805F65A2FF018049A0@OTTMAIL2>

The decision to make an XML document contain just a single element is the root of the problem. Perhaps the specification for future versions could accept multiple elements in a single document.
This would make concatenation simple and has worked well in the lisp world which operates on similar structures (s-expressions) both for representation of program and data.

-Larry Watanabe

From uche.ogbuji at fourthought.com Fri Dec 10 15:41:10 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: Your message of "Fri, 10 Dec 1999 10:16:19 EST." <111CF63B7D2ED211830000805F65A2FF018049A0@OTTMAIL2>
Message-ID: <199912101541.IAA03486@localhost.localdomain>

> The decision to make an XML document contain just a single element is the
> root of the problem. Perhaps the specification for future versions could
> accept multiple elements in a single document. This would make concatenation
> simple and has worked well in the lisp world which operates on similar
> structures (s-expressions) both for representation of program and data.

While this may be the root of other problems, and I do not claim to vouch for all such problems, it is _not_ the root of the particular problem in question.
I see no reason why the log file must be a well-formed XML document. Can you tell me what is wrong with just treating it as an external parsed entity?

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From Jurg.Wullschleger at mb.luth.se Fri Dec 10 15:44:10 1999
From: Jurg.Wullschleger at mb.luth.se (Jurg Wullschleger)
Date: Mon Jun 7 17:18:29 2004
Subject: a simpler document type definition language?
Message-ID: <1D1624C992C5D111B67C00A0C99695012B3737@mailserv.mb.luth.se>

hi everybody.

i like the idea of SML. but i think it is not of such big importance for "normal" programmers: if they don't like attributes, they just don't use them.

but a really important thing to every user of XML is how to specify your file format. both DTDs and Schemas open up a lot of possibilities to specify your file format. but they are quite complicated. and it's not easy to write a program that validates an xml document (i think).

so, what do you guys think about a simplified document type definition language?
the simplest form i can think of would look something like this:
(examples in DTD syntax)

there are only 4 types of elements:

- empty elements
  <!ELEMENT name1 EMPTY >
- elements that contain data
  <!ELEMENT name2 (#PCDATA) >
- list elements
  <!ELEMENT name3 (name1|name2|name3|name4)* >
- structural elements of a fixed length
  <!ELEMENT name4 ((name1|name2),name3,name4,(name5|name6|name7)) >

maybe that's a bit too restrictive, but i think it is useful for a lot of applications. and it is really easy to "validate" a document. If the user only uses these constructs, he can be sure that the format can easily be handled by a program.

i defined a simple document definition language, based on the 4 basic element types, and wrote a small xml editor that can edit xml files which are defined in this language. at the moment, there are two formats defined: one for the rules themselves, with a DTD export filter, and one for a subset of the functionality of CSS, with a CSS export filter.

download the source at http://www.netmen.ch/wullschleger/xml/Simple.zip ! And let me know what you think.

Thanks.

Juerg Wullschleger
email: jurg@mb.luth.se

From donpark at docuverse.com Fri Dec 10 15:50:23 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <111CF63B7D2ED211830000805F65A2FF018049A0@OTTMAIL2>
Message-ID: <000e01bf4326$639cc560$d1940e18@smateo1.sfba.home.com>

IMHO, this is a parser implementation problem. I do not know of a single XML parser that expects more than one XML document in a file or a stream input.
Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From rgl at decisionsoft.com Fri Dec 10 15:54:10 1999
From: rgl at decisionsoft.com (Richard Lanyon)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <199912101541.IAA03486@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912101555430.10391-100000@localhost.localdomain>

> > The decision to make an XML document contain just a single element is the
> > root of the problem. Perhaps the specification for future versions could
> > accept multiple elements in a single document.

I think the idea is that the single element /is/ the XML document.

--
Richard Lanyon (Software Engineer) | "The medium is the message"
XML Script development,            |  - Marshall McLuhan
DecisionSoft Ltd.                  |

From LWatanab at JetForm.com Fri Dec 10 16:22:37 1999
From: LWatanab at JetForm.com (Larry Watanabe)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
Message-ID: <111CF63B7D2ED211830000805F65A2FF018049A2@OTTMAIL2>

If a log file can't easily be represented as a well-formed XML document, then that does indicate a problem with the spec.
The XML spec does say that it should be straightforwardly usable; I don't think external parsed entities are a straightforward way of doing a simple operation such as append.

For someone who has to write a log file now with the current spec, either of the proposed solutions should be fine (not maintaining the logfile as a well-formed XML document or using external parsed entities).

-Larry Watanabe

From macherius at darmstadt.gmd.de Fri Dec 10 16:26:55 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <199912101541.IAA03486@localhost.localdomain>
References: Your message of "Fri, 10 Dec 1999 10:16:19 EST." <111CF63B7D2ED211830000805F65A2FF018049A0@OTTMAIL2>
Message-ID: <199912101626.RAA05881@sonne.darmstadt.gmd.de>

Uche,

using entities of any kind does not change the underlying data model. It's syntactic sugar, nothing more. The root of the problem (and a great help in other cases) is indeed the fact that any XML 1.0 document must have a single root.

There are several fields where this is assumed to be a problem; just think e.g. of the return values of XPath or XQL expressions, which rarely are single-rooted. Think how often you have heard the term "virtual root node" recently. Think of Murata's "forest automata". Forest, not tree. Guess why.

However, plainly dropping the tree structure of XML is too hasty. The main problem here is the fact that there is not much between a tree and a general DAG (graph). Oodles of folks are in search of a convenient data model just in the middle of tree and graph. This would help to truly merge hyperlinks, RDF and XML. Check http://www.w3.org/TR/schema-arch

Thus: entities are truly no solution. However, so far nobody has succeeded in suggesting a "true" non-tree data model for XML.

++im

uche.ogbuji@fourthought.com <uche.ogbuji@fourthought.com> wrote at 10 Dec 99, 8:41:

> > The decision to make an XML document contain just a single element is the
> > root of the problem.
>
> While this may be the root of other problems, and I do not claim to vouch for
> all such problems, it is _not_ the root of the particular problem in question.
> I see no reason why the log file must be a well-formed XML document. Can you

--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)

From yiminz at timberline.com Fri Dec 10 16:39:44 1999
From: yiminz at timberline.com (yimin zhu)
Date: Mon Jun 7 17:18:29 2004
Subject: BizTalk Mapper
Message-ID: <2D722CFF0999D111AB860001FA375F1004353D2D@laposte.timberline.com>

Does anyone know whether the BizTalk Mapper is available now?

Yimin Zhu

From tbray at textuality.com Fri Dec 10 16:49:28 1999
From: tbray at textuality.com (tbray@textuality.com)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
References: <111CF63B7D2ED211830000805F65A2FF018049A2@OTTMAIL2>
Message-ID: <0dcd01bf432e$bcb999e0$0500a8c0@ned>

From: Larry Watanabe <LWatanab@JetForm.com>

> If a log file can't easily be represented as a well-formed XML document,
> then that does indicate a problem with the spec. The XML spec does say that
> it should be straightforwardly usable; I don't think external parsed
> entities are a straightforward way of doing a simple operation such as
> append.

Seems to me the obvious solution is that for streaming apps like logfiles, the smart thing to do is to represent them as a sequence of small XML docs. That way, you can also load each one into memory, validate it, do all sorts of clever things you wouldn't be able to do if you were trying to pretend the stream was a single document.

-Tim

From uche.ogbuji at fourthought.com Fri Dec 10 16:57:15 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: Your message of "Fri, 10 Dec 1999 17:29:40 +0100." <199912101626.RAA05881@sonne.darmstadt.gmd.de>
Message-ID: <199912101657.JAA03681@localhost.localdomain>

> using entities of any kind does not change the underlying data model.
> It's syntactic sugar, nothing more. The root of the problem (and a
> great help in other cases) is indeed the fact that any XML 1.0
> document must have a single root.

>From what I know of your problem, it seems as if you are the one who is confusing implementation issues with the underlying data model.

If I were faced with the same problem, my solution would be very simple. The schema (your "underlying data model") for my XML logging document would be as follows:

<!ELEMENT log (entry*)>
<!ELEMENT entry (#PCDATA)>

My low-level logging code (where efficiency counts more than schematics) would manage a disk file in the form

<entry>Nam Sybillam quidem Cumis ego oculis meis vidi in ampulla pendere</entry>
<entry>Pueris respondebat "Volo perire"</entry>

And appending is as efficient as you please. Let us say this disk file was "/var/log/classic.log"

The rest of the world (which is expecting an XML document) would access the logs through the following

<?xml version="1.0"?>
<!DOCTYPE log [<!ENTITY lf SYSTEM "file:/var/log/classic.log">]>
<log>&lf;</log>

And ta-da! We've satisfied both our efficiency and semantic concerns using XML 1.0. So where is the problem?
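
The append-then-wrap idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the thread: the helper names append_entry and read_log are hypothetical, and the wrapper is added by string concatenation at read time (as in the quoted "pre- and append the necessary tags" approach) rather than by having the parser resolve the external entity, since not every parser configuration expands external entities.

```python
import xml.etree.ElementTree as ET


def append_entry(path, text):
    """Append one well-formed <entry> fragment; earlier data is never rewritten."""
    entry = ET.Element("entry")
    entry.text = text  # ElementTree escapes '&' and '<' on serialization
    with open(path, "a", encoding="utf-8") as f:
        # encoding="unicode" yields a str with no XML declaration
        f.write(ET.tostring(entry, encoding="unicode") + "\n")


def read_log(path):
    """Wrap the fragment file in a single <log> root, then parse it as a document."""
    with open(path, encoding="utf-8") as f:
        fragments = f.read()
    root = ET.fromstring("<log>" + fragments + "</log>")
    return [e.text for e in root.findall("entry")]
```

The file on disk stays a sequence of fragments, so the writer only ever appends; the single root required by XML 1.0 exists only for the reader.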

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From uche.ogbuji at fourthought.com Fri Dec 10 17:25:28 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:18:29 2004
Subject: Oops! Attribution error
In-Reply-To: Your message of "Fri, 10 Dec 1999 09:57:00 MST." <199912101657.JAA03681@localhost.localdomain>
Message-ID: <199912101725.KAA03768@localhost.localdomain>

> > using entities of any kind does not change the underlying data model.
> > It's syntactic shugar, nothing more. The root of the problem (and a
> > great help in other cases) is indeed the fact that any XML 1.0
> > document must have a single root.
>
> >From what I know of your problem, it seems as if you are the one who is
> confusing implementation issues with the underlying data model.

This last paragraph was not meant to be a quote of Larry but part of my response. Looks as if an > nipped in there somehow.

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From abisheks at india.hp.com Fri Dec 10 17:32:54 1999
From: abisheks at india.hp.com (Abhishek Srivastava)
Date: Mon Jun 7 17:18:29 2004
Subject: Parsing a DTD for information
Message-ID: <002301bf4334$7144adf0$252f0a0f@india.hp.com>

Hi,

Is there an XML parser that will allow me to parse just a DTD? Suppose the following is my DTD:

<!ELEMENT (name+,lastname+)>

My application needs to know that it can have a list of names and a list of lastnames.

Most parsers give me the data inside the elements/attributes; however, they do not allow access to the grammar associated with the elements/attributes in the DTD.

regards,
Abhishek.

Abhishek Srivastava
Hewlett Packard ISO
(Work) +91-80-2251554 x1190
(Ip) 15.10.47.37
(Url) http://sites.netscape.net/abhishes/index.html

Work like you don't need the money.
Dance like no one is watching.
And love like you've never been hurt.
--Mark Twain

From lisarein at finetuning.com Fri Dec 10 17:34:17 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun 7 17:18:29 2004
Subject: BizTalk Mapper
References: <2D722CFF0999D111AB860001FA375F1004353D2D@laposte.timberline.com>
Message-ID: <38513A39.A3D9E2E1@finetuning.com>

Nope!
The BizTalk schema mapper server thinggy won't be out till MAYBE second quarter 2000 (per microsoft evangelists)

lisa rein
http://www.finetuning.com/collect.html

yimin zhu wrote:
>
> Does anyone know whether the BizTalk Mapper is available now?
>
> Yimin Zhu

From klamerus at pobox.com Fri Dec 10 18:16:32 1999
From: klamerus at pobox.com (Mark & Eileen Klamerus)
Date: Mon Jun 7 17:18:29 2004
Subject: Lists of Schema
Message-ID: <000001bf433a$928ad740$5a67fea9@hydrox>

All,

I'm researching various XML schema initiatives, in particular those which would be applicable to the chemicals industry.

I know that most schemas are oriented toward work processes (customer data, billing information, etc.), but even with those it's hard to find a good reference list.

Are there any sites or references which identify schemas? Are there any organizations (besides OASIS) which might provide information on initiatives to define schemas that are underway?

Thanks, especially for e-mail.

Mark Klamerus

From clark.evans at manhattanproject.com Fri Dec 10 18:22:37 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <000e01bf4326$639cc560$d1940e18@smateo1.sfba.home.com>
Message-ID: <Pine.LNX.4.10.9912100122010.14597-100000@cauchy.clarkevans.com>

On Fri, 10 Dec 1999, Don Park wrote:
> IMHO, this is a parser implementation problem.
> I do not know of a single XML parser that expects
> more than one XML document in a file or a stream
> input.

I tend to agree here. If a DOM parser encounters more than one root element, it could easily create a root element, say by grabbing the name of the file. If a SAX parser encounters more than one root element, it should just proceed by ending the first 'root' element, and then starting the next one.

The only alternative is to have your log file open a root element and then never terminate it -- I think the parser should handle this as well.

Why would this be a problem?

Clark

From clark.evans at manhattanproject.com Fri Dec 10 18:25:59 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <199912101657.JAA03681@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912100126250.14597-100000@cauchy.clarkevans.com>

On Fri, 10 Dec 1999 uche.ogbuji@fourthought.com wrote:
> And ta-da! We've satisfied both our efficiency and semantic
> concerns using XML 1.0. So where is the problem?

Great solution! Now let's just implement this at the parser level so that the average user doesn't need to be concerned with this tedious detail.

Clark

From donpark at docuverse.com Fri Dec 10 18:32:21 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <Pine.LNX.4.10.9912100122010.14597-100000@cauchy.clarkevans.com>
Message-ID: <000a01bf433c$fe78eda0$d1940e18@smateo1.sfba.home.com>

>Why would this be a problem?

No problem as far as I can see. But then I am kind-a short.

Seriously, this is a meme-effect. The 'Document' meme is so strong that I suspect people actually visualize the color and the texture of paper when they read the word 'document'.

Not so seriously, note that 'meme-effect' is different from 'mama-effect' which is like the effect W3C has on the XML community. <g>

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From david at megginson.com Fri Dec 10 19:20:32 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: "Clark C. Evans"'s message of "Fri, 10 Dec 1999 01:24:43 -0500 (EST)"
References: <Pine.LNX.4.10.9912100122010.14597-100000@cauchy.clarkevans.com>
Message-ID: <m3iu26mwmb.fsf@localhost.localdomain>

"Clark C. Evans" <clark.evans@manhattanproject.com> writes:

> On Fri, 10 Dec 1999, Don Park wrote:
> > IMHO, this is a parser implementation problem. I do not know of a
> > single XML parser that expects more than one XML document in a
> > file or a stream input.
>
> I tend to agree here. If a DOM parser encounters more than one root
> element, it could easily create a root element, say by grabbing the
> name of the file. If a SAX parser encounters more than one root
> element, it should just proceed by ending the first 'root' element,
> and then starting the next one.

No, these would both be non-conformant -- the XML spec defines a document as the main production, and a parser that encounters a second root element in what is being given to it as a document simply has to stop processing, except for error reporting.

You have to distinguish the document boundaries in a single stream before you pass it on to the parser. For example, you could use ^L as the document separator, and start a new parse each time you see it.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
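
The separator approach described above might look roughly like the following Python sketch. It is an illustration only, not anything posted to the list: the function names parse_stream and accepts_as_single_document are hypothetical, the form feed (^L) is used as the separator as suggested, and the whole stream is assumed to fit in memory as one string.

```python
import xml.etree.ElementTree as ET

SEPARATOR = "\f"  # form feed, i.e. ^L


def parse_stream(text, separator=SEPARATOR):
    """Split a stream on the separator and run a fresh parse for each document."""
    docs = []
    for chunk in text.split(separator):
        chunk = chunk.strip()
        if chunk:  # skip the empty piece after a trailing separator
            docs.append(ET.fromstring(chunk))
    return docs


def accepts_as_single_document(text):
    """Report whether the parser accepts the text as one XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False
```

Concatenating two root elements without a separator is rejected by the parser as a single document, which is the conformant behavior described above; the boundaries have to be found before each piece reaches the parser.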

From clark.evans at manhattanproject.com Fri Dec 10 20:01:10 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <m3iu26mwmb.fsf@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912100301400.14862-100000@cauchy.clarkevans.com>

On 10 Dec 1999, David Megginson wrote:
> No, these would both be non-conformant -- the XML spec defines a
> document as the main production, and a parser that encounters a second
> root element in what is being given to it as a document simply has to
> stop processing, except for error reporting.

Yes. But I thought the computational model for XML was a hedge-automata? (not a tree-automata... )

Clark

From donpark at docuverse.com Fri Dec 10 20:47:58 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:29 2004
Subject: DOM Level 2 bumped up to Candidate Recommendation
Message-ID: <000001bf434f$ef27bee0$d1940e18@smateo1.sfba.home.com>

DOM Level 2 bumped up to Candidate Recommendation

http://www.w3.org/TR/1999/CR-DOM-Level-2-19991210/

A lot of work ahead for us imps at the bleeding edge.
Don't forget to send back the bloody bandages to:

http://lists.w3.org/Archives/Public/www-dom/

Cheers,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From lauren at sqwest.bc.ca Fri Dec 10 21:10:14 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:18:29 2004
Subject: DOM Level 2 bumped up to Candidate Recommendation
In-Reply-To: <000001bf434f$ef27bee0$d1940e18@smateo1.sfba.home.com>
Message-ID: <199912102106.NAA06308@mail.sqwest.bc.ca>

On 10 Dec 99, at 12:48, Don Park wrote:

> DOM Level 2 bumped up to Candidate Recommendation
>
> http://www.w3.org/TR/1999/CR-DOM-Level-2-19991210/
>
> A lot of work ahead for us imps at the bleeding edge.
> Don't forget to send back the bloody bandages to:
>
> http://lists.w3.org/Archives/Public/www-dom/

Yes, please do. Candidate Recommendation is a new phase for W3C specs; the idea is to see if it's implementable before sending it off to PR. The only changes that will be made from now on are if something is seriously broken and it's very difficult or impossible to implement. (Apart from clarifications, of course!) So if the spec isn't clear enough to implement from, or implementations are nearly impossible, please send email.

Also, if you do implement some part of Level 2, I'd like to hear about it, so we can be sure that the spec has been implemented often enough that we probably have the bugs out of it.

If you want your email to be confidential, just send it to me (lauren@softquad.com), marking it as confidential, otherwise please send email to the public DOM mailing list.
thanks,

Lauren

From orchard at pacificspirit.com Fri Dec 10 22:57:25 1999
From: orchard at pacificspirit.com (David Orchard)
Date: Mon Jun 7 17:18:29 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: <33D189919E89D311814C00805F1991F7F4A9A5@RED-MSG-08>
Message-ID: <001601bf4361$e8bd1de0$63511c09@n54wntw.vancouver.can.ibm.com>

I believe there is a 3rd way, that is:

Between system wide (meta-grammar) and mapping rules associated with a schema, there is the third option of a graph-specific set of rules for associating schemas. The graph specification language allows arbitrary graphs and mapping rules for subgraphs of the universe of elements described by a schema.

Thus I could take the same large graph of Java objects and serialize different subgraphs to different XML documents. The XML documents follow any pattern, but the mapping rules are not necessarily 1:1. In another case, I could take 2 distinct Java graphs and serialize to the same XML schema with different "denormalization" in mapping or transforming between the lhs and rhs. There can be many mappings or bindings for arbitrary graph traversals, potentially selected at runtime.

I personally am very interested in graph grammars and would love to hear about papers on the topic. A sample that I am interested in is a graph grammar that can be used to specify a graph to traverse from a given starting point.
Typical examples of this are a graph of COM+ or EJB objects to instantiate for a request, a graph of XLink extended links to retrieve for a given document, or a graph of XInclude elements to traverse.

Cheers,
Dave Orchard
XLink co-editor

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Andrew Layman
> Sent: Monday, December 06, 1999 12:09 PM
> To: XML Developers List
> Subject: RE: Object-oriented serialization (Was Re: Some questions)
>
>
> Thanks. As a recap: There are, broadly, two approaches to serializing a
> graph in XML.
>
> One is to invent a meta-grammar, a set of canonicalization rules. That is
> what RDF syntax did, and what the attribute-centric and element-centric
> canonical format papers do, what SOAP section eight does. I think of this
> as "tunnelling the graph through XML."
>
> The other is to allow XML documents to follow any pattern described in a
> schema, and augmenting the schema with a set of mapping rules.
>
> There appears to be significant value to each approach. (In particular,
> however, I disagree with the sometimes-asserted claim that graphs capture
> the semantics of a communication while grammars do not. Graphs are just
> another grammar. This makes me reluctant to deprecate grammars.)
>
> I agree that formal approaches to mapping would be helpful. I look forward
> to reading your papers.
From wunder at infoseek.com Fri Dec 10 22:57:28 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <111CF63B7D2ED211830000805F65A2FF018049A2@OTTMAIL2>
Message-ID: <3.0.5.32.19991210145224.00c08100@corp.infoseek.com>

At 11:09 AM 12/10/99 -0500, Larry Watanabe wrote:
>
>If a log file can't easily be represented as a well-formed XML document,
>then that does indicate a problem with the spec.

Well, time series data (a logfile) isn't normalizable in the relational model, so there's a problem with the SQL spec, too. And I'm having a lot of trouble accessing my meatloaf with my SCSI bus, so let's fix that while we're at it.

Seriously, I think it is good that XML documents have a definite end. But the logfile problem can be solved in two (near-)standard ways.

1) Treat the logfile as an XML Fragment (see proposal at W3C). An XML Fragment needs to be "well-balanced", but doesn't have to have a single root element.

2) Treat the logfile as a series of XML documents, each of which is a log record. They could be separated by formfeed (a character illegal inside an XML document). Since the XML declaration is technically optional, it could be omitted.

But nobody ever made representing logfiles a requirement for a markup language. It is usually far more important that logfiles are compact and are readable if the program crashes (or the disk fills) when a record is partly written.

wunder
--
Walter R.
Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946

From donpark at docuverse.com Fri Dec 10 23:48:38 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <3.0.5.32.19991210145224.00c08100@corp.infoseek.com>
Message-ID: <000001bf4369$3186ad00$d1940e18@smateo1.sfba.home.com>

Here is another solution that most people overlook:

You can use XML APIs without using XML and still get most of the benefits. Store your data in any format you want but write a SAX parser for it so that log processing applications can be XML applications. This allows you to move up to an XML storage format when and if you need to without rewriting any software.

Log producers work in reverse, meaning that they fire SAX events and a special SAX application writes out the information into a file in a custom format or sends it to a log server using whatever wire format you want.

You can do something similar with DOM as well.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com
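The "SAX parser for a non-XML format" idea in the message above might look something like the sketch below. It is an illustration only, written in Python with the standard xml.sax classes rather than the Java SAX 1.0 interfaces under discussion. The tab-separated "timestamp<TAB>message" log layout, the element names log and entry, the time attribute, and the names parse_plain_log and EntryCollector are all assumptions invented for the example.

```python
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


def parse_plain_log(lines, handler):
    """Read a hypothetical 'timestamp<TAB>message' log and fire SAX events.

    The handler sees the same event sequence it would get from parsing
    <log><entry time="...">message</entry>...</log>, although no XML
    text exists anywhere.
    """
    handler.startDocument()
    handler.startElement("log", AttributesImpl({}))
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines in the log
        timestamp, _, message = line.partition("\t")
        handler.startElement("entry", AttributesImpl({"time": timestamp}))
        handler.characters(message)
        handler.endElement("entry")
    handler.endElement("log")
    handler.endDocument()


class EntryCollector(ContentHandler):
    """An ordinary SAX application: it collects (time, text) pairs."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._time = None
        self._text = []

    def startElement(self, name, attrs):
        if name == "entry":
            self._time = attrs.getValue("time")
            self._text = []

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        if name == "entry":
            self.entries.append((self._time, "".join(self._text)))


collector = EntryCollector()
parse_plain_log(
    ["1999-12-10T23:48\tserver started\n", "1999-12-10T23:49\tdisk full\n"],
    collector,
)
print(collector.entries)
```

Because EntryCollector only depends on the handler interface, the same class could later be handed to a real XML parser if the log moved to an XML storage format.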
From greynolds at datalogics.com Sat Dec 11 00:08:04 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun 7 17:18:29 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B17056@martinique>

Sorry I'm a bit late with this...

> -----Original Message-----
> From: James Tauber [mailto:jtauber@jtauber.com]
> Sent: Sunday, December 05, 1999 7:25 PM
> To: xml-dev@ic.ac.uk
>
> > The semantic constraints I am talking about are one step away from
> > these "ultimate" semantics; they tell you that an integer contained
> > in a given element cannot be greater than 100, but they don't tell
> > you why. These are still semantics to me
>
> Ah. This is why I have some difficulty understanding some of what you
> are saying. To me, the constraint that an integer cannot be greater
> than 100 is not semantics. It's syntax.
>
> MyInteger ::= ( '100' | digit{1,2} | '-' digit+ )
>
> or in some more perspicuous grammar:
>
> MyInteger = Integer x : x <= 100
>

But this won't work outside of Europe. You have to have a clean distinction between syntax and semantics, and an explicit, rigorous mapping from one to the other. The symbols used in your example have no intrinsic meaning, just as numbers have no intrinsic form available to our perception, so the syntax can only constrain the formal properties of expressions using those symbols.
Sure, we have conventional meanings for constant symbols like '1' and '0'; but they're still symbols pointing to something else, and as soon as you start writing expressions with a different symbol set - in Sanskrit, say, or Ethiopic, or you name it - then you're out of luck without a formal semantics.

On the other hand, the cultural interpretation of the denotatum is beyond the scope of language definition. Doesn't matter what the user intends to model using integers; be it widgets, fingers, or planets, the best the language designer can do is provide a consistent language that accurately models the integers.

-gregg

From Min.Zhong at penske.com Sat Dec 11 04:35:11 1999
From: Min.Zhong at penske.com (Min Zhong)
Date: Mon Jun 7 17:18:29 2004
Subject: BizTalk JumpStart Kit - File missing
Message-ID: <s8518e1a.055@penske.com>

Hi,

Has anyone tried to install the BizTalk JumpStart Kit before? I tried to install it but failed. I had to debug the setup script and finally found that "mtsPkgMgs.dll" is missing. Could someone help me find out where I can download that file?

Thank you very much!

Min
From ricko at allette.com.au Sat Dec 11 08:26:01 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:29 2004
Subject: Appending to an XML document
Message-ID: <005801bf43b4$d99db970$06f96d8c@NT.JELLIFFE.COM.AU>

From: David Megginson <david@megginson.com>

>You have to distinguish the document boundaries in a single stream
>before you pass it on to the parser. For example, you could use ^L as
>the document separator, and start a new parse each time you see it.

This is at least the third time this has come up (from memory, it was discussed during the XML development, then a year ago on XML-DEV). So it would be good to make a QnA for the SGML FAQ on this subject. Can anyone suggest an improvement on the following?

-----
Q. How can I have an unending stream of data in XML?

A. You must use a stream of XML documents. The simplest way to do this is to separate each document with ^L, which is not an allowed character in XML and which is not used for in-band signalling by common streaming systems.

If the incoming stream terminates unexpectedly during a document, then that document is not well-formed. You should consider how to handle such fragments.

Note that "document" is a technical term meaning a "collection of information that is processed as a unit" (ISO 8879:1986) and represents a distinct layer between storage/transport (e.g., entities, streams, archives) and publication. An open-ended stream must be partitioned into distinct XML documents, for example, one per entry.
Consequently, you cannot use ID/IDREF for references between documents in a stream, but rather you should use some more general reference mechanism, such as W3C XPointers.

Another alternative, suggested by Uche Ogbuji, is suitable when the incoming log data is to be sent to a file rather than processed:

The schema (your "underlying data model") for my XML logging document would be as follows:

    <!ELEMENT log (entry*)>
    <!ELEMENT entry (#PCDATA)>

My low-level logging code (where efficiency counts more than schematics) would manage a disk file in the form

    <entry>Nam Sybillam quidem Cumis ego oculis meis vidi in ampulla pendere</entry>
    <entry>Pueris respondebat "Volo perire"</entry>

And appending is as efficient as you please. Let us say this disk file was "/var/log/classic.log"

The rest of the world (which is expecting an XML document) would access the logs through the following

    <?xml version="1.0"?>
    <!DOCTYPE log [<!ENTITY lf SYSTEM "file:/var/log/classic.log">]>
    <log>&lf;</log>

And ta-da! We've satisfied both our efficiency and semantic concerns using XML 1.0.
------

Why is this not in the XML Spec?
1) Simplicity and layering
2) It is not the W3C's business to make specs for streams of entities: IETF is the forum for that.

Rick Jelliffe

From ceo at citix.com Sat Dec 11 12:24:58 1999
From: ceo at citix.com (Steven Livingstone)
Date: Mon Jun 7 17:18:29 2004
Subject: BizTalk JumpStart Kit - File missing
References: <s8518e1a.055@penske.com>
Message-ID: <003501bf43d3$3bda35a0$0a0a0a0a@deltabiz>

Min,

I have the BizTalk kit installed and that file is not on my disk !?
What setup file tries to install this?

cheers
steven

Steven Livingstone
Glasgow, Scotland.
+44 (0) 7771 957 280

Professional Site Server 3
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696
Professional Site Server 3.0 Commerce Edition
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505

----- Original Message -----
From: Min Zhong <Min.Zhong@penske.com>
To: <xml-dev@ic.ac.uk>
Sent: Friday, December 10, 1999 10:51 PM
Subject: BizTalk JumpStart Kit - File missing

Hi,

Has anyone tried to install the BizTalk JumpStart Kit before? I tried to install it but failed. I had to debug the setup script and finally found that "mtsPkgMgs.dll" is missing. Could someone help me find out where I can download that file?

Thank you very much!

Min
From larsga at garshol.priv.no Sat Dec 11 13:41:36 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun 7 17:18:30 2004
Subject: SAX and <!DOCTYPE>
In-Reply-To: <whk8mort34.fsf@viffer.oslo.metis.no>
References: <whln77f8hq.fsf@viffer.oslo.metis.no> <m3r9gzdqx0.fsf@ifi.uio.no> <whk8mort34.fsf@viffer.oslo.metis.no>
Message-ID: <m3emctd27e.fsf@ifi.uio.no>

* Lars Marius Garshol
|
| The XSA client, which needs to accept both XSA and OSD documents,
| but can't tell them apart before parsing begins, uses a
| DispatchingDocHandler, which has a hash of DocumentHandlers keyed on
| the name of the document element. In this very restricted case that
| worked just fine.

* Steinar Bang
|
| [...]
| It looks like a dispatching from a DocumentHandler is the best idea,
| but then I need to be able to queue up SAX DocumentHandler events to
| send to the actual DocumentHandler when I start it.

If you dispatch on the document element this is easy, since the only events you can get before it (in SAX 1.0, that is) are PI events. In my handler I simply stuffed those into a Vector and replayed them when the correct DocumentHandler had been selected.

| Hm... maybe a clone() function and a virtual destructor are in order
| for the C++ AttributeList class...?

There is an equivalent to a cloning function in the AttributeListImpl class already:

<URL: http://www.megginson.com/SAX/javadoc/org.xml.sax.helpers.AttributeListImpl.html >

--Lars M.
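The queue-and-replay dispatching described in the message above could be sketched along these lines. This is not the XSA client's actual DispatchingDocHandler: it is a minimal Python illustration using xml.sax, and the class names DispatchingHandler and Recorder, the root element names "xsa" and "softpkg", and the sample document are assumptions made for the example.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class DispatchingHandler(ContentHandler):
    """Pick a real handler from the document element name.

    Processing instructions seen before the document element are queued
    and replayed once the target handler is known.
    """

    def __init__(self, handlers):
        super().__init__()
        self._handlers = handlers  # document element name -> handler
        self._target = None
        self._queued_pis = []

    def processingInstruction(self, target, data):
        if self._target is None:
            self._queued_pis.append((target, data))  # too early: queue it
        else:
            self._target.processingInstruction(target, data)

    def startElement(self, name, attrs):
        if self._target is None:
            # First element is the document element; an unknown name
            # raises KeyError here.
            self._target = self._handlers[name]
            self._target.startDocument()
            for pi_target, pi_data in self._queued_pis:
                self._target.processingInstruction(pi_target, pi_data)
            self._queued_pis = []
        self._target.startElement(name, attrs)

    def endElement(self, name):
        self._target.endElement(name)

    def characters(self, content):
        self._target.characters(content)

    def endDocument(self):
        if self._target is not None:
            self._target.endDocument()


class Recorder(ContentHandler):
    """Stand-in for a format-specific handler; it just records events."""

    def __init__(self):
        super().__init__()
        self.events = []

    def processingInstruction(self, target, data):
        self.events.append(("pi", target, data))

    def startElement(self, name, attrs):
        self.events.append(("start", name))


xsa, osd = Recorder(), Recorder()
doc = b'<?xml version="1.0"?><?note hello?><xsa><vendor/></xsa>'
xml.sax.parseString(doc, DispatchingHandler({"xsa": xsa, "softpkg": osd}))
print(xsa.events)
```

Only the handler registered for the document element receives events; the other one is never touched, which is why the approach works when the two formats cannot be told apart before parsing starts.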
From larsga at garshol.priv.no Sat Dec 11 14:14:14 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun 7 17:18:30 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991207101003.00bfc100@corp.infoseek.com>
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991207101003.00bfc100@corp.infoseek.com>
Message-ID: <m3d7sdd0oz.fsf@ifi.uio.no>

* Lars Marius Garshol
|
| First thought: this is fine for very simple uses, but for more
| complex uses something along the lines of the robots.txt file would
| be very nice. How about a variant PI that can point to a robots.rdf
| resource?

* Walter Underwood
|
| In our experience, the simple form covers almost all needs. We have
| 1000+ customers, and only three or four of them use our selective
| indexing support. So, I think of the robots meta tag as a proven
| solution that doesn't need major improvement.

I agree with you that probably the majority of web authoring individuals prefer and are happy with the "meta tag" solution. However, lots of people (such as me, for example) are not going to be happy with it, since it requires indexing information to be added to each and every document. My gut reaction to that is that it's plain wrong, because it leads to so much hassle in content maintenance.

Also, using an RDF file to describe the site structure opens up new possibilities, such as being able to group resources in a sensible way to enable search engines to respond with more meaningful search results.
Ever since RDF appeared I've been waiting for some application that would enable me to say:

- all these resources are small pieces of this larger split-up resource, which is represented by _this_ resource

- this group of resources belongs together, and they are represented by _this_ resource

- this group contains this other group

- these are the groups of resources that make up this site, and this is the home page of the site

- these groups are authored by this person, who is represented by this resource

- this resource is of this kind

In an ideal world, this would lead to search engine responses like the following:

  http://www.infotek.no/foredrag/lmg-xml.no-99/slide34.html
  xml-dev, part of a slide presentation by Lars Marius Garshol.
  Part of the STEP Infotek web pages.
  [top slide] [site top page] [author]

This doesn't really seem all that hard, but optimist that I am I may of course be seriously underestimating the difficulties involved.

| Secondly, fetching two or more entities for one document makes the
| robot code much more complex. If the robots.rdf file gets a 404,
| what happens? What about a 401 or a timeout? The robot may need
| separate last-modified dates and revisit times for each entity. And
| after it is implemented and tested, how do you explain all that to
| customers who just want search results?

Personally, if I were a search engine vendor, I would see this as a great chance to really stand out from the competition and deliver something beyond what the others do, at least until they catch on. Yes, it requires more from the users, yes, it requires more from the implementation, but this has to be weighed against the benefits, which are presumably large.

Also, seeing the amount of interest for "meta tags" and optimizing for various search engines among various content providers I assume that if this facility really did help providers get more hits for their sites then that would be all the motivation they need.
But in any case this was only meant as a loose suggestion, so if you're not interested, then that's the end of that.

--Lars M.

From larsga at garshol.priv.no Sat Dec 11 14:28:01 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun 7 17:18:30 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991207101517.00b528f0@corp.infoseek.com>
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991207101517.00b528f0@corp.infoseek.com>
Message-ID: <m3bt7xd021.fsf@ifi.uio.no>

* Lars Marius Garshol
|
| Second thought: "and the index attribute must be first". This is
| nice for implementors, but is likely to clash with the expectations
| of users and the cost of more generality is very low for
| implementors.

* Walter Underwood
|
| I'm open to changing this, but I thought I would start with the most
| strict version. The advantage of the strict version is that it
| doesn't need to be parsed. The Desperate Perl Hacker can do four
| regex compares for the four variants and get back to work.

I think I agree with Robert Hanson here. It's so easy to parse even if the order is optional that I don't really see any point in fixing the order. (Note for example that "Associating stylesheets with ..." does not fix the order.)

| Maybe folks who've worked with authors on SGML systems have
| some relevant experience. Is this too strict for folks that
| aren't tamed by computers?

I'm not really qualified to answer this, so I'll pass.

--Lars M.
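To illustrate how little work order-independent parsing takes, here is a rough Python sketch of reading pseudo-attributes from the data part of a processing instruction. It is only an illustration under assumptions: the function name parse_pseudo_attributes is invented, "index" is the attribute named in the thread while "follow" and the yes/no values are made up for the example, and the regular expression is deliberately lenient (it does not reject stray text between attributes).

```python
import re

# name = "value" or name = 'value', in any order
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pseudo_attributes(data):
    """Return the pseudo-attributes in PI data as a dict, ignoring order."""
    attrs = {}
    for match in PSEUDO_ATTR.finditer(data):
        name, double_quoted, single_quoted = match.groups()
        # Exactly one of the two value groups took part in the match.
        attrs[name] = double_quoted if double_quoted is not None else single_quoted
    return attrs


# Hypothetical PI data in two different orders and quoting styles.
first = parse_pseudo_attributes('index="no" follow="yes"')
second = parse_pseudo_attributes("follow='yes' index='no'")
print(first == second)
```

A robot using something like this would then look up the attributes it cares about by name, so authors would not need to remember a required order.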
From larsga at garshol.priv.no Sat Dec 11 14:46:31 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun 7 17:18:30 2004
Subject: RSS and WAP - another RSS question
In-Reply-To: <384018F6.182C7EFC@finetuning.com>
References: <3.0.6.32.19991127171313.00920c80@gpo.iol.ie> <384018F6.182C7EFC@finetuning.com>
Message-ID: <m366y5cz76.fsf@ifi.uio.no>

* Lisa Rein
|
| I'm having trouble understanding why or if RSS even matters

For me that's very simple. I like to keep track of news from lots of different web sites, but I _really_ dislike having to keep revisiting each site all the time to see if there is anything new. (Especially with sites that are rarely updated, or dog-slow, like slashdot.)

The nice thing about RSS is that for the sites that support it it completely removes that need. Instead I can just tell my RSS viewer to get new stories when I feel like an update and it will list the new stories nicely with the source etc in the client window. Then I remove the ones that don't seem interesting and make the client open the interesting ones in my browser.

In fact, this is so nice that I notice that I tend to just skip the non-RSS sites, and I also note that it allows me to keep track of more sites than the old model.

The next feature for the client is probably going to be filters, so that I don't have to see stuff that I know a priori won't be of interest.

| (despite its apparent level of widespread adoption) and

Well, that's IMHO the other reason why it matters.
It's a first example of an application like the global XML applications that were envisioned when this whole XML thing started that has succeeded in terms of adoption. Personally I think that is very important.

| am trying to determine its level of inclusion in my books/classes.

I use it extensively in talks and classes, and am finding that it seems to go down well, probably because:

- anyone can understand what it's about
- the documents are so simple
- when presented in the right way the utility becomes obvious

--Lars M.

From tlainevool at yahoo.com Sat Dec 11 22:57:04 1999
From: tlainevool at yahoo.com (Toivo Lainevool)
Date: Mon Jun 7 17:18:30 2004
Subject: XML Design Patterns
Message-ID: <19991211225649.24570.qmail@web2103.mail.yahoo.com>

--- Don Park <donpark@docuverse.com> wrote:
> These are preliminary XML design patterns
> so pattern names are weird to say the least.

Does anyone have any references to XML design patterns? It seems like a great idea.

Thanks,
Toivo Lainevool
From Daniel.Veillard at w3.org Sun Dec 12 00:10:24 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun 7 17:18:30 2004
Subject: XPointer Working Draft entered Last Call
Message-ID: <19991211191014.D23680@w3.org>

Since nobody seems to have posted about it yet, I forward this information here. People should review (and start implementing if interested) this Last Call draft and report problems or implementations to the public list

www-xml-linking-comments@w3.org
http://lists.w3.org/Archives/Public/www-xml-linking-comments/

thanks,

------------- Excerpts ---------------

Status of this document

The XML Linking Working Group, with this 1999 December 6 XPointer Last Call working draft, invites comment on this specification. The Last Call period begins 6 December 1999 and ends 27 December 1999. The W3C Membership and other interested parties are invited to review the specification and report implementation experience. Please send comments to www-xml-linking-comments@w3.org

[...]

Abstract

This specification defines the XML Pointer Language (XPointer), the language to be used as a fragment identifier for any URI-reference that locates a resource of Internet media type text/xml or application/xml.

XPointer, which is based on the XML Path Language (XPath), supports addressing into the internal structures of XML documents. It allows for traversals of a document tree and choice of its internal parts based on various properties, such as element types, attribute values, character content, and relative position.
---------------------------------------

Daniel
W3C Staff contact for the XML Linking WG

--
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe

From tlainevool at yahoo.com Sun Dec 12 01:40:04 1999
From: tlainevool at yahoo.com (Toivo Lainevool)
Date: Mon Jun 7 17:18:30 2004
Subject: a simpler document type definition language?
Message-ID: <19991212014000.25736.qmail@web2101.mail.yahoo.com>

--- Jurg Wullschleger <Jurg.Wullschleger@mb.luth.se> wrote:
> the simplest form i can think of would look something like this: (examples
> in DTD syntax)
> there are only 4 types of elements:
>
> - empty elements
> <!ELEMENT name1 EMPTY >
>
> - elements that contain data
>
> <!ELEMENT name2 (#PCDATA) >
>
> - list elements
>
> <!ELEMENT name3 (name1|name2|name3|name4)* >
>
> - structural elements of a fixed length
>
> <!ELEMENT name4 ((name1|name2),name3,name4,(name5|name6|name7)) >

I would go even simpler than that. Don't allow nested brackets; #4 could be represented like this:

    <!ELEMENT name4 (nameA,name3,name4,nameB)>
    <!ELEMENT nameA (name1|name2)>
    <!ELEMENT nameB (name5|name6|name7)>

I recently wrote a quick and dirty DTD processor that generated Java classes for parsing valid XML documents. Using this simplification in my DTDs made things a whole lot easier.

Toivo Lainevool
From donpark at docuverse.com Sun Dec 12 02:31:38 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun 7 17:18:30 2004
Subject: XML Design Patterns
In-Reply-To: <19991211225649.24570.qmail@web2103.mail.yahoo.com>
Message-ID: <000701bf4449$1f7f48c0$d1940e18@smateo1.sfba.home.com>

>--- Don Park <donpark@docuverse.com> wrote:
>> These are preliminary XML design patterns
>> so pattern names are weird to say the least.
>
>Does anyone have any references to XML design patterns, seems
>like a great idea.

Great ideas are usually obvious ideas. I have been a 'pattern-head' ever since Christopher Alexander's books were mentioned in the context of software engineering. Patterns in software design lead naturally to patterns in information design. XML design patterns are subpatterns of general information design patterns, just like database schema design patterns.

IMHO, we have a greater opportunity to exploit design patterns in information engineering than in software engineering, because automated application of design patterns is easier with structured information such as XML than with programming languages.

XML design pattern activities are just starting to appear. I have heard that an article will appear on XML.com soon.
As soon as I can find some spare time, I intend to build a repository for XML Design Patterns which will allow the XML community to pool design knowledge, to evolve it as the community evolves, and, hopefully, to apply the patterns automatically using tools that use the repository as a knowledge base.

Best,

Don Park - mailto:donpark@docuverse.com
Docuverse - http://www.docuverse.com

From ricko at allette.com.au Sun Dec 12 08:41:29 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:18:30 2004
Subject: XML Design Patterns
Message-ID: <004e01bf4480$39caadf0$5df96d8c@NT.JELLIFFE.COM.AU>

From: Toivo Lainevool <tlainevool@yahoo.com>

>Does anyone have any references to XML design patterns, seems like a great
>idea.

There are three main references.

1) The first is Ian Graham and Liam Quin's web pages "Introduction to XML Design Patterns" at
http://www.groveware.com/xmlbook/patterns.html

They give:
 * Running Text Pattern
 * Generated Text Pattern
 * Footnote Pattern
 * Text Blocks Pattern

This is just a teaser site: I expect (or at least, I would love to see) a full book or website along those lines.

2) The second is my book:

 The XML & SGML Cookbook: Recipes for Structured Information
 Charles F. Goldfarb Series on Open Information Management
 Prentice Hall, 1998, ISBN 0-13-614223-0, 650 pages

In researching that book I spent 1 year looking through as many DTDs as I could to try to discover the patterns they contained.
I began trying to use Alexander's pattern approach (which is just as much a rhetorical form as it is a methodology) but ultimately I abandoned it because:

 * Many entries were trivial (e.g., Q. "When should you use a list pattern?" A. "Whenever you need a list")

 * Many entries only made sense after an analytical framework was first established: in fact, the first third of the book ("Systems of Documents") was spent establishing such a framework/vocabulary for the second third ("Document Patterns"). The last third of the book ("Characters & Glyphs") similarly had to cover a lot of analytical ground (e.g. the ISO Character/Glyph model) before moving on to patterns.

 * It was clear that the SGML literature had not even begun to cover this kind of area: I am not smart enough to single-handedly establish a pattern vocabulary--indeed, it is only possible as a community effort--so the best I could do was to try to set things up.

 * There was some opposition to the idea that you could usefully construct DTDs from prefabricated components, rather than by doing extensive document analysis. So the pattern approach was dismissed by some, in particular the idea that one could appropriate elements from a "toy" DTD like HTML. Of course, with the advent of namespaces, the idea of reusable vocabularies is now utterly accepted: I don't know if my book contributed to that.

 * Many patterns made sense only in distinction from some other type: so I moved towards a more "X versus Y" style.

 * Alexander gave an interview (in "Computer Languages" magazine? perhaps with Michael Swaine?) in which he said how disappointing the results of the early uses of his pattern language in architecture had been. He said that rather than creating functional and innovative buildings, people applied patterns and made buildings that looked the same. So pattern languages seemed good for QA but not for excellence.
Because I view DTDs as a tool for software engineering rather than data modeling, it made sense, from my point of view, to try to integrate patterns into a software engineering framework.

Here is most of the "Index of Patterns, Structures & Forms" from my book:

 * active versus passive (DTD style)
 * address
 * analytical domain
 * architectural form
 * attribute v. element (DTD style)
 * base (element set)
 * building block versus paragon (DTD style)
 * calendar
 * catalog
 * character versus glyph
 * character set
 * citation
 * class (attribute)
 * color
 * continuation paragraph
 * core (element set)
 * country code
 * cross-reference
 * data attribute for element
 * database versus literature (DTD style)
 * date
 * default value list
 * definition list
 * derived (element set)
 * description table
 * document
 * editorial structure (view)
 * element reference (reflection)
 * embedded data
 * encoding
 * fielded names
 * fielded text
 * floating elements
 * font
 * fragment
 * generic versus specific (DTD style)
 * gestural domain
 * hyperlink
 * identifier
 * IETM
 * information unit
 * inline versus interlaced
 * internal markup versus external markup (DTD style)
 * language
 * language codes
 * lexical type
 * linear versus nested (DTD style)
 * line
 * logical domain
 * loose versus tight (DTD style)
 * marketplace versus hierarchy (DTD style)
 * metadata
 * microdocument
 * name (ID)
 * nested paragraph
 * note (endnote, footnote, annotation, warning, caution)
 * occurrence
 * page object (View)
 * page layout (View)
 * paragraph
 * paragraph group (aka formal paragraph)
 * pool
 * prototypes
 * reusable components
 * ruby annotation
 * running text
 * self-labelling versus external labelling (DTD style)
 * semi-graphical text
 * sequence (generic)
 * sequence (list)
 * schema
 * stylesheet
 * subparagraph
 * table
 * text block
 * time and space
 * type extension
 * unspecified
 * visual domain
 * word segment

(You can imagine my surprise when a review said that most of my book was found elsewhere, when in fact, it is still the only
thing available pursuing the pattern idea--though not the literature form :-( )

One very easy way to gather patterns is to look in DTDs for the things that parameter entities name. These groupings often correspond to what people may think of as the XML equivalent of the OO people's appropriation of Alexander's patterns.

3) The third is my schema language and tools: Schematron, which was designed to support patterns. The current design only allows labelling of found structures as patterns rather than specification of patterns in the abstract (that is possible to implement, but a long way off: as long as we are all fixated on grammars or classes or other implementation/modeling paradigms there is little chance of stepping back for a more general view). Available at:
http://www.ascc.net/xml/schematron/schematron.html

There is an interview on Schematron with XML-DEV regular Simon St.Laurent at:
http://www.xmlhack.com/read.php?item=121

If anyone is interested in pursuing patterns further, please do any of the following:

 * email me or this list
 * read the HTML page
 * buy or read my book!
 * read "the Gang of Four" pattern books from Addison Wesley, and also the excellent "Anti-Patterns" book from John Wiley
 * you can find patterns tacitly lurking in most good SGML/XML books (that are not just introductions) such as Eve Maler's or Dave Megginson's books.

Rick Jelliffe

P.S. In case this post is lost, Toivo Lainevool is following on from a post of Don Park on XML-DEV which mentions three "preliminary XML design patterns" to aid thinking about XML:

 * pockets (elements that could provide an equivalent information set as the element/attribute distinction provides in normal XML)
 * parental guidance (elements that could provide an equivalent information set as provided by attributes that apply to child elements)
 * road signs (elements that could provide an equivalent information set to attributes or PIs (unclear, sorry))
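The tip in the message above, that the names of parameter entities in a DTD are a cheap source of candidate patterns, can be tried with a few lines of Python. This is only a rough sketch under stated assumptions: the regular expression handles literal (internal) parameter entity declarations only, it does not skip declarations inside comments or follow external subsets, and `SAMPLE_DTD` is a made-up fragment, not taken from any real DTD.

```python
import re

# Matches literal declarations such as: <!ENTITY % inline "em | strong | code">
# External parameter entities (SYSTEM/PUBLIC) are deliberately not matched.
PARAM_ENTITY = re.compile(
    r'<!ENTITY\s+%\s+([\w.:-]+)\s+(["\'])(.*?)\2\s*>', re.DOTALL
)


def parameter_entities(dtd_text):
    """Return {entity name: replacement text} with whitespace normalised."""
    return {
        m.group(1): " ".join(m.group(3).split())
        for m in PARAM_ENTITY.finditer(dtd_text)
    }


# Hypothetical DTD fragment, used only to exercise the function.
SAMPLE_DTD = """
<!ENTITY % inline "em | strong | code">
<!ENTITY % block  "para | list | table">
<!ENTITY % common.attrib "id ID #IMPLIED">
<!ELEMENT para (#PCDATA | %inline;)*>
"""

for name, text in parameter_entities(SAMPLE_DTD).items():
    print(f"{name}: {text}")
```

The entity names (here inline, block, common.attrib) are the candidate pattern labels; the replacement text shows which elements or attributes each grouping collects.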
From t.haraguti at computer.org Sun Dec 12 12:07:34 1999
From: t.haraguti at computer.org (Tetsuharu Haraguchi)
Date: Mon Jun 7 17:18:30 2004
Subject: Scope entity
Message-ID: <38538F6D.9B26E9DF@computer.org>

Hi, everybody!

I think it is useful to add 'the scope entity' to the linked document.

Examples :

<story>
  <public>
    <abstruct>Show me!</abstruct>
  </public>
  <private>
    <detail>Do not show me!</detail>
  </private>
</story>

or

<story>
  <abstruct scope="public">Show me!</abstruct>
  <detail scope="private">Do not show me!</detail>
</story>

Cheers,
--
Tetsuharu Haraguchi
mailto:t.haraguti@computer.org

From digitome at iol.ie Sun Dec 12 18:09:07 1999
From: digitome at iol.ie (Sean McGrath)
Date: Mon Jun 7 17:18:30 2004
Subject: Announce: Pyxie - an Open Source XML Processing Library for Python
Message-ID: <3.0.6.32.19991212175738.009a52f0@gpo.iol.ie>

All,

I have finally got around to putting the Pyxie library up on the Web at http://www.pyxie.org. I hope some of you find it useful and help me to develop it further - either by submitting problem reports or contributing to the development effort.

regards,

http://www.pyxie.org - an Open Source XML Processing library for Python
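For the attribute form in the "Scope entity" message above, one way a consumer might act on a scope marker is to drop private elements before display. The sketch below is an assumption about how such a marker could be used, not part of the proposal itself: `public_view` is a hypothetical helper, the attribute values are quoted so that the sample is well-formed XML, and the standard library's ElementTree is used rather than Pyxie.

```python
import xml.etree.ElementTree as ET


def public_view(root, hidden="private"):
    """Remove, in place, every element whose scope attribute equals `hidden`."""
    # Snapshot the elements first so removal does not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("scope") == hidden:
                parent.remove(child)
    return root


story = ET.fromstring(
    "<story>"
    '<abstruct scope="public">Show me!</abstruct>'
    '<detail scope="private">Do not show me!</detail>'
    "</story>"
)

print(ET.tostring(public_view(story), encoding="unicode"))
```

Filtering on an attribute keeps the element structure unchanged for readers that ignore scope, whereas the wrapper form (public and private elements) changes the document's nesting; which is preferable depends on the consuming applications.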
From david at megginson.com Sun Dec 12 18:26:34 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:18:30 2004
Subject: A new XHTML PR! (1999-12-10)
Message-ID: <m3aengaudc.fsf@localhost.localdomain>

At the W3C site, I just noticed a new XHTML PR available at

 http://www.w3.org/TR/xhtml1/

I have recently left the W3C's XML activity (for personal and business reasons only -- it just took way too much time), so I am not privy to any internal discussions, but it looks like they've got it right this time around. Here are some highlights after a very, very cursory first skim (I may have gripes later after a more careful reading):

1. A single XHTML Namespace, http://www.w3.org/1999/xhtml

2. Examples of using elements from other Namespaces inside an XHTML document, and of using the XHTML Namespace inside other document types (though there are no strict conformance criteria defined for either yet).

3. All element and attribute names in lower case.

4. The DOCTYPE declaration is still required for strict XHTML conformance (annoying, but I can live with that), and there are still three different DTDs.

I am particularly impressed with the HTML WG and with Tim Berners-Lee for taking the discussion (and debate) out into the open on XML-Dev rather than keeping it locked up inside the W3C cone of silence. A bit of credit should also go to us, the XML-Dev membership, who took a lot of time debating the different sides of several difficult questions.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
From martind at netfolder.com Sun Dec 12 21:53:06 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:18:30 2004
Subject: SAX/C++: First interface draft
In-Reply-To: <14406.59198.949047.2487@localhost.localdomain>
Message-ID: <NBBBJPGDLPIHJGEHAKBAAEEKEJAA.martind@netfolder.com>

Hi David,

Why not also implement the document handler as an interface? Thus we would have:

 #include <cstddef>   // size_t

 class Locator;        // assumed to be declared by the SAX/C++ draft
 class AttributeList;  // assumed to be declared by the SAX/C++ draft

 class IDocumentHandler
 {
 public:
   virtual ~IDocumentHandler () {}  // allows deletion through the interface
   virtual void setDocumentLocator (const Locator &locator) = 0;
   virtual void startDocument (void) = 0;
   virtual void endDocument (void) = 0;
   virtual void startElement (const char * name, const AttributeList &atts) = 0;
   virtual void endElement (const char * name) = 0;
   virtual void characters (const char * ch, size_t length) = 0;
   virtual void ignorableWhitespace (const char * ch, size_t length) = 0;
   virtual void processingInstruction (const char * target, const char * data) = 0;
 };

 class MyDocHandlerImp : public IDocumentHandler
 {
 public:
   MyDocHandlerImp ();
   ~MyDocHandlerImp ();
   void setDocumentLocator (const Locator &locator);
   void startDocument (void);
   void endDocument (void);
   void startElement (const char * name, const AttributeList &atts);
   void endElement (const char * name);
   void characters (const char * ch, size_t length);
   void ignorableWhitespace (const char * ch, size_t length);
   void processingInstruction (const char * target, const char * data);
 protected:
   const Locator * _locator;  // const, since the locator arrives as a const reference
 };

PRO and CON:
------------
The event generator talks to a generic interface, not to a particular implementation.
However, this requires that interface.h be included and that interfaces are inherited by implementations.

Other point of view:
--------------------
SP uses an event record, which tends to reduce the number of interface members. In the OpenJade project, we are thinking of remaking the C++ interface of OpenSP as follows:

 class IDocumentHandler
 {
 public:
   virtual void startElement (const StartElementEvent &event) = 0;
   virtual void endElement (const EndElementEvent &event) = 0;
 };

 class parser : public IDocumentHandler, public SGML_XML_Application
 {
 public:
   void startElement (const StartElementEvent &event);
   void endElement (const EndElementEvent &event);
 };

Note: the use of an interface also allows the class implementation to use multiple inheritance and still be interfaced with the client without problems; this may not be the case for an ordinary class. Here it is the event record which provides the description of the event type.

I understand that SAX saw its origins as a set of Java clas