From marcelo at mds.rmit.edu.au Mon Mar 1 00:49:03 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:32 2004 Subject: Streaming XML and SAX In-Reply-To: <36D82244.DB014ECE@thinlink.com>; from Tom Harding on Sat, Feb 27, 1999 at 08:50:12AM -0800 References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> Message-ID: <19990301114841.B4466@io.mds.rmit.edu.au> On Sat, Feb 27, 1999 at 08:50:12AM -0800, Tom Harding wrote: > David Megginson wrote: > > > No, it still looks like a messy architecture to me, because the > > transport layer has to know about the packets -- it has to parse > > the XML about to get information about what it's looking at, and > > that adds complexity and inefficiency. A clean architecture > > should separate the layers completely, and use XML only where it > > has an obvious advantage over other approaches. > > It's amazing how two people can see things so differently. I think > it's supremely elegant that only the XML processor needs to look at > data coming off the wire. It's also as efficient as it gets. Of > course the software architecture that handles the documents emitted > must be modular and extensible, but the task of parsing is done. It has already been pointed out in this discussion that some environments try to increase the throughput by dispatching documents off to different threads. A system with 50 CPU's is going to be operating as low as 2% capacity if it is forced to pipe the entire parsing load through a single thread. I don't see how you can argue that this is efficient. Nor do I agree that concentrating the workload at a single conceptual point is elegant. It is much more aesthetically pleasing to let the protocol break up packets and let the XML parser parse XML. 
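A minimal sketch of the split Marcelo describes, in which the protocol layer delimits documents and the XML parser only ever sees one complete document. The 4-byte big-endian length prefix, the `read_frames` and `root_tag` helpers, and the sample documents are all assumptions made for illustration; nothing in the thread specifies a framing format.

```python
import io
import struct
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def read_frames(stream):
    """Yield each document's bytes, using a hypothetical 4-byte length prefix."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of stream
        (length,) = struct.unpack(">I", header)
        yield stream.read(length)


def root_tag(document):
    """Stand-in for real per-document work: parse and report the root tag."""
    return ET.fromstring(document).tag


def frame(document):
    """Prefix a document with its length, as the assumed protocol would."""
    return struct.pack(">I", len(document)) + document


wire = io.BytesIO(
    frame(b"<order id='1'/>") + frame(b"<invoice><total>9</total></invoice>")
)

# The framing loop never looks inside a document; each worker parses one.
with ThreadPoolExecutor(max_workers=4) as pool:
    tags = list(pool.map(root_tag, read_frames(wire)))
```

The point of the design is that `read_frames` does no XML work at all, so the documents can be handed to as many workers as are available. In CPython a process pool rather than a thread pool would be needed to spread CPU-bound parsing across processors.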
Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Mon Mar 1 02:13:44 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:32 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: <36D6C618.D44846B6@thinlink.com>; from Tom Harding on Fri, Feb 26, 1999 at 08:04:40AM -0800 References: <36D46640.94081620@thinlink.com> <14036.28838.719355.44002@localhost.localdomain> <36D479F1.28D796D9@thinlink.com> <14037.20555.720649.689770@localhost.localdomain> <36D59762.370372DB@thinlink.com> <14038.35650.792155.191827@localhost.localdomain> <36D6C618.D44846B6@thinlink.com> Message-ID: <19990301131329.B6351@io.mds.rmit.edu.au> On Fri, Feb 26, 1999 at 08:04:40AM -0800, Tom Harding wrote: > David Megginson wrote: > > > -- a general-purpose DOM would be *extremely* inefficient for > > handling things like vector graphics or 3D worlds (to name only > > two), though it is always possible to expose their optimised > > object models through a DOM interface later if necessary. > > In lots of applications, the data can't stay in an XML > representation for very long anyway, because of what you're > integrating it with/displaying it on/routing it through/converting > it to/storing it in/etc... I view the DOM as a standard, OO way of > manipulating the contents of a document. It lets applications get > work done, even without taking an end-to-end OO approach. Perhaps > I'm showing my bias here ;D It's the translation process that hits hardest, however. 
C and FORTRAN compilers rarely build parse trees, because it is much more efficient to generate code directly from token streams. What you seem to be suggesting is that a parser should pump an event stream straight into DOM and then into another domain-specific structure. This is just adding an often gratuitous layer that can incur a massive performance penalty for large documents (a 3D model of a refinery, say). In such circumstances I would much rather build the domain-specific structure straight from the event stream. (In fact, I have serious reservations about using XML at all for 3D model transmission and storage -- the markup tends to grossly outweigh the content, which consists primarily of numbers. Compression during transport _and_ storage would be a must). Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Mon Mar 1 03:14:12 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:09:32 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <000a01be6391$ddcd2750$14f96d8c@NT.JELLIFFE.COM.AU> From: Baden Hughes >I know that XML 1.0 allows you to use 'special' characters as included in >the Unicode 2.0 specification. With the upcoming release of Unicode 3.0 how >will we be able to refer to characters in 3.0 which were not in 2.0 ? The >same way (meaning the actual version of Unicode spec is irrelevant as long >as the method used is included in XML) or some new way ? > >For instance, the Sinhala character set was not in Unicode 2.0 but will be >in 3.0. 
How do I get one of those characters in an XML document ? Or is that >inconsequential to the document per se as it is simply a reference and its >really up to the application to render it correctly ? The document character set of XML is ISO 10646, as used by the Unicode Consortium's character set Unicode. I think most people's strong expectation is that XML will track ISO 10646, just as Unicode tracks it. In fact, I think it is essential that XML automatically tracks ISO 10646: people will always try to do strange and interesting things with characters and codes, and XML should try to allow as much freedom for them to do this as possible. Developers should be very wary of putting type-checking into their systems which will cause future legitimate ISO 10646 characters to fail. For example, when a new character is invented, like the Euro, the only difficulty it should cause is if the font is not upgraded or if the sort/type system doesn't allow new character registration. We certainly need to abandon the expectation that the number of characters is fixed or knowable, which is how some might interpret material from the Unicode Consortium: a character set standard tries to put in what is generally useful against some criteria--if your criteria do not match, then you can quite legitimately decide that your character is not found in the set: is Apple's "apple" character a real character? are variant kanji characters real characters? are roman, fraktur, italic and uncial "a" characters different? Is English "W" a different character (i.e., "UU") from German "W" (i.e. "VV"), when using historical material? In my book I use a dinosaur glyph as a word and would have liked to put it in the index too: why is it not a character? Such questions can never be resolved, but a character set must make a decision based on some selection criteria; and those criteria will not be appropriate in every situation. 
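One way to follow the advice about type-checking is to test characters against the code point ranges of the XML 1.0 Char production rather than against the repertoire of one particular Unicode version. The sketch below assumes that approach; the `is_xml_char` helper and the sample code points are illustrative choices, not anything taken from the thread.

```python
def is_xml_char(code_point):
    """True if code_point lies in the ranges of the XML 1.0 Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


# Hypothetical samples: two characters newer than Unicode 2.0's repertoire,
# one code point outside the BMP, and one control character.
samples = {
    "euro sign": 0x20AC,
    "a Sinhala letter": 0x0D85,
    "outside the BMP": 0x10333,
    "form feed": 0x0C,
}
results = {name: is_xml_char(cp) for name, cp in samples.items()}
```

Because the check is purely a range test, a character added in a later revision of ISO 10646 passes without any change to the code; only the form feed is rejected here.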
The nice thing about markup is it lets us simulate the existence of a character missing from a character set: however, we have no markup conventions yet to do this systematically. There are no standard methods for saying "when you find 'a' in this context, collate it differently" for example (apart from, perhaps, language-tagged elements). Rick Jelliffe From tbray at textuality.com Mon Mar 1 05:22:48 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:33 2004 Subject: Comments on WD-html-in-xml-19990224 Message-ID: <3.0.32.19990226132215.00ba4b00@pop.intergate.bc.ca> At 03:07 PM 2/26/99 -0500, John Cowan wrote: >1) I believe that the introduction of a media type "text/xhtml" is >a mistake. I can see this point of view. >Instead, it would be better to attach a media-type >attribute specifying the formal public identifier of the DTD. ?!? find me somewhere in a W3C or IETF document where the FPI has any standing. Standards-anality aside, this is a real problem, because there is *no interoperable resolution mechanism*. Surely you can't be serious. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomh at thinlink.com Mon Mar 1 05:41:36 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:09:33 2004 Subject: Streaming XML and SAX References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <19990301114841.B4466@io.mds.rmit.edu.au> Message-ID: <36DA2858.43F3EA7A@thinlink.com> Marcelo Cantos wrote: > It has already been pointed out in this discussion that some > environments try to increase the throughput by dispatching documents > off to different threads. A system with 50 CPU's is going to be > operating as low as 2% capacity if it is forced to pipe the entire > parsing load through a single thread. I don't see how you can argue > that this is efficient. Even if you believe that parsing to convert markup into memory structures is slower than back-end processing, if parsing is faster than the stream itself there is no difference in the two approaches. Anyway, in the general case the question is moot because there may be inter-document dependencies, so you have to look inside the document before trying to parallelize. The whole point of this discussion was whether the document terminator ought to be XML or non-XML. Aside from the fact that I haven't yet seen a workable suggestion for a non-XML terminator, it isn't necessary to completely examine a document or convert it to a tree just to find an XML terminator. 
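A rough sketch of finding an XML terminator without building a tree: scan only the markup tokens and count element depth until the root element closes. The `TOKEN` pattern and `split_documents` function are hypothetical names, and the scanner is deliberately simplified (it assumes well-formed input and a DOCTYPE with no internal subset); it is not a conforming parser.

```python
import re

# Markup tokens that matter for depth counting; character data is skipped.
TOKEN = re.compile(
    r"<!--.*?-->"                 # comment
    r"|<!\[CDATA\[.*?\]\]>"       # CDATA section
    r"|<\?.*?\?>"                 # processing instruction or XML declaration
    r"|<!DOCTYPE[^>]*>"           # DOCTYPE without internal subset (simplification)
    r"|</[^>]*>"                  # end tag
    r"|<(?:[^>\"']|\"[^\"]*\"|'[^']*')*>",  # start tag or empty-element tag
    re.S,
)


def split_documents(stream):
    """Yield each complete document from a string of concatenated documents."""
    depth = 0
    start = 0
    for match in TOKEN.finditer(stream):
        token = match.group()
        if token.startswith(("<!--", "<![CDATA[", "<?", "<!DOCTYPE")):
            continue  # these tokens do not change element depth
        if token.startswith("</"):
            depth -= 1
        elif not token.endswith("/>"):
            depth += 1
        if depth == 0:  # the root element has just closed
            yield stream[start:match.end()].strip()
            start = match.end()


stream = '<a><b x=">"/><!-- </a> --></a>\n<?pi data?><c/>'
documents = list(split_documents(stream))
```

Each yielded string could then be handed to a full parser, possibly in parallel. The sketch does nothing about inter-document dependencies, which would still need to be resolved before documents are processed out of order.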
As Nathan pointed out, you could write a semi-parser to find terminators and then actually parse documents in parallel, but you'd need to suggest a way for dealing with inter-document dependencies. From martind at netfolder.com Mon Mar 1 15:35:46 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:09:33 2004 Subject: Streaming XML and SAX In-Reply-To: <004901be6348$9c8af4a0$c9a8a8c0@thing2> Message-ID: Hi Nathan, It seems like something is backwards here! If an application is processing a series of documents, once it has a universal type name for that document (root element name + namespace), it knows how it wants to process the document and doesn't need a Pi. (What's a Gi? Is that XML?) Yes, obviously a document (or name it the way you want - I don't want to argue about streams vs documents :-) may not have any PI, and may not have any name space reference. Thus, only the GI is then used for the pattern match in this case. Sorry, I forgot to specify the complete resolution mechanism, which is based on pattern matching. The router uses this pattern match to dispatch to the right interpreter. The elements matched are: a) PI b) name space definition c) Root GI Any of these elements could be used as a pattern match. Yes, a GI is part of SGML and therefore part of XML. It is simply the element name. In your example it could be something like "vendor-id". So, because the interpreter is based on a pattern match mechanism, everything that could be used for a pattern match can work. Actually, we use the three elements mentioned above. 
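The pattern match over PI, name space and root GI could be sketched with a SAX handler along the following lines. The `Router` class, the `route` function, the rule table and the interpreter names are all hypothetical; the thread does not show how the actual router is implemented.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class Router(ContentHandler):
    """Chooses an interpreter from a prolog PI, the root namespace, or the root GI."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules        # hypothetical table: (kind, value) -> interpreter
        self.choice = None
        self.seen_root = False

    def _match(self, kind, value):
        if self.choice is None:   # the first successful match wins
            self.choice = self.rules.get((kind, value))

    def processingInstruction(self, target, data):
        if not self.seen_root:    # only PIs ahead of the root element count
            self._match("pi", target)

    def startElementNS(self, name, qname, attrs):
        if self.seen_root:
            return                # only the root element is inspected
        self.seen_root = True
        namespace, local_name = name
        self._match("namespace", namespace)
        self._match("gi", local_name)


def route(document, rules):
    """Parse one document (bytes) and return the matched interpreter, if any."""
    router = Router(rules)
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(router)
    parser.parse(io.BytesIO(document))
    return router.choice


rules = {
    ("pi", "invoice-app"): "invoice interpreter",
    ("namespace", "urn:example:orders"): "order interpreter",
    ("gi", "vendor-id"): "vendor interpreter",
}
by_gi = route(b"<vendor-id>42</vendor-id>", rules)
by_namespace = route(b'<o:order xmlns:o="urn:example:orders"/>', rules)
by_pi = route(b"<?invoice-app v1?><doc/>", rules)
```

For brevity the sketch parses the whole document just to pick an interpreter. A real router would presumably hand the remaining parse events to the chosen interpreter instead of discarding them.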
Also, you should be able to use the same parser for all document types and then do the routing on the parse events, saving you from having to do a "pre-parse" to determine the universal type name. Glad to see we both agree on the same mechanism. This is exactly what we do. The router mechanism is just a temporary interpreter included in the parser to load/unload the interpreters. To be precise, the mechanism is: a) run the router as a special kind of interpreter b) parse the document (always) c) determine which interpreter to load, then load it and let it run d) the interpreter runs until the end of the document e) at the end of the document, the router/interpreter is loaded and run again until a new interpreter is recognized f) go to a) The parser is always the same; only the interpreters are loaded/run, and the router is just a special kind of interpreter. Do you have a more efficient mechanism to suggest? Regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com From boblyons at unidex.com Mon Mar 1 16:46:26 1999 From: boblyons at unidex.com (Robert C. Lyons) Date: Mon Jun 7 17:09:33 2004 Subject: Need help getting IE 5.0 Message-ID: <01BE63D8.2D163B30@cc398234-a.etntwn1.nj.home.com> IE 5.0 is no longer available on the Microsoft web site. It will be available on March 18, but I can't wait that long. I downloaded a copy of ie5setup.exe from www.download.com. When I ran ie5setup.exe, I got the following error message: "Setup was unable to download information about installation sites." 
(Note that the ie5setup.exe program is small, and it needs to pull many IE 5.0 components from the Microsoft web site.) Any ideas on how I can install IE 5.0 on my computer (before March 18)? Thanks. Bob ------ Bob Lyons EC Consultant Unidex Inc. 1-732-975-9877 Fax: 1-732-975-9866 boblyons(at)unidex.com From Livinsb at rbos.co.uk Mon Mar 1 16:59:02 1999 From: Livinsb at rbos.co.uk (Livingstone, Stephen) Date: Mon Jun 7 17:09:33 2004 Subject: Need help getting IE 5.0 Message-ID: <217258E84FF7CF11B4630001FA44B2D502CF055A@REFROWTECX1> I have 24MB of IE5.0 files here as separate CAB files. I could mail them to you tomorrow if you want? steven Steven Livingstone BSc MSc GradInstP Corporate Systems Development (TCN) Royal Bank Of Scotland. mailto:livinsb@rbos.co.uk +44 0131 523 4354 [x24354] Networking Technical Associates, Glasgow, Scotland. mailto:ntw_uk@hotmail.com +44 07771-957-280 > -----Original Message----- > From: Robert C. Lyons [SMTP:boblyons@unidex.com] > Sent: Monday, March 01, 1999 4:40 PM > To: xml-dev@ic.ac.uk > Subject: Need help getting IE 5.0 > > > *** Warning : this message originates from the Internet **** > > IE 5.0 is no longer available on the Microsoft web site. > It will be available on March 18, but I can't wait that long. > > I downloaded a copy of ie5setup.exe from www.download.com. > When I ran ie5setup.exe, I got the following error message: > "Setup was unable to download information about installation sites." > > (Note that the ie5setup.exe program is small, and it needs to pull > many IE 5.0 components from the Microsoft web site.) 
> > Any ideas on how I can install IE 5.0 on my computer (before March > 18)? > > Thanks. > > Bob > > ------ > Bob Lyons > EC Consultant > Unidex Inc. > 1-732-975-9877 > Fax: 1-732-975-9866 > boblyons(at)unidex.com > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer.. 'Internet e-mails are not necessarily secure. The Royal Bank of Scotland plc does not accept responsibility for changes made to this message after it was sent.' xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Mar 1 17:59:53 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? References: <000301be6361$272d2480$5ffa6ccb@baden> Message-ID: <36DAD563.5222F16A@locke.ccil.org> Baden Hughes wrote: > For instance, the Sinhala character set was not in Unicode 2.0 but will be > in 3.0. How do I get one of those characters in an XML document ? 
Or is that > inconsequential to the document per se as it is simply a reference and its > really up to the application to render it correctly ? There is a discrepancy between the prose, which says "legal Unicode/10646 characters" and references old versions of these standards, and the BNF, which says the Char production handles everything except known control characters (and even some of those). Don't worry. The problem will be resolved. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Mar 1 18:24:13 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <3.0.32.19990301102354.00b09cf0@pop.intergate.bc.ca> At 12:58 PM 3/1/99 -0500, John Cowan wrote: >> For instance, the Sinhala character set was not in Unicode 2.0 but will be >> in 3.0. How do I get one of those characters in an XML document ? > >There is a discrepancy between the prose, which says "legal Unicode/10646 >characters" and references old versions of these standards, and >the BNF, which says the Char production handles everything except >known control characters (and even some of those). John's right. 
And it's not the Sinhala that first brought it home, but the Euro character, which is clearly OK per production [2] but isn't a "legal yadda yadda yadda" per the particular amendment of 10646/Unicode that the XML spec references. The W3C has some I18n heavies trying to figure out what to do - life is made more complicated by the fact that the Unicode people and the IETF i18n people don't always point in the same direction, sigh; did you know the BOM was legal in UTF-8? And of course by the fact that Unicode/10646 is a moving target. But the bottom line is (see the public errata to the XML spec) that production [2] is normative; both in theory and in practice, XML processors pass through everything in that range. In practice, I've never actually seen anything outside of the BMP, but the experts agree they're showing up real soon now. How to get it in? Something like &#x10333; I expect. As a programmer, it'll show up either as two UTF-16 surrogates or 4+-byte UTF-8 string, neither of which will look in the slightest like hex 10333. -Tim From cowan at locke.ccil.org Mon Mar 1 18:36:42 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:33 2004 Subject: Comments on WD-html-in-xml-19990224 References: <3.0.32.19990226132215.00ba4b00@pop.intergate.bc.ca> Message-ID: <36DADDF2.298060AF@locke.ccil.org> Tim Bray wrote: > ?!? find me somewhere in a W3C or IETF document where the FPI has > any standing. Standards-anality aside, this is a real problem, > because there is *no interoperable resolution mechanism*. Surely > you can't be serious. Sure I'm serious. 
The XHTML document (clause 3.1) gives three standard FPIs for XHTML Strict, XHTML Transitional, and XHTML Frameset, and *requires* that every strictly conforming XHTML document have a DOCTYPE that refers to one of them. The associated URL (systemid) is allowed to vary, but not the FPI. This is modeled on HTML 4.0, of course; clause 7.2 of that standard mandates the appearance of one of three FPIs as well. Similarly, HTML 3.2 (third clause) documents mandate the appearance of a single FPI, and HTML 2.0 (RFC 1866, clause 3.3) mandates the appearance of one of five FPIs. Resolution is irrelevant; it's the FPI itself that says what kind of (X)HTML you have. Table of FPIs: -//W3C//DTD XHTML 1.0 Strict//EN -//W3C//DTD XHTML 1.0 Transitional//EN -//W3C//DTD XHTML 1.0 Frameset//EN -//W3C//DTD HTML 4.0//EN -//W3C//DTD HTML 4.0 Transitional//EN -//W3C//DTD HTML 4.0 Frameset//EN -//W3C//DTD HTML 3.2 Final//EN -//IETF//DTD HTML 2.0//EN -//IETF//DTD HTML 2.0 Level 2//EN -//IETF//DTD HTML 2.0 Level 1//EN -//IETF//DTD HTML 2.0 Strict//EN -//IETF//DTD HTML 2.0 Strict Level 1//EN -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Mar 1 19:10:38 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? 
References: <3.0.32.19990301102354.00b09cf0@pop.intergate.bc.ca> Message-ID: <36DAE5FA.5BA2D70E@locke.ccil.org> Timothaeus Bray scripsit: > [D]id you know the BOM was legal in UTF-8? The BOM isn't just a BOM, it's also the ZWNBSP (zero-width non-breaking space; no, I do not know how to pronounce that acronym) character, and is interpreted as a BOM only at the beginning of UCS-2 or UTF-16 documents. Not to worry; the character is as near to a no-op as Unicode allows for. > And of course by the fact that Unicode/10646 is a moving target. Only sort of. 8859-1 is theoretically a moving target too, except that all the slots are full; CP 1252 is a moving target that has just moved (by adding the euro at 0x80). In all these cases, characters can be added (in principle) but not moved or deleted (any more). > In practice, > I've never actually seen anything outside of the BMP, but the > experts agree they're showing up real soon now. Not until Unicode 4.0, unless someone wants to use the private-use planes 15 and 16. > How to get it in? Something like &#x10333; I expect. Exactly so. Or the decimal NCR equivalent. Two NCRs representing the surrogates separately would be erroneous by both Unicode/10646 definitions and XML definitions. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Mar 1 19:26:18 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <3.0.32.19990301112529.00c0a5e0@pop.intergate.bc.ca> At 02:09 PM 3/1/99 -0500, John Cowan wrote: >Timothaeus Bray scripsit: > >> [D]id you know the BOM was legal in UTF-8? > >The BOM isn't just a BOM, it's also the ZWNBSP (zero-width >non-breaking space; no, I do not know how to pronounce that >acronym) character, and is interpreted as a BOM only at the >beginning of UCS-2 or UTF-16 documents. Not to worry; the character is >as near to a no-op as Unicode allows for. I think there is reason for worry. In a UTF-16 document, you can have a BOM and then the <?xml ...?>, and that PI will still be recognized as the XML declaration. The spec is, I think, pretty clear, that a ZWNBSP or any other *data* character before the XML declaration is verboten. So... it seems that in UTF8, a ZWNBSP as first character in the file isn't a data character. Blecch. -Tim From cowan at locke.ccil.org Mon Mar 1 19:43:07 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? 
References: <3.0.32.19990301112529.00c0a5e0@pop.intergate.bc.ca> Message-ID: <36DAED75.86978455@locke.ccil.org> Tim Bray wrote: > So... it seems that in UTF8, > a ZWNBSP as first character in the file isn't a data character. Can you quote chapter and verse for this, either Unicode or 10646? The latter spec tells you that the sequence EF BB BF may be used as a *signature* at the beginning of UTF-8 data (since it is unlikely to occur in any other kind), but does not IMHO imply that the sequence is removable or doesn't represent a real ZWNBSP. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Mar 1 19:57:25 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <3.0.32.19990301115652.00c2d770@pop.intergate.bc.ca> At 02:41 PM 3/1/99 -0500, John Cowan wrote: >Tim Bray wrote: > >> So... it seems that in UTF8, >> a ZWNBSP as first character in the file isn't a data character. > >Can you quote chapter and verse for this, either Unicode or 10646? That is *exactly* the question that's now being pursued, and is I gather is in play right now in the IETF (or was that Unicode, I forget which). -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From russell at latticesemi.com Mon Mar 1 20:06:20 1999 From: russell at latticesemi.com (Jerry Russell) Date: Mon Jun 7 17:09:33 2004 Subject: Announce: XML directory/search engine Message-ID: There is a new site devoted to sites and documents created in XML. You can now begin submitting your sites. The new site is at: worldwideweave.com -------------------------------------------- Jerry Russell Product Engineer Lattice Semiconductor 408-428-6400 x. 274 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From daniela at cnet.com Mon Mar 1 20:09:54 1999 From: daniela at cnet.com (Daniel Austin) Date: Mon Jun 7 17:09:33 2004 Subject: Content-Document-Type: was (Re: MIME types vs. DOCTYPE) Message-ID: <77A952A6B467D211855D00805F9521F11492E9@cnet10.cnet.com> Greetings, Here I am speaking for myself, not the HTML Working Group or CNET: > -----Original Message----- > From: Walter Underwood [mailto:wunder@infoseek.com] > Sent: Friday, February 26, 1999 9:42 AM > To: xml-dev@ic.ac.uk > Cc: www-html-editor@w3.org > Subject: Re: Content-Document-Type: was (Re: MIME types vs. DOCTYPE) > The objection about thin clients or palmtops not wanting to download > large files doesn't really hold water. XML will generally be the > smallest files. 
Mine are almost always smaller than the corresponding > HTML. Powerpoint, PDF, JPEG -- those are big files. This is simply incorrect. The limited capabilities of thin clients and the expense of transmission of the information require capabilities-based analysis and profiling of documents on a per-client basis. As an example, consider a web-enabled cellphone such as this one: http://www.attws.com/business/pocketnet/index.html. The transmission costs to this device vary greatly worldwide, from ~$1/minute in the US to ~$22/min in Nairobi (actually you can only get basic cell phone via satellite in Nairobi, but let's pretend.) If I send a 1/2 megabyte XHTML file to this device, including its 100K CSS stylesheet, the user is entirely justified in bringing legal action against me. The page would cost many tens or hundreds of dollars to send, and of course could not be displayed. In fact the client phone would necessarily display an HTTP error message (or its equivalent) on the tiny screen. Not to mention the costs of transmitting the inevitable ~12k banner ad, which again cannot be displayed. (Information may want to be free but information providers want to get paid.) At this point in time, no method other than MIME types exists for informing the client of the type of content arriving, without first downloading the entire file and then checking it, an obvious absurdity. Doctypes, FPIs, etc. have all been suggested, but none of these solutions provides the necessary level of transaction control required to identify the content prior to content reception. Given the massive costs involved, the client must always be allowed to reject content prior to downloading the entire file. > Adding an XML-specific HTTP header line makes HTTP 1.1 more complex > (shudder), and imposes an extra coding and testing burden on HTTP > implementations. Also, it does nothing for XHTML over other > transports, > like SMTP or FTP. 
It is also introducing a new set of dependencies for all XML documents. Not feasible. > Essentially, this is document information, not protocol information. > It belongs in the document. To describe the document out-of-line, > use RDF, not HTTP headers. Thin clients will almost necessarily reject all RDF documents (and most XML documents in general). RDF is complex and experimental; I am unconvinced that a cell phone should have to deal with it. > Pragmatically, HTTP Content-type isn't even reliable. Somebody will > decide that Excel and XML are the same thing, and start serving > spreadsheets as text/xml. Cell phones have to deal with that world, > and adding things to the HTTP spec doesn't fix ignorant sysadmins. True; unfortunate; costly for the victims; possibly legally actionable. > XHTML Spec comment: the spec doesn't mention application/xml. > It should. > If application/xml is never appropriate for XHTML (say, the UTF-16 > encoding is forbidden), then say so. The XHTML spec is very clear on this, explicitly stating the MIME types that can be used. No other MIME types are *ever* appropriate. With MIME types being used for document type identification, sending a document with the wrong MIME type guarantees an error. > > XHTML Spec comment: Are the Strict, Transitional, and Frameset DTDs > subsets or extensions? Or neither? Is one a subset of another? These > intentions should be spelled out in the spec so that future versions > won't break them. > The 3 XHTML DTDs are neither subsets nor extensions in a literal sense. They correspond as closely as possible to the HTML 4.0 DTDs of the same names. While to some extent the 'strict' DTD is a subset of the other two, it also uses different content models for elements with the same name. One could not, for practical purposes, use it as an external subset and include the frameset DTD as an internal DTD subset without conflict between their content models.
I will not attempt to justify the division of HTML into these 3 groupings - this was decided by the HTML 4.0 committee and is loosely justified by the HTML 4.0 specification. Current attempts are designed to follow this existing prior art to the greatest extent possible. Regards, D- xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mda at discerning.com Mon Mar 1 21:04:56 1999 From: mda at discerning.com (Mark D. Anderson) Date: Mon Jun 7 17:09:34 2004 Subject: xml style questions Message-ID: <072201be6426$a82897c0$0200a8c0@mdaxke.mediacity.com> before you scream, this isn't about style sheets, and it isn't about attributes vs. elements. rather, this is more how to structure your document/data. any words of wisdom regarding: 1) having an extra collection layer in the xml tree, like vs. > 2) having PCDATA vs. having a distinct "comment" or "description" element child: this is the description of this thing vs. this is the description of this thing -mda xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Mon Mar 1 21:26:47 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:09:34 2004 Subject: xml style questions References: <072201be6426$a82897c0$0200a8c0@mdaxke.mediacity.com> Message-ID: <36DB05E6.B4CD1C4D@allette.com.au> Mark D. Anderson wrote: > any words of wisdom regarding: > > 1) having an extra collection layer in the xml tree, like > > vs. > > Would you consider and to be siblings? If so, I wouldn't compartmentalise. Alternatively, if can appear after but these have different significance, I would compartmentalise. > 2) having PCDATA vs. having a distinct "comment" or "description" element child: > > this is the description of this thing > > > vs. > > this is the description of this thing > > If you are going to have a need to deal with in some way and it could get mixed up with other #PCDATA, I'd create an element. My instinct would be to mark it up as an element unless the overhead was excessive, but I think that sort of thing is driven by (a) immediate or forseeable requirements, followed by (b) personal taste. -- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Mon Mar 1 22:20:52 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:34 2004 Subject: Yet another niggling XML syntax question References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com> Message-ID: <36DB1161.893184EE@eng.sun.com> roddey@us.ibm.com wrote: > > Does the following violate the 'partial markup in entity' rule of XML? > > > "> > > %Whole; I'll assume you intended to work with parameter entities; then as Richard pointed out this can be legal ... if the three syntax errors are corrected (" So I'm assuming > that this is ok, that the prohibition against partial markup refers to the > eventual use of the entity, not to the definition thereof? Right -- this would violate _validity_ constraints (but a nonvalidating parser should accept it just fine): "> %Part1;%Part2; ]> Another way to make an error out of your declarations is to make the PEs be external, not internal -- then they'd not match full grammatical productions. - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Mon Mar 1 22:32:12 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:34 2004 Subject: Streaming XML and SAX In-Reply-To: <36DA2858.43F3EA7A@thinlink.com>; from Tom Harding on Sun, Feb 28, 1999 at 09:40:40PM -0800 References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <19990301114841.B4466@io.mds.rmit.edu.au> <36DA2858.43F3EA7A@thinlink.com> Message-ID: <19990302093128.A19583@io.mds.rmit.edu.au> On Sun, Feb 28, 1999 at 09:40:40PM -0800, Tom Harding wrote: > Marcelo Cantos wrote: > > > It has already been pointed out in this discussion that some > > environments try to increase the throughput by dispatching > > documents off to different threads. A system with 50 CPU's is > > going to be operating as low as 2% capacity if it is forced to > > pipe the entire parsing load through a single thread. I don't see > > how you can argue that this is efficient. > > Even if you believe that parsing to convert markup into memory > structures is slower than back-end processing, if parsing is faster > than the stream itself there is no difference in the two approaches. That is an awfully big _if_ to enshrine in a standard (if that's where all this broo-ha-ha ultimately ends up). What if client and server are on the same machine? 
> Anyway, in the general case the question is moot because there may > be inter-document dependencies, so you have to look inside the > document before trying to parallelize. The question is far from moot since an enormous class of very interesting problems does not fall into this category. There are myriad applications for self-contained XML packets. Furthermore, inter-document dependencies are not a fundamental problem for parallelisation. Threads can talk to each other and block waiting for other threads to finish parsing, while allowing other threads to continue independent tasks. You are suggesting that because in some cases it isn't trivial to parallelise we should therefore never even allow the possibility of such a thing to occur. > The whole point of this discussion was whether the document > terminator ought to be XML or non-XML. Aside from the fact that I > haven't yet seen a workable suggestion for a non-XML terminator, I am frankly incredulous that there are no systems, protocols or standards available today that adequately address the need to stream multiple logical units of information. This is not a new problem. Let me suggest one off the top of my head: send a null terminated decimal length, followed by a document. This is sufficient to dispatch data to multiple threads and raise concurrency levels. Any further processing can be done inside the parsers. > it > isn't necessary to completely examine a document or convert it to a > tree just to find an XML terminator. You can do better than a well-formedness parser? What are you going to do, grep for ? > As Nathan pointed out, you > could write a semi-parser to find terminators and then actually > parse documents in parallel, but you'd need to suggest a way for > dealing with inter-document dependencies. You get the threads to talk. Inter-document dependencies are not and need not be a protocol issue.
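The framing suggested above (a null-terminated decimal length, followed by a document) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names write_framed, read_framed and parse_all are hypothetical, the stream is assumed to be a blocking binary file-like object whose read(n) returns fewer than n bytes only at end of input, and the thread pool stands in for whatever dispatch mechanism a real system would use.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def write_framed(stream, document: bytes) -> None:
    """Write one document as: ASCII decimal length, a NUL byte, the document."""
    stream.write(str(len(document)).encode("ascii") + b"\x00" + document)


def read_framed(stream):
    """Yield each framed document from a binary stream until end of input."""
    while True:
        digits = bytearray()
        while True:
            byte = stream.read(1)
            if byte == b"":
                if digits:
                    raise ValueError("stream ended inside a length prefix")
                return  # clean end of stream between documents
            if byte == b"\x00":
                break  # NUL terminates the decimal length
            digits += byte
        if not digits.isdigit():
            raise ValueError("length prefix is not a decimal number")
        length = int(digits.decode("ascii"))
        body = stream.read(length)
        if len(body) != length:
            raise ValueError("stream ended inside a document")
        yield body


def parse_all(stream, workers: int = 4):
    """Split the stream without looking inside the XML, then parse in a pool.

    Returns the root element name of each document, in stream order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        roots = pool.map(ET.fromstring, read_framed(stream))
        return [root.tag for root in roots]
```

The splitter never inspects the XML itself; it only counts octets, so each document can be handed to a parser as soon as its last byte arrives.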
At the end of the day, the problem of streaming documents is not a difficult one to solve at the protocol level (HTTP-NG will have it built in, AFAIK). Why do you want to complicate life by overloading the parser's job? Actually, my real question is, what on earth do you hope to gain? Or is this just a philosophical preference thing? Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MarkM at SapphireGroup.com Mon Mar 1 23:02:22 1999 From: MarkM at SapphireGroup.com (Mark Murphy) Date: Mon Jun 7 17:09:34 2004 Subject: Looking for XML Filtering Projects Message-ID: <000201be6437$8833fd40$4f9646d1@opal.sapphiregroup.com> At XTech '99, I am delivering a presentation on information filtering applied to XML -- given a source of new/changed XML-encoded data, determining which of a set of people are interested in that XML based on filter criteria. I want to make sure I mention any relevant work in this area, besides my own and other projects I'm already aware of (e.g., XTenit.com, XML-enabled search tools like sgrep). If you are working on information filtering applied to XML, and you would like your project mentioned at XTech '99, please send me an e-mail (MarkM@SapphireGroup.com) with relevant details, and I'll be sure to include you in my presentation! Mark L. Murphy The Sapphire Group, Inc. MarkM@SapphireGroup.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomh at thinlink.com Mon Mar 1 23:33:36 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:09:34 2004 Subject: Streaming XML and SAX References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <19990301114841.B4466@io.mds.rmit.edu.au> <36DA2858.43F3EA7A@thinlink.com> <19990302093128.A19583@io.mds.rmit.edu.au> Message-ID: <36DB2399.BC94B7E4@thinlink.com> Marcelo Cantos wrote: > Furthermore, inter-document dependenies are not a fundamental problem > for parallelisation. Threads can talk to each other and block waiting > for other threads to finish parsing, while allowing other threads to > continue independent tasks. You are suggesting that because in some > cases it isn't trivial to parallelise we should therefore never even > allow the possibility of such a thing to occur. I was not suggesting that. I merely said that in the general case, knowing how to parallelize requires looking at the data in the stream. I propose that this data, like everything else, be stored in XML and that before doing anything else, the endpoint ought to parse it. I'm sorry if I gave the impression that I think XP is the solution to everything. I merely think it would be useful for a lot of things. If you're judging it on the criteria of being able to accomplish something that was impossible before, I'm not surprised you're disappointed. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Mon Mar 1 23:39:37 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:34 2004 Subject: Looking for XML Filtering Projects In-Reply-To: <000201be6437$8833fd40$4f9646d1@opal.sapphiregroup.com>; from Mark Murphy on Mon, Mar 01, 1999 at 06:02:08PM -0500 References: <000201be6437$8833fd40$4f9646d1@opal.sapphiregroup.com> Message-ID: <19990302103859.B19583@io.mds.rmit.edu.au> On Mon, Mar 01, 1999 at 06:02:08PM -0500, Mark Murphy wrote: > At XTech '99, I am delivering a presentation on information filtering > applied to XML -- given a source of new/changed XML-encoded data, > determining which of a set of people are interested in that XML based on > filter criteria. > > I want to make sure I mention any relevant work in this area, besides my own > and other projects I'm already aware of (e.g., XTenit.com, XML-enabled > search tools like sgrep). Our database server (SIM) has a facility for querying a database at regular intervals. The results are masked with a last-modified filter, which is updated each time the query is issued. This means that users can run a session, build up queries (either by creating new ones, or merging prior result sets with boolean operators) and then save them. They can then have those saved queries executed regularly on any new or changed data and a notification sent to them in an appropriate manner (e.g. an emailed page of abstracts and accompanying links). The beauty of this approach is that it conflates the concept of filter and query.
Hence, users wishing to filter documents for items of interest have the full expressive querying power of the database with which to define their peculiar interests. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dalapeyre at mulberrytech.com Mon Mar 1 23:47:28 1999 From: dalapeyre at mulberrytech.com (Deborah Aleyne Lapeyre) Date: Mon Jun 7 17:09:34 2004 Subject: xml style questions In-Reply-To: <072201be6426$a82897c0$0200a8c0@mdaxke.mediacity.com> Message-ID: Mark Anderson wrote: >any words of wisdom regarding: >1) having an extra collection layer in the xml tree, like > >vs. >> If you have ANY reason to think you may need the collection layer, put it in. Reasons you might want it include things like: a) Reuse - s are frequently used together and you want electronic cut-and-paste and/or even a really stupid parsing algorithm to be able to find them all easily. The converse is the same, if you want to ignore all s, group them. b) You need some sort of behavior or formatting at the collection level. This could be as simple as wanting a new indent level in the generated toc. This is the most common reason in practice. c) For correct hierarchical layering, s just aren't as big and important as s so they don't belong at the same level. etc. Yes, much of this could also be done by asking if you are the first among your siblings, etc. But sometimes event-driven processing is easier or faster than tree walking, and a containing element gives you your event. >2) having PCDATA vs. 
having a distinct "comment" or "description" element >child: >this is the description of this thing >> >vs. >this is the description of this thing > As a style issue, I favor the explicit description. Makes programming life easier all around, costs next to nothing. Programs can easily find the two equivalent, but, in my experience, people don't. --Debbie ====================================================================== Deborah Aleyne Lapeyre mailto:dalapeyre@mulberrytech.com Mulberry Technologies, Inc. http://www.mulberrytech.com 17 West Jefferson Street Direct Phone: 301/315-9633 Suite 207 Phone: 301/315-9631 Rockville, MD 20850 Fax: 301/315-8285 ---------------------------------------------------------------------- Mulberry Technologies: A Consultancy Specializing in SGML and XML ====================================================================== From ralph at fsc.fujitsu.com Tue Mar 2 00:58:04 1999 From: ralph at fsc.fujitsu.com (Ralph Ferris) Date: Mon Jun 7 17:09:34 2004 Subject: HyBrick Support for XPointer Message-ID: <3.0.5.32.19990301165634.00a7f3a0@pophost.fsc.fujitsu.com> Previous announcements of HyBrick's support for XPointer have not detailed which features are supported. One reason of course is that the discussion of XPointer continues within the W3C WG. With the announcement of the most recent version of HyBrick resulting in a significant number of downloads, it looks like a good time to state which features are available.
Based on the March, 1998 XPointer draft, HyBrick users can test: - All absolute loc terms: root(), html(), id(), origin() - All relative loc terms: child(), ancestor(), descendant(), following() preceding(), fsibling(), psibling() - The attr() loc term Quick Intro: psibling Example Here's a quick introduction to using these features: - Go to the Samples\XLink-sample directory - Open the readme.xml file - Inside the first xlink element, under the first locator element: insert: - In the first p element start tag after Overview, add the attribute/value pair id="p6". - Go to the dtd directory and open the sample.dtd file. - Add after the > Call for Participation to one of the workshops of WET ICE > > IEEE 8th International Workshops on Enabling Technologies: > Infrastructure for Collaborative Enterprises. > > 16-18 June 1999 > Stanford University, California USA > > For more information: http://www.ida.liu.se/conferences/WETICE/ > ______________________________________________________ > > WET ICE Workshop on Integrating XML and Distributed Object Technologies > > For more information: http://www.cerc.wvu.edu/workshop2/xmlobjects.html > > Call for Papers and Workshop Description > > The Internet world is being transformed before our eyes as open standards > such as > XML are being rapidly adopted. The XML technologies are being seen as > harbinger of various new functionality in numerous domains ranging from > electronic commerce to electronic publishing to healthcare delivery to > manufacturing to > insurance. Various object-oriented technologies and standards such as Java, > CORBA and DCOM have also progressed rapidly in the past few years. At this > time, > the industry and academia are seriously looking at the intersection of these > technologies and what it means to the future of the object-web paradigm. 
> This > workshop aims to bring together participants who are seriously investigating > the combined use of these technologies to support practical application > needs > in a variety of domains. The goal of this workshop is to investigate how XML > and Distributed Object technologies such as Java, CORBA and DCOM can be > integrated leveraging the strengths each have to offer. > > Integrating XML and Distributed Object technologies > Advances in XML: DOM, SAX, XSL, Schemas, XLink as it relates to Objects > Advances in CORBA 3.0, Java, DCOM as it relates to XML > Tools and utilities that facilitate integration of XML and > object-technologies > Application of XML and Object technologies in E-commerce, Finance, > Healthcare, > Publishing, Insurance and Manufacturing and System Integration. The > purpose of these examples should be to show specific successful integration > approaches of XML and objects. > > Workshop Chairs: > > V. "Juggy" Jagannathan > Concurrent Engineering Research Center > West Virginia University > P.O. Box 6506 > Morgantown, WV, USA 26506-6506 > Email: juggy@cerc.wvu.edu > > Matthew Fuchs > Veo Systems, Inc. > Email: matt@veosystems.com > > ____________________________________________________________________________ > ______ > > About WET ICE > > WET ICE is an annual, international forum for state-of-the-art research in > enabling > technologies for collaboration. > > WET ICE '99 will consist of parallel, three-day workshops on different > topics related > to collaboration technology. Each workshop will include paper presentations > and working > group discussions, with additional joint keynote sessions and a final joint > session > to summarize each groups' findings. > > What sets WET ICE apart from larger conferences is that the workshops are > kept > small enough to promote fruitful discussions on the > latest technology developments, directions, problems, and requirements. 
Each > group > will produce a summary report which will appear in the post-proceedings to > be published > by IEEE Computer Society Press. > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murata at apsdc.ksp.fujixerox.co.jp Tue Mar 2 02:31:41 1999 From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto) Date: Mon Jun 7 17:09:34 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <199903020231.AA03678@murata.apsdc.ksp.fujixerox.co.jp> John Cowan writes: >Tim Bray writes: >> In practice, >> I've never actually seen anything outside of the BMP, but the >> experts agree they're showing up real soon now. > >Not until Unicode 4.0, unless someone wants to use the private-use >planes 15 and 16. It is my understanding that Unicode 3.0 will have many ideographic characters which are outside of the BMP. >John Cowan writes: >Tim Bray writes: >> So... it seems that in UTF8, >> a ZWNBSP as first character in the file isn't a data character. > >Can you quote chapter and verse for this, either Unicode or 10646? >The latter spec tells you that the sequence EF BB BF may be used as >a *signature* at the beginning of UTF-8 data (since it is unlikely >to occur in any other kind), but does not IMHO imply that the >sequence is removable or doesn't represent a real ZWNBSP. Attached is quoted from A2 of N1396 ISO/IEC 10646-1 Corrigendum no. 2 (First draft - revised to 30 April 1996), which was (is?) 
available at http://osiris.dkuug.dk/JTC1/SC2/WG2/docs/N1396.doc The para most relevant to your question is: >An application receiving data may either use these signatures to >identify the coded representation form, or may ignore them and treat >FEFF as the ZERO WIDTH NO-BREAK SPACE character. How do you interpret this "or"? One could argue that when EF BB BF is recognized as a signature, it is not treated as the ZWNS. Unfortunately, every description about the BOM (even for UCS-2 or UTF-16) is unclear and subject to different interpretations, as I see it. Cheers, Makoto Fuji Xerox Information Systems Tel: +81-44-812-7230 Fax: +81-44-812-7231 E-mail: murata@apsdc.ksp.fujixerox.co.jp --------------------------------------------------------- Annex F (informative) The use of "signatures" to identify UCS This annex describes a convention for the identification of features of the UCS, by the use of "signatures" within data streams of coded characters. The convention makes use of the character ZERO WIDTH NO-BREAK SPACE, and is applied by a certain class of applications. When this convention is used, a signature at the beginning of a stream of coded characters indicates that the characters following are encoded in the UCS-2 or UCS-4 coded representation, and indicates the ordering of the octets within the coded representation of each character (see 6.3). It is typical of the class of applications mentioned above, that some make use of the signatures when receiving data, while others do not. 
The signatures are therefore designed in a way that makes it easy to ignore them. In this convention, the ZERO WIDTH NO-BREAK SPACE character has the following significance when it is present at the beginning of a stream of coded characters: UCS-2 signature: FEFF; UCS-4 signature: 0000 FEFF; UTF-8 signature: EF BB BF; UTF-16 signature: FEFF. An application receiving data may either use these signatures to identify the coded representation form, or may ignore them and treat FEFF as the ZERO WIDTH NO-BREAK SPACE character. If an application which uses one of these signatures recognises its coded representation in reverse sequence (e.g. hexadecimal FFFE), the application can identify that the coded representations of the following characters use the opposite octet sequence to the sequence expected, and may take the necessary action to recognise the characters correctly. NOTE - The hexadecimal value FFFE does not correspond to any coded character within ISO/IEC 10646. Makoto Fuji Xerox Information Systems Tel: +81-44-812-7230 Fax: +81-44-812-7231 E-mail: murata@apsdc.ksp.fujixerox.co.jp From jborden at mediaone.net Tue Mar 2 03:29:42 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:09:35 2004 Subject: Content-Document-Type: was (Re: MIME types vs.
DOCTYPE) In-Reply-To: <77A952A6B467D211855D00805F9521F11492E9@cnet10.cnet.com> Message-ID: <001601be645c$24395ae0$d3228018@jabr.ne.mediaone.net> Daniel Austin wrote: > > At this point in time, no method other than MIME types exists for > informing the client of the type of content > arriving, without first downloading the entire file and then > checking it, an > obvious absurdity. Doctypes, FPIs, > etc. have all be suggested, but none of these solutions provides the > necessary level of transaction control required to identify the content > prior to content reception. Given the massive costs involved, the client > must always be allowed to reject content prior to downloading the entire > file. Please explain what: Content-type: text/xhtml can possibly do for you that: Content-type: text/xml; doctype="http://www.w3.org/xhtml.dtd" cannot do. (Note: the use of doctype = dtd is an example, the doctype can point to any URI. Just like the XML namespace URI, the doctype URI serves as a unique identifier and implies no particular meaning. > > > > > Adding an XML-specific HTTP header line makes HTTP 1.1 more complex > > (shudder), and imposes an extra coding and testing burden on HTTP > > implementations. Also, it does nothing for XHTML over other > > transports, > > like SMTP or FTP. > > > It is also introducing a new set of dependencies for all XML > documents. Not feasible. Huh!? Both these statements are patently false. As per the RFC 822 and following specs, inclusion of a new header does not in any way alter the syntax of HTTP or SMTP. It is specifically allowed. Both SMTP and HTTP can deal with headers, FTP of course could care less about text/xhtml or any other MIME header so this is moot. The point is to create a generalizable mechanism for content negotiation depending on an XML namespace or DTD or Schema. XHTML like HTML 1.0 - HTML 4.0 is a soon to be historical oddity. 
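As a rough sketch of the mechanism being argued for, a client could inspect the Content-type header, including a doctype parameter, before deciding whether to download the body at all. The doctype parameter is only the proposal made in this thread, not a registered MIME parameter, and the function name accepts and the set of known document types are hypothetical. Header parsing here uses Python's standard email.message.Message.

```python
from email.message import Message


def accepts(content_type_header: str, known_doctypes: set) -> bool:
    """Decide from the header alone whether a client would take the content.

    Assumes the proposed (non-standard) form:
        text/xml; doctype="<URI identifying the document type>"
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    if msg.get_content_type() != "text/xml":
        return False
    # get_param strips the surrounding quotes; it returns None if absent.
    doctype = msg.get_param("doctype")
    return doctype in known_doctypes
```

A thin client would call something like this on the response headers and abandon the transfer when it returns False, which is the "reject content prior to downloading the entire file" requirement quoted earlier in the message.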
I have nothing against HTML, just why create a hack to solve a particular problem for XHTML version 1.0 e.g. text/xhtml, when a generalizable solution can be created for any XML document type e.g. text/xml; doctype=".../XHTML10.dtd". This gives the best of both worlds. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Tue Mar 2 08:49:06 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:35 2004 Subject: xml style questions Message-ID: <01BE6490.F0D17680@grappa.ito.tu-darmstadt.de> Mark D. Anderson wrote: > any words of wisdom regarding: > > 1) having an extra collection layer in the xml tree, like > > vs. > > Another reason for a collection layer is human readability. This is especially important if the document is normally edited/read by humans, less so if it is designed only to be written/read by machine. -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stefan at objectfarm.org Tue Mar 2 11:45:24 1999 From: stefan at objectfarm.org (Stefan Kreutter) Date: Mon Jun 7 17:09:35 2004 Subject: XPointer question Message-ID: Hello there! 
given the following XML-snippet: Bart Simpson Homer Simpson

can I use the following XPointer to get the customer ID of Bart Simpson:

root().child(all, customer).child(1,name).string(1, "Bart Simpson").ancestor(1, customer).attr(id)

I guess this should work, since the XPointer grammar allows an OtherTerm to follow a StringTerm, but I'm not sure I understood the spec completely. Since string() might return portions of multiple nodes (see 3.7 of WD-xptr-19980202), applying ancestor() seems a little problematic.

BTW, is there a typo in the XPtr-spec? In grammar rule [2] it says:

[2] OtherTerms ::= OtherTerm | OtherTerm . OtherTerm

Shouldn't that be:

[2] OtherTerms ::= OtherTerm | OtherTerm . OtherTerms

This would allow XPointers of any length, not just one or two OtherTerms.

-Stefan

From msabin at cromwellmedia.co.uk Tue Mar 2 12:07:28 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:09:35 2004
Subject: Encoding detection again ...
Message-ID:

I've been browsing through the archives for an answer to this question, but I haven't been able to find anything that seems to give a completely unambiguous answer ...

Appendix F of the spec says that given a document starting with the 4 octet sequence,

00 3C 00 3F

I'm to infer BOM-less big-endian UTF-16, and given a document starting with,

3C 00 3F 00

I'm to infer BOM-less little-endian UTF-16.

What I want to know is: why could these sequences not equally represent (respectively) big-endian UCS-2 or little-endian UCS-2?
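The check being questioned here can be sketched in a few lines of C. This is only an illustration of the two octet patterns quoted above, not code from the spec or from any parser; the enum and function names are hypothetical. Note that the sketch deliberately reports only "16-bit big-endian" or "16-bit little-endian", because the four octets alone cannot distinguish UTF-16 from UCS-2.

```c
#include <stddef.h>

/* Hypothetical labels invented for this sketch; not part of any parser API. */
typedef enum {
    GUESS_UNKNOWN,
    GUESS_16BIT_BE,  /* 00 3C 00 3F: "<?" as big-endian 16-bit units, no BOM */
    GUESS_16BIT_LE   /* 3C 00 3F 00: "<?" as little-endian 16-bit units, no BOM */
} encoding_guess;

/* Classify a document by its first four octets only. */
encoding_guess guess_encoding(const unsigned char *buf, size_t len)
{
    if (len < 4)
        return GUESS_UNKNOWN;
    if (buf[0] == 0x00 && buf[1] == 0x3C && buf[2] == 0x00 && buf[3] == 0x3F)
        return GUESS_16BIT_BE;
    if (buf[0] == 0x3C && buf[1] == 0x00 && buf[2] == 0x3F && buf[3] == 0x00)
        return GUESS_16BIT_LE;
    return GUESS_UNKNOWN;
}
```

A caller would pass the first octets read from the stream and then still has to decide, by some other means such as an encoding declaration or a MIME header, which 16-bit encoding is actually in use.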
In other words, surely these octet sequences are ambiguous, and hence the encoding should be resolved definitively with either, or, or an appropriate MIME header, ie., Content-type: text/xml; charset="utf-16" or, Content-type: text/xml; charset="ISO-10646-UCS-2" Just so there's no confusion ... I'm assuming: 1. Unicode == UTF-16 2. UCS-2 != UTF-16 (because UCS-2 lacks UTF-16's support for characters outside the BMP). -- Miles Sabin Cromwell Media Internet Systems Architect 5/6 Glenthorne Mews +44 (0)181 410 2230 London, W6 0LJ msabin@cromwellmedia.co.uk England xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.Kay at icl.com Tue Mar 2 13:41:05 1999 From: Michael.Kay at icl.com (Kay Michael) Date: Mon Jun 7 17:09:35 2004 Subject: xml style questions Message-ID: <93CB64052F94D211BC5D0010A80013310EB351@wwmessd3.bra01.icl.co.uk> > > any words of wisdom regarding: > > 1) having an extra collection layer in the xml tree, > 2) having PCDATA vs. having a distinct "comment" or > "description" element child: Firstly, the extra markup can be used to impose extra validity constraints, which means you application has to do less checking. Secondly, the extra markup can make XSL stylesheets a lot easier to write. (In fact, without it they can be impossible...) So if you're auto-generating the XML and if space isn't at a premium I would include the extra tags. If it's manually edited it's a different story... Mike Kay xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Mar 2 14:39:40 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:35 2004 Subject: Yet another niggling XML syntax question References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com> <36DB1161.893184EE@eng.sun.com> Message-ID: <36DBF7DE.27FAFCD0@locke.ccil.org> David Brownell wrote: > Right -- this would violate _validity_ constraints (but a nonvalidating > parser should accept it just fine): > > > "> > > %Part1;%Part2; > ]> > > It's not 100% clear to me whether the reference to Part2 violates the WFC "PEs in Internal Subset", which states (inter alia) that "parameter-entity references can occur only where markup declarations can occur". After "%Part1;" which resolves to " Message-ID: <36DC012D.FAA63A78@locke.ccil.org> Jonathan Borden wrote: > Please explain what: > > Content-type: text/xhtml > > can possibly do for you that: > > Content-type: text/xml; doctype="http://www.w3.org/xhtml.dtd" > > cannot do. (Note: the use of doctype = dtd is an example, the doctype can > point to any URI. Just like the XML namespace URI, the doctype URI serves as > a unique identifier and implies no particular meaning. I agree, except that I would prefer to see an FPI rather than (or in addition to) a URI. That would be extensible to HTML as well as XHTML, and therefore to the text/html media type as well as the text/xml media type. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... 
(Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Mar 2 15:40:27 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:35 2004 Subject: XML and special Characters : unicode v3.0 ? References: <199903020231.AA03678@murata.apsdc.ksp.fujixerox.co.jp> Message-ID: <36DC062C.73214454@locke.ccil.org> MURATA Makoto wrote: > It is my understanding that Unicode 3.0 will have many ideographic > characters which are outside of the BMP. The Unicode Consortium has indicated on its mailing list that no non-BMP characters will appear in Unicode 3.0. (Unless Vertical Extension A is being put in Plane 2 after all?) > >An application receiving data may either use these signatures to > >identify the coded representation form, or may ignore them and treat > >FEFF as the ZERO WIDTH NO-BREAK SPACE character. > How do you interpret this "or"? I interpret it as "inclusive or", "and/or", "vel". > One could argue that when EF BB BF > is recognized as a signature, it is not treated as the ZWNS. I think that it may or may not be treated as the ZWNBSP. In any event, the whole annex is informative, and describes "a convention [...] applied by a certain class of applications". It is reasonable to suppose that XML is not in that class of applications, at least so far as UTF-8 recognition is concerned. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ajd100 at NAmerica.mot.com Tue Mar 2 16:54:50 1999 From: ajd100 at NAmerica.mot.com (Dutra Juliana-AJD100) Date: Mon Jun 7 17:09:35 2004 Subject: FW: Voice XML Message-ID: <11EF19296147D211A7C100805F312AE7C027A0@s-il06ar.corp.mot.com> fyi... > Chiming in on voice standards: AT&T, Lucent Technologies and Motorola will > announce today joint cooperation on a software language that allows users > to access the Internet by voice The companies are hoping that the > language, called VXML, which stands for voice extensible markup language, > will become a standard for voice commands to the Internet. > > http://www.msnbc.com/news/245787.asp > > Juliana Dutra - E-Business Strategies > ===================================== > Motorola, Communications Enterprise, MMS > Loc: IL06, Phone = 847-538-3101 Fax = 847-538-7791 > Intranet = http://mms.mot.com/ebusiness/ > ===================================== > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From kyu-hwang.yeon at bauer-partner.de Tue Mar 2 18:08:26 1999 From: kyu-hwang.yeon at bauer-partner.de (Kyu Hwang Yeon) Date: Mon Jun 7 17:09:35 2004 Subject: I wonder ... Message-ID: <99Mar2.210244gmt+0100.27779@gatekeeper.bauer-partner.de> Hi I am looking for a way to reuse *.dtd files. 
For example, I have book.dtd and library.dtd. Then, I'd like to reuse book.dtd inside library.dtd without rewriting the whole of library.dtd. (Maybe this is too silly a question for people who subscribe to this newsgroup.) I wonder whether it is possible? Otherwise, must certain conditions be satisfied for that reuse?

Best regards,
Kyu Hwang

From nikita.ogievetsky at csfb.com Tue Mar 2 19:18:38 1999
From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita)
Date: Mon Jun 7 17:09:35 2004
Subject: XML behind XMLBars
Message-ID: <9C998CDFE027D211B61300A0C9CF9AB442470A@SNYC11309>

Hi everybody,

Let me present to the community XMLBars: XML-driven menu bars. I intended it to serve as a simple and visually perceivable example of using XML to ease web design. It seems to have turned into a nice web GUI tool. I would highly appreciate your judgment and critique. Your contribution is very welcome.

Here is my sin: namespaces are used to point to a collection of document fragments (rather than element definitions). Why not?

It is more convenient for me to say than to use XPointer or Entity:

It is easier for people to read (not only parsers matter). By changing the namespace (URN), all the references defined with its alias will change automatically. It is also great for internationalization. And, of course, I can define multiple namespaces of fragments. The fact that a URN doesn't have to be a real URL makes the possibilities even greater.

-----------------
XML behind XMLBars

Menu Markup Language, if I may :)

Menu bar rendering and formatting information is stored in XML and cached in DOM by a parser.
Submenus are rendered only when parent menu is activated. Action to be fired on a menu click event is also stored in XML. Action can be a Link to a web page or a chunk of JavaScript code. It can also be a Sub-Action. In this case child menu inherits parents action. Action can be parameterized. For example in the following fragment xml-dev archive http://www.lists.ic.ac.uk/hypermail/xml-dev//index.html 1999 99 January 01< /SUB> February 02 two leaf submenus when clicked will point to: http://www.lists.ic.ac.uk/hypermail/xml-dev/9901/index.html and http://www.lists.ic.ac.uk/hypermail/xml-dev/9902/index.html Most of magazines and monthly publications have similar structure. Reusable group of 12 submenu -months will help. The 3 years of XML-DEV archive will be as short as: xml-dev archive http://www.lists.ic.ac.uk/hypermail/xml-dev//index.html 1999 99 1998 98 1997 97 The second optional attribute xql:select will filter first 4 months for current year and months starting with February for the year 1997. XMLBars implemented using IE5beta parser can be found at http://www.cogx.com/XMLBar. (Sorry, still working on cross-browser implementation). Nikita Ogievetsky Cogitech Inc. http://www.cogx.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DuCharmR at moodys.com Tue Mar 2 19:39:08 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:09:35 2004 Subject: I wonder ... Message-ID: <49092BAEAC84D2119B0600805FD40F9F120DBD@MDYNYCMSX1> >I'd like to reuse book.dtd inside library.dtd without rewriting >whole library.dtd. (First, this is really a question for the xml-l list or comp.text.xml. 
xml-dev is for people developing XML software.) This is what external parameter entities are for. Parameter entities store pieces of a DTD, and "external" means "stored in a separate file" (or the equivalent construct in your operation system). For example, if book.dtd is the following: your library.dtd file could look like this: %bookdtd; Bob DuCharme www.snee.com/bob see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Tue Mar 2 20:12:31 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:09:36 2004 Subject: Content-Document-Type: was (Re: MIME types vs. DOCTYPE) Message-ID: <008201be64e8$0fc8ca00$0b2e249b@fileroom.Synapse> John Cowan wrote: >Jonathan Borden wrote: > >> Please explain what: >> >> Content-type: text/xhtml >> >> can possibly do for you that: >> >> Content-type: text/xml; doctype="http://www.w3.org/xhtml.dtd" >> >> cannot do. (Note: the use of doctype = dtd is an example, the doctype can >> point to any URI. Just like the XML namespace URI, the doctype URI serves as >> a unique identifier and implies no particular meaning. > >I agree, except that I would prefer to see an FPI rather than (or >in addition to) a URI. That would be extensible to HTML as well as >XHTML, and therefore to the text/html media type as well as the >text/xml media type. > This is a good idea. 
A general way to employ the Content-type header to specify a document type is:

Content-type: text/xml; element="html"; fpi="-//W3C//DTD XHTML 1.0 Strict//EN"; uri="http://www.w3.org/XHTML.DTD"

This should apply to text/html, text/xml, text/sgml, application/xml etc.

deja vu all over again :-)

Jonathan Borden
http://jabr.ne.mediaone.net

From richard at goon.stg.brown.edu Tue Mar 2 20:22:01 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:09:36 2004
Subject: Yet another niggling XML syntax question
References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com> <36DB1161.893184EE@eng.sun.com> <36DBF7DE.27FAFCD0@locke.ccil.org>
Message-ID: <36DC4828.C133D8F2@goon.stg.brown.edu>

John Cowan wrote:

> > Right -- this would violate _validity_ constraints (but a nonvalidating
> > parser should accept it just fine):
> > > > > > > "> > > > > %Part1;%Part2; > > ]> > > > > > >
> It's not 100% clear to me whether the reference to Part2 violates
> the WFC "PEs in Internal Subset"

To restate your message slightly: The problem with %Part2; is that the markup unit starts with %Part1; and ends with %Part2;, which is something parsed entities aren't supposed to do.

Note: The only reason you can get away with is that section 4.3.2 of the XML 1.0 standard says that all internal parsed entities are by definition well formed. It's apparently an exception to the "proper nesting" rule, meant specifically to allow cutting and pasting of parameter entities.
This is also the motivation for suppressing the addition of spaces before and after the entities inside the quotation marks above. Would anyone agree that the standard is not altogether clear on this point? (Tim, if my comments are correct, it might make sense to edit them, in some form, into your annotated version of the spec.) -- Richard Goerwitz PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0 For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From slotter at maya.com Tue Mar 2 20:40:55 1999 From: slotter at maya.com (Dave Slotter) Date: Mon Jun 7 17:09:36 2004 Subject: Expat API In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120DBD@MDYNYCMSX1> Message-ID: Hi. I'm new to this list (just subscribed today) and searched the archives on expat, but it failed to answer my question. My question is: where is the documentation on how to use the expat API? I downloaded version 1.0.2 and ported the code to run the sample program on my Macintosh, but I'm pretty much dead in the water. I tried sending email to the author (James Clark) twice in the last few days, but I have so far failed to receive a response. The comments in the header files do not seem to be sufficient. What I am trying to do is parse some well-formed XML such as the following example so that I can get the tags (which the example shows me how to do) and then obtain the text. 
----- cat gray ----- For example, I would like to be able to obtain the TAG as well as the FOO ID (12345678), then the tag along with the enclosed text (cat) then the tag along with its enclosed text (gray). However, the sample program only shows how to retrieve the tags. If anyone has some example code, I would be grateful. If someone has documentation, that would be appreciated as well. -Dave Slotter xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 2 21:10:33 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:36 2004 Subject: I wonder ... In-Reply-To: <99Mar2.210244gmt+0100.27779@gatekeeper.bauer-partner.de> References: <99Mar2.210244gmt+0100.27779@gatekeeper.bauer-partner.de> Message-ID: <14044.21265.752204.753493@localhost.localdomain> Kyu Hwang Yeon writes: > I am looking for a way to reuse *.dtd files. For example, I have book.dtd > and library.dtd. Then, I'd like to reuse book.dtd inside library.dtd > without rewriting whole library.dtd. (Maybe it is too silly question for > people who subscribe this new group) I wonder it is possible? Otherwise, > should certain conditions be satisfied for that reuse? Try this: %book; All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Tue Mar 2 21:22:34 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:36 2004 Subject: Encoding detection again ... References: Message-ID: <36DC55FF.64C4408D@Eng.Sun.COM> Miles Sabin wrote: > > Appendix F of the spec say that given a document > starting with the 4 octet sequence, > > 00 3C 00 3F > > I'm to infer BOM-less big-endian UTF-16, and > given a document starting with, > > 3C 00 3F 00 > > I'm to infer BOM-less little-endian UTF-16. That is, the appendix _suggests_ (in a non-normative fashion) that's the way to go. > What I what to know is: why could these > sequences not equally represent (respectively) > big-endian UCS-2 or little-endian UCS-2? They could ... > > 1. Unicode == UTF-16 > 2. UCS-2 != UTF-16 (because UCS-2 lacks UTF-16's > support for characters outside the BMP). Put it this way: if you assume UTF-16, you're safe either way because UTF-16 is a superset. It'd be reasonable for an autodetecting algorithm to support "downgrading" its guess from UTF-16 to UCS-2, and should probably do so if it's reporting encoding mismatches as fatal errors. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Tue Mar 2 21:48:18 1999 From: jes at kuantech.com (Jeffrey E. 
Sussna) Date: Mon Jun 7 17:09:36 2004 Subject: I wonder ... In-Reply-To: <14044.21265.752204.753493@localhost.localdomain> Message-ID: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com> This works fine, but (at least in IE 5) only for a single level. That is, you can't have another entity reference inside "book.dtd". To me, this significantly limits its usefulness (imagine not allowing a #include inside a file that was #included). Jeff -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of David Megginson Sent: Tuesday, March 02, 1999 1:09 PM To: XML Development Subject: re: I wonder ... Kyu Hwang Yeon writes: > I am looking for a way to reuse *.dtd files. For example, I have book.dtd > and library.dtd. Then, I'd like to reuse book.dtd inside library.dtd > without rewriting whole library.dtd. (Maybe it is too silly question for > people who subscribe this new group) I wonder it is possible? Otherwise, > should certain conditions be satisfied for that reuse? Try this: %book; All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 2 21:51:09 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:36 2004 Subject: I wonder ... In-Reply-To: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com> References: <14044.21265.752204.753493@localhost.localdomain> <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com> Message-ID: <14044.23685.951735.30695@localhost.localdomain> Jeffrey E. Sussna writes: > This works fine, but (at least in IE 5) only for a single > level. That is, you can't have another entity reference inside > "book.dtd". To me, this significantly limits its usefulness > (imagine not allowing a #include inside a file that was #included). If IE 5 behaves this way, it is because of a bug, not because of a limitation in the XML spec -- since XML support in IE is in early days, I expect that Microsoft will fix this problem before the official release. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From Clark.Cooper at corporate.ge.com Tue Mar 2 23:04:11 1999
From: Clark.Cooper at corporate.ge.com (Cooper, Clark (CORP, Consultant))
Date: Mon Jun 7 17:09:36 2004
Subject: Expat API
Message-ID: <014CB98EB81ED011B3E900805FE2D47A04F74B42@X01SCHCORPGE>

Dave Slotter wrote:

> My question is: where is the documentation on how to use the expat
> API? I downloaded version 1.0.2 and ported the code to run the sample
> program on my Macintosh, but I'm pretty much dead in the water

As far as I know, the include file is the documentation. Expat is used by the perl module XML::Parser, which I maintain, but if you're having trouble with just the include file, you'd be absolutely lost looking at Expat.xs (I get lost looking at it sometimes).

If you can use perl, I'd like to suggest XML::Parser as a kinder, gentler interface to expat.
If you're not a perl kinda fella, here's a small example of using expat:

#include "xmlparse.h"
#include <stdio.h>
#include <stdlib.h>

#define MAXLEV 512
#define BUFSIZE 4096

char indent[(MAXLEV + 1) * 2];
int level = 0;

/* Print the start tag and its attributes, then indent one more level. */
void start(void *data, const XML_Char *name, const XML_Char **atts)
{
  int offset;

  printf("\n%s> %s", indent, name);
  /* atts is a NULL-terminated list of name/value pairs. */
  while (*atts) {
    printf(" %s='%s'", atts[0], atts[1]);
    atts += 2;
  }

  if (level >= MAXLEV) {
    fprintf(stderr, "Exceeded max level\n");
    exit(-1);
  }

  offset = level * 2;
  indent[offset] = ' ';
  indent[offset + 1] = ' ';
  indent[offset + 2] = '\0';
  level++;
}  /* End start handler */

/* Drop back one indent level and print the end tag. */
void end(void *data, const XML_Char *name)
{
  level--;
  indent[level*2] = '\0';
  printf("\n%s< %s\n", indent, name);
}  /* End end handler */

/* Print character data; txt is not NUL-terminated, so use len. */
void text(void *data, const XML_Char *txt, int len)
{
  int i;

  printf("%s- ", indent);
  for (i = 0; i < len; i++)
    putchar(txt[i]);
}  /* End text handler */

int main(int argc, char **argv)
{
  XML_Parser prs;
  int stat;
  FILE *doc;

  if (argc < 2) {
    fprintf(stderr, "No filename supplied\n");
    exit(-1);
  }

  doc = fopen(argv[1], "r");
  if (! doc) {
    fprintf(stderr, "Couldn't open %s\n", argv[1]);
    exit(-1);
  }

  indent[0] = '\0';
  prs = XML_ParserCreate(NULL);
  XML_SetElementHandler(prs, start, end);
  XML_SetCharacterDataHandler(prs, text);

  /* Feed the file to the parser one buffer at a time. */
  while (! feof(doc)) {
    int cnt;
    void *buff = XML_GetBuffer(prs, BUFSIZE);

    if (! buff) {
      fprintf(stderr, "Ran out of memory\n");
      exit(-1);
    }

    cnt = fread(buff, 1, BUFSIZE, doc);
    stat = XML_ParseBuffer(prs, cnt, 0);
    if (! stat) {
      fprintf(stderr, "Parse error at line %d, column %d\n",
              XML_GetCurrentLineNumber(prs),
              XML_GetCurrentColumnNumber(prs));
      exit(-1);
    }
  }

  fclose(doc);

  /* Tell the parser that the document is complete. */
  stat = XML_ParseBuffer(prs, 0, 1);
  if (! stat) {
    fprintf(stderr, "Parse error at line %d, column %d\n",
            XML_GetCurrentLineNumber(prs),
            XML_GetCurrentColumnNumber(prs));
    exit(-1);
  }

  XML_ParserFree(prs);
  return 0;
}  /* End main */

--
Clark Cooper Logic Technologies,Inc cccooper@ltionline.com
(518) 388-7451 650 Franklin St., Suite 304 coopercc@netheaven.com
Schenectady, NY 12305

From bmhughes at ozemail.com.au Tue Mar 2 23:29:42 1999
From: bmhughes at ozemail.com.au (Baden Hughes)
Date: Mon Jun 7 17:09:36 2004
Subject: XML and special Characters : unicode v3.0 ?
In-Reply-To: <36DAE5FA.5BA2D70E@locke.ccil.org>
Message-ID: <000d01be64fc$1a3a09e0$0dce6ccb@baden>

Tim Bray writes:

> > In practice,
> > I've never actually seen anything outside of the BMP, but the
> > experts agree they're showing up real soon now.

John Cowan writes:

> Not until Unicode 4.0, unless someone wants to use the private-use
> planes 15 and 16.

Uh, that's gonna be a problem. How would you put a PUA character in an XML doc? Still by the U+...? (We have around 800 of them for the languages we work with!!)

Baden

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From falk at icon.at Tue Mar 2 23:34:08 1999 From: falk at icon.at (Falk, Alexander) Date: Mon Jun 7 17:09:36 2004 Subject: Please send non-English XML example documents Message-ID: Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... Name: Falk, Alexander.vcf Type: application/octet-stream Size: 1062 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990302/dff30928/FalkAlexander.obj From msabin at cromwellmedia.co.uk Wed Mar 3 12:12:30 1999 From: msabin at cromwellmedia.co.uk (Miles Sabin) Date: Mon Jun 7 17:09:36 2004 Subject: Encoding detection again ... Message-ID: David Brownell wrote, > Put it this way: if you assume UTF-16, you're > safe either way because UTF-16 is a superset. Err ... is that true? Maybe I'm being a bit obsessive about my interpretation of the various standards docs, but as far as I can see UCS-2 isn't a subset of UTF-16. The BMP S-zone codes (D800-DFFF) are undefined but reserved in UCS-2, and so should not occur in a purportedly UCS-2 stream. I would expect a processor which encountered such codes to either, 1. Spit out an error and give up. or, 2. Quietly ignore them and continue processing with the next 2 octets. Obviously these codes are defined and legal in UTF-16, so an incorrect assumption of UTF-16 when the stream was in fact broken UCS-2 would produce unpredictably incorrect behaviour (ie. the processor might continue processing a broken doc in an indeterminate way). 
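To make the S-zone point concrete, here is a minimal C sketch of the scan a processor could run over a stream it believes to be UCS-2. The function name and the `big_endian` flag are hypothetical, chosen for this sketch only; it assumes the byte order is already known and simply reports whether any 16-bit unit falls in D800-DFFF. What to do when one is found (reject the stream, or reinterpret it as UTF-16) is exactly the policy question under discussion and is left to the caller.

```c
#include <stddef.h>

/*
 * Hypothetical helper for this sketch: returns 1 if any complete 16-bit
 * code unit in buf lies in the S-zone (D800-DFFF), 0 otherwise.
 * A trailing odd octet is ignored.
 */
int has_surrogate_units(const unsigned char *buf, size_t len, int big_endian)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        /* Assemble one 16-bit unit according to the assumed byte order. */
        unsigned int unit = big_endian
            ? ((unsigned int)buf[i] << 8) | buf[i + 1]
            : ((unsigned int)buf[i + 1] << 8) | buf[i];

        if (unit >= 0xD800 && unit <= 0xDFFF)
            return 1;
    }
    return 0;
}
```

A stream for which this returns 0 reads the same under either interpretation; a stream for which it returns 1 is only meaningful as UTF-16.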
In any case, on a less finickety note, I'd quite like to be able to compute string lengths UCS-2 style where that's appropriate, because 2*byte- length is a bit simpler than the UTF-16 equivalent ;-) Anyway, here's a slightly updated version of a proposal I mailed to Tim Bray yesterday ... In the absence of an appropriate MIME header the octet sequences, 1. FE FF 2. FF FE 3. 00 3C 00 3F 4. 3C 00 3F 00 may be inferred to be, 1. big-endian indeterminately encoded 2 octet characters. 2. little-endian indeterminately encoded 2 octet characters. 3. BOM-less big-endian indeterminately encoded 2 octet characters. 4. BOM-less little-endian indeterminately encoded 2 octet characters. If either of the following PIs are found, or, in cases (1) and (2), if *no* PI is found, then encoding is resolved to UTF-16. Otherwise if, is found then encoding is resolved to UCS-2. This very complicated and isn't a zillion miles away from the current handling of UTF-8 vs. ISO 8859-x vs. US-ASCII. Cheers, Miles -- Miles Sabin Cromwell Media Internet Systems Architect 5/6 Glenthorne Mews +44 (0)181 410 2230 London, W6 0LJ msabin@cromwellmedia.co.uk England xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msabin at cromwellmedia.co.uk Wed Mar 3 12:45:45 1999 From: msabin at cromwellmedia.co.uk (Miles Sabin) Date: Mon Jun 7 17:09:36 2004 Subject: Encoding detection again ... Message-ID: Sorry to follow up my own posting, but one thing needs a bit of clarification, and one typo needs correction. I wrote, > David Brownell wrote, > > Put it this way: if you assume UTF-16, you're > > safe either way because UTF-16 is a superset. 
> > Err ... is that true? > > Maybe I'm being a bit obsessive about my > interpretation of the various standards docs, but > as far as I can see UCS-2 isn't a subset of > UTF-16. The question of UCS-2 being, or not being a subset of UTF-16 is a bit of a red herring. It is undoubtedly true that the set of octet pairs which are legal UCS-2 characters is a subset of the set of octet pairs which are legal UTF-16 characters. Appendix F suggests that octet sequences which could equally well be interpreted as UTF-16 or UCS-2 may be assumed to be UTF-16, and *doesn't* include a clause stating that this assumption should be revised in the light of an explicit XML encoding declaration. I think that clause should be added, in much the same way as it is for UTF-8 vs. 8859-X. Now the typo ... > This very complicated and isn't a zillion miles away > from the current handling of UTF-8 vs. ISO 8859-x > vs. US-ASCII. Please insert the word 'isn't' in the obvious place ;-) Cheers, Miles -- Miles Sabin Cromwell Media Internet Systems Architect 5/6 Glenthorne Mews +44 (0)181 410 2230 London, W6 0LJ msabin@cromwellmedia.co.uk England xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Wed Mar 3 13:30:59 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:36 2004 Subject: SAX and DTDHandler Message-ID: <9f7499ae.36dd3931@aol.com> Hi Everyone, I've been playing around with SAX and several of the parser implementations (primarily Sun's and IBM's). The basics of DocumentHandler and ErrorHandler are straight forward and work well. The interfaces EntityResolver and DTDHandler are still fuzzy. 
I've searched for documents on these but have not found anything of any depth. My primary question is will SAX allow me to parse a DTD? It doesn't seem so. DTDHandler only handles unparsed Entity declarations (like binary data) and Notation declarations. If it is the case that SAX does not parse DTDs due to the fact that it does not want to perform validation then why bother with the above two cases? I guess I don't understand the design philosophy in these respects. All help is appreciated. Thanks, - Mike ----------------------------------------------- Michael C. Daconta Author of Java 2 and JavaScript for C/C++ Programmers Author of C++ Pointers and Dynamic Memory Management Sun Certified Java Programmer and Developer http://www.gosynergy.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Wed Mar 3 13:33:19 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:09:36 2004 Subject: DTD for Bibliographic Notation In-Reply-To: Message-ID: Has anybody written a DTD for bibliographies? Are there any standards efforts in this area? To be usable, this DTD would have to be public domain or explicitly allow unrestricted reuse. I probably don't need to modify it, but at a minimum I need to be able to republish it. 
+-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mintert at irb.informatik.uni-dortmund.de Wed Mar 3 14:00:44 1999 From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert) Date: Mon Jun 7 17:09:37 2004 Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd In-Reply-To: Your message of Sun, 28 Feb 1999 22:02:37 +0100. <01BE6366.0E5EF230.jarle.stabell@dokpro.uio.no> Message-ID: <199903031400.PAA23631@brown.informatik.uni-dortmund.de> --------- > There's a very nice document at: > > http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm > > Cheers, > Jarle Stabell Thanks, Jarle! Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd (copied from the above URL) with nsgmls. I'm using nsgmls 1.3 on SunOS 5.6 (Solaris 2). I already parsed xml instances without problems but in this case it doesn't work. 
Following are the first lines of nsgmls output: sm@brown(/tmp/sm){590}: /tmp/sm/sp-1.3/nsgmls/nsgmls -E 10 -w xml -s REC-xml-19980210.xml /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:17:W: named character reference /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:19:E: "X2014" is not a function name /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:61:17:W: named character reference /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:61:19:E: "X201C" is not a function name /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:62:17:W: named character reference /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:62:19:E: "X201D" is not a function name /tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:101:9:E: document type does not allow element "ABSTRACT" here /tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:142:8:E: document type does not allow element "PUBSTMT" here /tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:146:11:E: document type does not allow element "SOURCEDESC" here /tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:149:10:E: document type does not allow element "LANGUSAGE" here /tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:153:13:E: document type does not allow element "REVISIONDESC" here /tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:189:36:W: character "<" is the first character of a delimiter but occurred as data /tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:371:8:E: end tag for "HEADER" which is not finished /tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:787:7:W: character "<" is the first character of a delimiter but occurred as data I enabled XML support as described on http://www.jclark.com/sp/xml.htm Set the SP_CHARSET_FIXED environment variable to YES. Set the SP_ENCODING environment variable to XML. Set the SGML_CATALOG_FILES environment variable to point to the file pubtext/xml.soc. Use the -wxml option. setenv SP_CHARSET_FIXED YES What's wrong? Any help is appreciated. Thanks in advance. Bye, Stefan. 
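The four setup steps quoted from the SP page can be sketched as below, in Python. The helper name nsgmls_invocation is hypothetical, the install root is the one that appears in the transcript, and the sketch only builds the command line and environment without running nsgmls.

```python
import os
import posixpath


def nsgmls_invocation(sp_root, document):
    """Build (argv, env) for an XML-mode nsgmls run; nothing is executed."""
    env = dict(os.environ)
    env["SP_CHARSET_FIXED"] = "YES"   # step 1
    env["SP_ENCODING"] = "XML"        # step 2
    # Step 3: point the catalog search at pubtext/xml.soc.
    env["SGML_CATALOG_FILES"] = posixpath.join(sp_root, "pubtext", "xml.soc")
    # Step 4: -wxml, plus -s as in the transcript.
    argv = [posixpath.join(sp_root, "nsgmls", "nsgmls"), "-wxml", "-s", document]
    return argv, env


argv, env = nsgmls_invocation("/tmp/sm/sp-1.3", "REC-xml-19980210.xml")
# To run it for real (assuming nsgmls is installed at that path):
#   import subprocess; subprocess.run(argv, env=env)
```

Building the environment explicitly makes it easy to check that all three variables are set, whereas the transcript shows only one setenv line.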
+-----------------------------------------------------------+ Stefan Mintert UniDo: mintert@irb.informatik.uni-dortmund.de private: stefan@mintert.com +-----------------------------------------------------------+ "let the music keep our spirits high..." (Jackson Browne) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Mar 3 15:10:42 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:37 2004 Subject: I wonder ... References: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com> Message-ID: <36DD50B5.5904B0A6@locke.ccil.org> Jeffrey E. Sussna wrote: > This works fine, but (at least in IE 5) only for a single level. That > is, you can't have another entity reference inside "book.dtd". To me, > this significantly limits its usefulness (imagine not allowing a > #include inside a file that was #included). If so, that is a dreadful bug. The XML specification has no such limitations, although one might suppose that an implementation might have a practical limit in the neighborhood of 50-100, because of operating system limits on open files. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Mar 3 15:17:20 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:37 2004 Subject: XML and special Characters : unicode v3.0 ? References: <000d01be64fc$1a3a09e0$0dce6ccb@baden> Message-ID: <36DD523B.F2EAFB7E@locke.ccil.org> Baden Hughes wrote: > Uh, that's gonna be a problem. How would you put in a PUA character in an > XML doc ? Still by the U+... ? (we have around 800 of them for the languages > we work with !!) Well, first of all there are 6400 private-use characters on the BMP, so that gives you plenty of room to play with. You cannot use any kind of private-use character in element or attribute names, which is good for interoperability; to incorporate them in character data or attribute values, use a character reference like . What will be more serious is that *normative* characters from the Astral Planes aren't usable in XML names either. Presumably, when they actually show up, XML will be modified, so that we can have element names in Egyptian hieroglyphics with attributes in Sindarin. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DuCharmR at moodys.com Wed Mar 3 15:18:33 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:09:37 2004 Subject: DTD for Bibliographic Notation Message-ID: <49092BAEAC84D2119B0600805FD40F9F120DC3@MDYNYCMSX1> > Elliotte Rusty Harold writes: >Has anybody written a DTD for bibliographies? Have you looked at the bibliography module of DocBook? DocBook home page: http://www.oasis-open.org/docbook XML version of DocBook: http://www.nwalsh.com/docbook/xml file with XML's bibliography module: http://www.nwalsh.com/docbook/xml/1.3/dbhierx.mod Bob DuCharme www.snee.com/bob see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From prb at uic.edu Wed Mar 3 15:52:40 1999 From: prb at uic.edu (Paul R. Brown) Date: Mon Jun 7 17:09:37 2004 Subject: DTD for Bibliographic Notation Message-ID: <003701be658b$81c84d80$e7b2c183@razzmatazz.math.uic.edu> The folks who built bibtex have already spent some time on this, so you could use portions of their design. - Paul -----Original Message----- From: Elliotte Rusty Harold Date: Wednesday, March 03, 1999 9:25 AM Subject: DTD for Bibliographic Notation >Has anybody written a DTD for bibliographies? 
Are there any standards >efforts in this area? To be usable, this DTD would have to be public >domain or explicitly allow unrestricted reuse. I probably don't need to >modify it, but at a minimum I need to be able to republish it. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Wed Mar 3 16:19:39 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:09:37 2004 Subject: DTD for Bibliographic Notation In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120DC3@MDYNYCMSX1> Message-ID: At 10:24 AM -0500 3/3/99, DuCharme, Robert wrote: >> Elliotte Rusty Harold writes: >>Has anybody written a DTD for bibliographies? > >Have you looked at the bibliography module of DocBook? > No, but I'll check it out. Thanks. > DocBook home page: http://www.oasis-open.org/docbook > XML version of DocBook: http://www.nwalsh.com/docbook/xml > file with XML's bibliography module: >http://www.nwalsh.com/docbook/xml/1.3/dbhierx.mod > >Bob DuCharme www.snee.com/bob snee.com> see www.snee.com/bob/xmlann for "XML: >The Annotated Specification" from Prentice Hall. 
+-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Mar 3 16:20:22 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:37 2004 Subject: SAX and DTDHandler In-Reply-To: <9f7499ae.36dd3931@aol.com> References: <9f7499ae.36dd3931@aol.com> Message-ID: <14045.24597.828439.227541@localhost.localdomain> MikeDacon@aol.com writes: > My primary question is will SAX allow me to parse a DTD? It > doesn't seem so. DTDHandler only handles unparsed Entity > declarations (like binary data) and Notation declarations. If it > is the case that SAX does not parse DTDs due to the fact that it > does not want to perform validation then why bother with the above > two cases? SAX doesn't parse anything -- it's just an interface. Some (most?) Java-based XML parsers that implement the SAX interface do happen to perform validation, but that's outside the scope of SAX 1.0 itself (we're talking about fixing that for ModSAX). 
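To make the "just an interface" point concrete, here is a small sketch that uses Python's xml.sax binding of SAX rather than the Java API under discussion (an assumption made for brevity). The class name, helper name and sample document are hypothetical. The handler only records what the underlying parser chooses to report through the two DTDHandler callbacks.

```python
import io
import xml.sax
from xml.sax.handler import DTDHandler


class RecordingDTDHandler(DTDHandler):
    """Collects the notation and unparsed-entity events a parser reports."""

    def __init__(self):
        self.notations = []
        self.unparsed = []

    def notationDecl(self, name, publicId, systemId):
        self.notations.append((name, publicId, systemId))

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.unparsed.append((name, publicId, systemId, ndata))


# Illustrative document with one notation and one unparsed entity
# declared in the internal DTD subset.
DOC = b"""<!DOCTYPE doc [
  <!ELEMENT doc EMPTY>
  <!ATTLIST doc pic ENTITY #IMPLIED>
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc pic="logo"/>"""


def collect_dtd_events(document):
    """Parse `document` (bytes) and return the populated handler."""
    handler = RecordingDTDHandler()
    parser = xml.sax.make_parser()
    parser.setDTDHandler(handler)
    parser.parse(io.BytesIO(document))
    return handler


events = collect_dtd_events(DOC)
```

The ELEMENT and ATTLIST declarations in the sample produce no DTDHandler events; only the notation and the unparsed entity reach the handler.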
SAX 1.0 provides the DTDHandler interface because XML 1.0 requires processors to report notations and unparsed entities. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Mar 3 16:28:16 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:37 2004 Subject: SAX and DTDHandler References: <9f7499ae.36dd3931@aol.com> Message-ID: <36DD62E3.785602E1@locke.ccil.org> MikeDacon@aol.com wrote: > My primary question is will SAX allow me to parse a DTD? > It doesn't seem so. DTDHandler only handles unparsed Entity declarations > (like binary data) and Notation declarations. If it is the case that SAX does > not > parse DTDs due to the fact that it does not want to perform validation then > why bother with the above two cases? Remember that SAX is a front-end to various parsers with various philosophies, validating (XML4J), non-validating but external-entity- reading (Aelfred), non-validating and document-entity-only (XP). SAX provides methods, for parsers that wish to do so, to report on declared notations and unparsed entities, since these features provide actual extensions to the basic element/attribute model. Element and attribute list declarations cannot be reported through SAX, since they are reckoned inessential. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Mar 3 16:29:57 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:37 2004 Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd References: <199903031400.PAA23631@brown.informatik.uni-dortmund.de> Message-ID: <36DD6335.4C40B6F5@locke.ccil.org> Stefan Mintert wrote: > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd > (copied from the above URL) with nsgmls. As the documentation for XMLspec warns, the current version of the DTD is *not* the one used with the XML Recommendation, which used a much older version. So don't do that. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mintert at irb.informatik.uni-dortmund.de Wed Mar 3 17:14:19 1999 From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert) Date: Mon Jun 7 17:09:37 2004 Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd In-Reply-To: Your message of Wed, 03 Mar 1999 11:28:37 -0500. 
<36DD6335.4C40B6F5@locke.ccil.org> Message-ID: <199903031713.SAA24548@brown.informatik.uni-dortmund.de> > > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd > > (copied from the above URL) with nsgmls. > > As the documentation for XMLspec warns, the current version of the > DTD is *not* the one used with the XML Recommendation, which used > a much older version. So don't do that. ooops, sorry; but that doesn't explain the parsing errors concerning the DTD: spec.dtd:60:17:W: named character reference spec.dtd:60:19:E: "X2014" is not a function name [...] BTW: It would be nice to use the XML spec as a valid document, not just a well-formed document. Has anyone kept the old XMLspec DTD? (I guess it's Revision 1.0, 7 April 1998) Bye, Stefan. +-----------------------------------------------------------+ Stefan Mintert UniDo: mintert@irb.informatik.uni-dortmund.de private: stefan@mintert.com +-----------------------------------------------------------+ "let the music keep our spirits high..." (Jackson Browne)

From jmcdonou at library.berkeley.edu Wed Mar 3 17:58:23 1999 From: jmcdonou at library.berkeley.edu (Jerome McDonough) Date: Mon Jun 7 17:09:37 2004 Subject: DTD for Bibliographic Notation In-Reply-To: References: Message-ID: <3.0.5.32.19990303094553.0097e990@library.berkeley.edu> At 08:26 AM 3/3/1999 -0500, Elliotte Rusty Harold wrote: >Has anybody written a DTD for bibliographies? Are there any standards >efforts in this area? To be usable, this DTD would have to be public >domain or explicitly allow unrestricted reuse.
I probably don't need to >modify it, but at a minimum I need to be able to republish it. > Mm, not to be Clinton-esque or anything, but it depends on what you mean by bibliographies. There are an awful lot of DTDs that include elements for bibliographic citation as part of a larger document structure. Some of the better known examples would include the and elements with the TEI DTD, the element within ETD-ML DTD (part of the Electronic Thesis and Dissertation project at Virginia Tech), the element with the Encoded Archival Description DTD, and the element in DocBook. There are standalone DTDs for capturing bibliographic information, but they tend to be written by library geeks like me, and as a result, tend to be a bit more detailed and extensive (read arcane and opaque) than what most people would think of when designing a DTD for bibliographies. The most authoritative work in these lines would probably be the MARC DTDs provided by the Library of Congress (http://lcweb.loc.gov/marc/marcsgml.html), but understanding those without copies of both the USMARC standard and the Anglo-American Cataloguing Rules next to you is a non-trivial task. If you want to look over a simpler version of the MARC standard as an XML DTD, I revised an SGML DTD that I did for MARC which you can grab at http://sunsite.berkeley.edu/~jmcdonou/USMARC.XML.DTD; again, knowledge of the MARC standard is a big help on making heads or tails of the DTD, but , , , and comprise most of what people think of as basic bibliographic information. If you're thinking that having all these different ways of encoding bibliographic information is a headache waiting for those wanting to automate processing of bibliographic data from multiple sources, you're right. But I don't think there's any way out of that one. 
The needs of those doing markup of bibliographic information vary quite a bit depending on whether we're talking scholars reporting on their research, librarians, publishers, students at various levels, etc. Mapping between multiple forms of marked up bibliographic data is something we're just going to have to live with. I try to think of it as yet another clause in the text-encoding programmers' full employment act. Jerome McDonough -- jmcdonou@library.Berkeley.EDU | (......) Library Systems Office, 386 Doe, U.C. Berkeley | \ * * / Berkeley, CA 94720-6000 (510) 642-5168 | \ <> / "Well, it looks easy enough...." | \ -- / SGNORMPF!!! -- From the Famous Last Words file | |||| xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Wed Mar 3 18:07:59 1999 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:09:37 2004 Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd In-Reply-To: Stefan Mintert's message of Wed, 03 Mar 1999 18:13:45 +0100 Message-ID: <199903031807.SAA03605@stevenson.cogsci.ed.ac.uk> > spec.dtd:60:17:W: named character reference > spec.dtd:60:19:E: "X2014" is not a function name Looks like it's not recognising XML-style character references - presumably the line is Are you using a version of nsgmls that knows about XML? -- Richard xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pgrosso at arbortext.com Wed Mar 3 18:12:25 1999 From: pgrosso at arbortext.com (Paul Grosso) Date: Mon Jun 7 17:09:37 2004 Subject: Publication of first WD of the W3C XML Fragment Interchange Rec Message-ID: <3.0.32.19990303121005.00decde8@pophost.arbortext.com> The W3C XML Fragment WG [1] has just published its first Working Draft of the XML Fragment Interchange Recommendation [2]. Its abstract reads: The XML standard supports logical documents composed of possibly several entities. It may be desirable to view or edit one or more of the entities or parts of entities while having no interest, need, or ability to view or edit the entire document. The problem, then, is how to provide to a recipient of such a fragment the appropriate information about the context that fragment had in the larger document that is not available to the recipient. The XML Fragment WG is chartered with defining a way to send fragments of an XML document--regardless of whether the fragments are predetermined entities or not--without having to send all of the containing document up to the part in question. This document defines Version 1.0 of the [eventual] W3C Recommendation that addresses this issue. Interested parties are invited to review the specification and report implementation experience. As indicated in the document, comments should be sent to [3], (a publicly archived [4] list). Comments received by 1999 March 26 will be considered for a revision soon after. All comments will be considered in light of the XML Fragment Requirements Document [5]. 
In particular, basic scope issues and design decisions will be reconsidered only when grave and previously unrecognized flaws are uncovered. Requests for enhancement will typically be deferred for later versions of the specification under development unless the enhancement is uncontroversial and its incorporation would not materially delay production of the specification. Paul Grosso XML Fragment WG Chair Daniel Veillard W3C Staff Contact [1] http://www.w3.org/XML/Activity.html#fragment-wg [2] http://www.w3.org/TR/WD-xml-fragment [3] mailto:www-xml-fragment-comments@w3.org [4] http://lists.w3.org/Archives/Public/www-xml-fragment-comments/ [5] http://www.w3.org/TR/NOTE-XML-FRAG-REQ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pgrosso at arbortext.com Wed Mar 3 18:30:15 1999 From: pgrosso at arbortext.com (Paul Grosso) Date: Mon Jun 7 17:09:38 2004 Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd Message-ID: <3.0.32.19990303122916.00d2c4a0@pophost.arbortext.com> At 18:13 1999 03 03 +0100, Stefan Mintert wrote: > > > http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm > > > > > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd > > > (copied from the above URL) with nsgmls. > > > > As the documentation for XMLspec warns, the current version of the > > DTD is *not* the one used with the XML Recommendation, which used > > a much older version. So don't do that. > >ooops, sorry; but that doesn't explain the parsing errors concerning the DTD: > >spec.dtd:60:17:W: named character reference >spec.dtd:60:19:E: "X2014" is not a function name >[...] 1. 
nsgmls is not an XML parser. Those errors are probably because it's not recognizing &#x2014; (the hex version) as a numeric character reference. You might try converting X2014 to a decimal number and seeing what happens. Or, use an XML parser. 2. The URL quoted above is old. The latest are: DTD: http://www.w3.org/XML/1998/06/xmlspec-19990205.dtd Documentation: http://www.w3.org/XML/1998/06/xmlspec-report-19990205.htm paul

From tgraham at mulberrytech.com Wed Mar 3 19:31:20 1999 From: tgraham at mulberrytech.com (Tony Graham) Date: Mon Jun 7 17:09:38 2004 Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd In-Reply-To: <199903031400.PAA23631@brown.informatik.uni-dortmund.de> References: <01BE6366.0E5EF230.jarle.stabell@dokpro.uio.no> <199903031400.PAA23631@brown.informatik.uni-dortmund.de> Message-ID: At 3 Mar 1999 15:00 +0100, Stefan Mintert wrote: > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd > (copied from the above URL) with nsgmls. I'm using nsgmls 1.3 on SunOS 5.6 > (Solaris 2). I already parsed xml instances without problems but in this case > it doesn't work. Following are the first lines of nsgmls output: > > sm@brown(/tmp/sm){590}: /tmp/sm/sp-1.3/nsgmls/nsgmls -E 10 -w xml -s REC-xml-19980210.xml > /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:17:W: named character reference > /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:19:E: "X2014" is not a function name Add -c/tmp/sm/sp-1.3/pubtext/xml.soc to the command line so nsgmls reads the xml.soc catalog that tells it to use the SGML Declaration for XML, xml.dcl.
That SGML Declaration tells nsgmls what hexadecimal character references look like. Without it, things like &#x2014; are being interpreted as per ISO 8879:1986, which isn't doing you or the parser any good. Regards, Tony Graham ====================================================================== Tony Graham mailto:tgraham@mulberrytech.com Mulberry Technologies, Inc. http://www.mulberrytech.com 17 West Jefferson Street Direct Phone: 301/315-9632 Suite 207 Phone: 301/315-9631 Rockville, MD 20850 Fax: 301/315-8285 ---------------------------------------------------------------------- Mulberry Technologies: A Consultancy Specializing in SGML and XML ======================================================================

From db at Eng.Sun.COM Wed Mar 3 19:55:48 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:38 2004 Subject: Encoding detection again ... References: Message-ID: <36DD9263.F26D063C@eng.sun.com> > > > Put it this way: if you assume UTF-16, you're > > > safe either way because UTF-16 is a superset. > > > > Err ... is that true? > > > > Maybe I'm being a bit obsessive about my > > interpretation of the various standards docs, Given how many folk talk about UCS-2 lately (not many!) that could well be true ... ;-) > > but > > as far as I can see UCS-2 isn't a subset of > > UTF-16. > > The question of UCS-2 being, or not being a subset of > UTF-16 is a bit of a red herring. It is undoubtedly true > that the set of octet pairs which are legal UCS-2 > characters is a subset of the set of octet pairs which > are legal UTF-16 characters.
And more to the point, XML processors aren't required to report such low level character encoding errors ... this would be one. > Appendix F suggests that octet sequences which could > equally well be interpreted as UTF-16 or UCS-2 may be > assumed to be UTF-16, and *doesn't* include a clause > stating that this assumption should be revised in > the light of an explicit XML encoding declaration. I > think that clause should be added, in much the same > way as it is for UTF-8 vs. 8859-X. All of appendix F is non-normative; you're free to revise or not, as you see fit, and it won't affect conformance. - Dave > Now the typo ... > > > This very complicated and isn't a zillion miles away > > from the current handling of UTF-8 vs. ISO 8859-x > > vs. US-ASCII. > > Please insert the word 'isn't' in the obvious > place ;-) > > Cheers, > > Miles > > -- > Miles Sabin Cromwell Media > Internet Systems Architect 5/6 Glenthorne Mews > +44 (0)181 410 2230 London, W6 0LJ > msabin@cromwellmedia.co.uk England > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at goon.stg.brown.edu Wed Mar 3 20:32:53 1999 From: richard at goon.stg.brown.edu (Richard L. 
Goerwitz) Date: Mon Jun 7 17:09:38 2004 Subject: Encoding detection again ... References: <36DD9263.F26D063C@eng.sun.com> Message-ID: <36DD9C3C.99DD529C@goon.stg.brown.edu> David Brownell wrote: > And more to the point, XML processors aren't required > to report such low level character encoding errors ... > this would be one. On the face of things, this doesn't make sense. -- Richard Goerwitz PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0 For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Mar 3 20:53:39 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:38 2004 Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd References: <3.0.32.19990303122916.00d2c4a0@pophost.arbortext.com> Message-ID: <36DDA11E.C07163B0@locke.ccil.org> Paul Grosso wrote: > 1. nsgmls is not an XML parser. The version included with SP 1.3 is an XML parser, though not entirely defect-free. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Mar 3 20:59:59 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:38 2004 Subject: XML and special Characters : unicode v3.0 ? References: <000d01be64fc$1a3a09e0$0dce6ccb@baden> <36DD523B.F2EAFB7E@locke.ccil.org> <36DD92EC.80B3B3DE@eng.sun.com> Message-ID: <36DDA2A7.8E2F1E01@locke.ccil.org> David Brownell wrote: > Surely it's more important that Klingon markup be supported? :-) All Languages Are Equal (TM). > I notice that a recent Linux distribution puts Klingon support > into a chunk of private use area, so at least there's consistency > that XML doesn't yet offer complete Klingon support! Right. Support for private-use characters in XML names will always be a Bad Thing, because nobody outside the private user can tell which characters are letters and which aren't, so it's either all or none, and "none" is the most sensible choice. Just be prepared to revisit XML so that Unicode 3.0 name and name-start characters can get included. This will allow the creation of DTDs written in serious Real World languages like Macedonian, Syriac, Divehi, Sinhala, Burmese, Ethiopic, Cherokee, various Canadian Native languages, Khmer, Mongolian, and Yi. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From cowan at locke.ccil.org Wed Mar 3 21:04:29 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:38 2004
Subject: Encoding detection again ...
References: <36DD9263.F26D063C@eng.sun.com> <36DD9C3C.99DD529C@goon.stg.brown.edu>
Message-ID: <36DDA39D.EA98C73B@locke.ccil.org>

Richard L. Goerwitz wrote:
> On the face of things, this doesn't make sense.

For example, a document containing &#x80; and otherwise error-free may be processed without error, although U+0080 is not a legal Unicode character.

--
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From db at Eng.Sun.COM Wed Mar 3 21:06:07 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:09:38 2004
Subject: Java Specification Request for XML
Message-ID: <36DD9EA1.2CEE7CEA@eng.sun.com>

There seems to have been some confusion regarding what Sun is trying to do with its Java Specification Request for an XML Extension to the Java Platform.

A Java Specification Request (JSR) is a request to develop a specification; it is not a specification in itself.
What we did a week ago is ask for comments regarding this proposal to begin work on such an XML Extension specification. If this is approved, we will then follow the Java Community Process as described at http://developer.java.sun.com/developer/jcp/ to actually develop that specification.

The Java Community Process is an open, inclusive process and we look forward to the active participation of all interested parties.

The process goes forward in several steps:

[1] The JSR is presented for comment (as you've seen)
[2] The JSR is approved (we hope)
[3] An expert group is formed to write the specification; this begins with a "Call for Experts" (CAFE) to participate.
[4] The expert group writes a first draft of the specification
[5] The draft is circulated to all Java technology licensees and Participants in the Java Community Process.
[6] Comments are collected, read, and responded to by the expert group, resulting in an improved specification.
[7] The refined specification is then released to the public for comment.
[8] Comments from the public are collected, read, and responded to by the expert group, resulting in more refinements.
[9] The final specification is produced by the expert group, along with a reference implementation and compatibility tests.

The key point is that everyone with internet access will get a chance to review and comment on the emerging specification.

Note that the xml-dev community has already had input into the proposed specification as evidenced by the referencing of the SAX specification in the JSR as one of the starting documents. Other specifications could be adopted by the expert group.

We look forward to the continued participation of the xml-dev community in this work.

- Dave

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Wed Mar 3 21:38:27 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:38 2004 Subject: Encoding detection again ... References: <36DD9263.F26D063C@eng.sun.com> <36DD9C3C.99DD529C@goon.stg.brown.edu> Message-ID: <36DDAA6F.D432053A@eng.sun.com> "Richard L. Goerwitz" wrote: > > David Brownell wrote: > > > And more to the point, XML processors aren't required > > to report such low level character encoding errors ... > > this would be one. > > On the face of things, this doesn't make sense. For example, character encodings are typically handled many layers below the XML processor. That processor shouldn't be faulted for behaviors of the underlying processor. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msabin at cromwellmedia.co.uk Wed Mar 3 21:52:01 1999 From: msabin at cromwellmedia.co.uk (Miles Sabin) Date: Mon Jun 7 17:09:38 2004 Subject: Encoding detection again ... Message-ID: David Brownell wrote, > "Richard L. Goerwitz" wrote: > > David Brownell wrote: > > > And more to the point, XML processors aren't > > > required to report such low level character > > > encoding errors ... this would be one. > > > > On the face of things, this doesn't make sense. 
> > For example, character encodings are typically handled > many layers below the XML processor. That processor > shouldn't be faulted for behaviors of the underlying > processor. Most of the time yes ... but remember we're discussing the interaction between encoding detection and encoding _declarations_. An XML processor has to have some involvement in that. Cheers, Miles -- Miles Sabin Cromwell Media Internet Systems Architect 5/6 Glenthorne Mews +44 (0)181 410 2230 London, W6 0LJ msabin@cromwellmedia.co.uk England xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Wed Mar 3 22:03:28 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:38 2004 Subject: Encoding detection again ... References: Message-ID: <36DDB059.550023E7@eng.sun.com> Miles Sabin wrote: > > David Brownell wrote, > > "Richard L. Goerwitz" wrote: > > > David Brownell wrote: > > > > And more to the point, XML processors aren't > > > > required to report such low level character > > > > encoding errors ... this would be one. > > > > > > On the face of things, this doesn't make sense. > > > > For example, character encodings are typically handled > > many layers below the XML processor. That processor > > shouldn't be faulted for behaviors of the underlying > > processor. > > Most of the time yes ... but remember we're discussing > the interaction between encoding detection and encoding > _declarations_. An XML processor has to have some > involvement in that. But the error in question would show up after the encoding declaration had been processed -- well after! 
-- so the XML processor itself would no longer need involvement. The non-normative "detection" can't involve the error ... surrogates can't appear within encoding declarations. In any case, it's OK for conformant processors to reject UCS-2 out of hand, eliminating all possibility of such an error in any case! - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Wed Mar 3 22:24:41 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:38 2004 Subject: Java Specification Request for XML In-Reply-To: <36DD9EA1.2CEE7CEA@eng.sun.com> Message-ID: <199903032224.RAA10719@hesketh.net> At 12:42 PM 3/3/99 -0800, David Brownell wrote: >There seems to have been some confusion regarding what Sun is trying >to do with its Java Specification Request for an XML Extension to the >Java Platform. > >[...] > >The Java Community Process is an open, inclusive process and we >look forward to the active particpation of all interested parties. > >[...detailed list of process steps, excerpted..] > >[4] The expert group writes a first draft of the specification >[5] The draft is circulated to all Java technology licensees and > Participants in the Java Community Process. >[7] The refined specification is then released to the public for > comment. > >The key point is that everyone with internet access will get a >chance to review and comment on the emerging specification. > >Note that the xml-dev community has already had input into the >proposed specification as evidenced by the referencing of the >SAX specification in the JSR as one of the starting documents. 
>Other specifications could be adopted by the expert group.
>
>We look forward to the continued participation of the xml-dev
>community in this work.

This all sounds good, but I remain concerned (and wary) for a number of reasons, and I didn't respond directly to your JSR commenting process because I'm very uncertain about whether this development belongs in a process controlled, however lightly, by a particular vendor.

The JCP is only a partially open process, as the sequence of steps above - in which Java technology licensees and 'Participants in the Java Community Process' are step 5 and the public is step 7 - demonstrates. It seems that the licensees and 'official' participants are still privileged, have earlier access to the information, and potentially more impact on its shape. I don't expect to be one of the experts crafting the standard, but I hope to be able to participate in the discussions as a real participant and not just another spectator.

Given that SAX was developed (and is still developing) in a very open forum, it seems like the JCP is moving into an area that was totally open and moving it to an arena that is _less_ open.

There have been a lot of criticisms of W3C process on this list, as I'm sure you've noticed, for similar openness problems. While the W3C does in some way respond to public comments, there's no transparency - we have no way to know how much they care.

I'd like to hear Sun make some _strong_ statements that they'll be developing this API in a way more like the SAX process than the DOM process, and that genuine transparency is the goal of the JCP rather than Sun protecting what it sees as its interests in the Java/XML space.

I think Sun could make a great contribution here, using its weight in the Java community to help standardize XML processing and make it more universally used, but I hope Sun isn't planning to use that weight to direct the discussion and influence the final decisions unduly.
It's promising, but I think there are a lot of folks out here who are very wary. (See Elliotte Rusty Harold's comments at http://metalab.unc.edu/xml for an example.) I'm definitely wary, though I also have some real hopes.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From mscardin at us.oracle.com Wed Mar 3 22:50:00 1999
From: mscardin at us.oracle.com (Mark Scardina)
Date: Mon Jun 7 17:09:38 2004
Subject: ANN: Oracle XML Class Generator for Java
Message-ID: <001701be65c7$febb5620$47be1990@mscardin-pc.us.oracle.com>

I would like to announce Oracle's second XML component beta release - XML Class Generator for Java - now available for downloading and testing on the Oracle Technology Network (OTN) XML site located at http://technet.oracle.com.

The XML Class Generator will generate a set of Java source files based on an input DTD. The generated Java source files can then be used to construct, optionally validate, and print an XML document that is compliant with the DTD specified.

This is an early beta release and has the following features:

* Creates Java Classes from DTDs to enable the programmatic construction of XML documents.
* Supports validation mode to assist debugging.
* Works with the Oracle XML Parser in Java.
* Creates documents conforming to the W3C XML 1.0 Recommendation.
* Supports creating documents in the following encodings:
    UTF-8
    UTF-16
    ISO-10646-UCS-2
    ISO-10646-UCS-4
    US-ASCII
    EBCDIC-CP-US
    ISO-8859-1
    Shift_JIS

Support is available in the XML Forum on OTN to provide a collaborative area for bug reporting, technical support, and discussing other Oracle/XML issues. This forum will be used for external as well as internal beta testers.

Mark V. Scardina
Sr. Product Manager - Core Development
Server Technologies - Oracle Corporation
Oracle XML News http://www.oracle.com/xml

From mscardin at us.oracle.com Wed Mar 3 23:00:19 1999
From: mscardin at us.oracle.com (Mark Scardina)
Date: Mon Jun 7 17:09:38 2004
Subject: ANN: Oracle XML Parser for Java - Production Release
Message-ID: <001801be65c9$6513c960$47be1990@mscardin-pc.us.oracle.com>

The production release of the Oracle XML Parser for Java is available for download at http://technet.oracle.com/tech/xml.

Supports validation and non-validation modes
Built-in Error Recovery until fatal error.
Supports W3C XML 1.0 Recommendation.
Integrated Document Object Model (DOM) Level 1.0 API
Integrated SAX 1.0 API
Supports W3C Proposed Recommendation for XML Namespaces
Supports documents in the following encodings:

    UTF-8                BIG 5
    UTF-16               GB2312
    ISO-10646-UCS-2      EUC-JP
    ISO-10646-UCS-4      EUC-KR
    US-ASCII             KOI8-R
    EBCDIC-CP-*          ISO-2022-JP
    ISO-8859-1 to -9     ISO-2022-KR
    Shift_JIS

Support is available in the XML Forum on OTN to provide a collaborative area for bug reporting, technical support, and discussing other Oracle/XML issues. This forum will be used for external as well as internal beta testers.

Mark V.
Scardina Sr. Product Manager - Core Development Server Technologies - Oracle Corporation Oracle XML News http://www.oracle.com/xml xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dante at mstirling.gsfc.nasa.gov Thu Mar 4 15:48:21 1999 From: dante at mstirling.gsfc.nasa.gov (Dante Lee) Date: Mon Jun 7 17:09:38 2004 Subject: HTML Question Message-ID: Can someone look at the source of my web page and tell me why my links are not coming up in the targeted frames? The site is at: http://mstirling.gsfc.nasa.gov/~dante/sharp98 All of the links are targeted to Frame 1, which is specified in the index frame as the frame to the right. However, all of the links pop up as new windows. Please help. I think it has something to do with the javascript in Frame1 (titlebox.html). Thanx. Dante M. Lee Code 588 NASA/GSFC Greenbelt MD 20771 Voice = 301-521-1077 Bldg = 23 Rm = W415 Email = dante@mstirling.gsfc.nasa.gov dante4@hotmail.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dalapeyre at mulberrytech.com Thu Mar 4 19:53:04 1999 From: dalapeyre at mulberrytech.com (Deborah Aleyne Lapeyre) Date: Mon Jun 7 17:09:38 2004 Subject: DTD for Bibliographic Notation In-Reply-To: References: <49092BAEAC84D2119B0600805FD40F9F120DC3@MDYNYCMSX1> Message-ID: The journal publishers have taken a cut at bibliographies, for small example: Elsevier's is available on their website (at least it used to be) John Wiley & Sons has one (See WILEY Interscience) CADMUS used to have theirs on their website Ovid has made theirs public as well (but I would not recommend it) PUBMED at NIH/NLM also has a very basic but nice subset (definitely available on their website). --Debbie ====================================================================== Deborah Aleyne Lapeyre mailto:dalapeyre@mulberrytech.com Mulberry Technologies, Inc. http://www.mulberrytech.com 17 West Jefferson Street Direct Phone: 301/315-9633 Suite 207 Phone: 301/315-9631 Rockville, MD 20850 Fax: 301/315-8285 ---------------------------------------------------------------------- Mulberry Technologies: A Consultancy Specializing in SGML and XML ====================================================================== xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Mar 4 20:15:31 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:39 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <3.0.32.19990303213203.00bb79a0@pop.intergate.bc.ca> At 03:59 PM 3/3/99 -0500, John Cowan wrote: > This will allow the creation of >DTDs written in serious Real World languages like Macedonian, Syriac, >Divehi, Sinhala, Burmese, Ethiopic, Cherokee, various Canadian Native >languages, Khmer, Mongolian, and Yi. John, this is unfair. All the Macedonians and Sinhalese I've known have an excellent sense of humor. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jpetit at 4thworldtele.com Thu Mar 4 22:05:34 1999 From: jpetit at 4thworldtele.com (John Petit) Date: Mon Jun 7 17:09:39 2004 Subject: XSL Pre-processing Message-ID: <36DEA0E1.1EA10C7E@4thworldtele.com> Is there any software out there that will allow me to do server side XSL preprocessing of XML documents into HTML for display? This is independent of the user's browser. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: vcard.vcf Type: text/x-vcard Size: 368 bytes Desc: Card for John Petit Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990304/7bd50a3e/vcard.vcf From donpark at quake.net Thu Mar 4 22:06:46 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:09:39 2004 Subject: XML MULTI-Fragment Interchange? Message-ID: <00ac01be668b$3abc9f80$2ee044c6@arcot-main> The first draft of the XML Fragment spec allows only one Fragbody. Could someone from the WG shed some light on why this constraint is important? Multi-fragment packages are useful in many situations such as query result representation. Although it is possible to define a packaging mechanism that handles multiple fragments, a fragment context information (FCI) must be provided for each fragment because the spec does not allow FCI to be shared by multiple fragments. A possible example of a multi-fragment package follows: J. R. R. Tolkien The Book of Lost Tales (The History of Middle-Earth) Mass Market Paperback Reprint edition (June 1992) 0345375211 4.79 1 J. R. R. Tolkien The Book of Lost Tales (The History of Middle-Earth) Mass Market Paperback Reprint edition (June 1992) 0345375211 4.79 1 Comments? Don Park Docuverse xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Thu Mar 4 22:19:54 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:39 2004 Subject: XML and special Characters : unicode v3.0 ? 
References: <3.0.32.19990303213203.00bb79a0@pop.intergate.bc.ca> Message-ID: <36DF06D2.F214E9F9@locke.ccil.org> Tim Bray wrote: > At 03:59 PM 3/3/99 -0500, John Cowan wrote: > > This will allow the creation of > >DTDs written in serious Real World languages like Macedonian, Syriac, > >Divehi, Sinhala, Burmese, Ethiopic, Cherokee, various Canadian Native > >languages, Khmer, Mongolian, and Yi. > > John, this is unfair. All the Macedonians and Sinhalese I've known > have an excellent sense of humor. -Tim Well, several people have believed this was sarcasm on my part. Not so. When I said "serious Real World languages" I meant it. Real people speak, understand, read, and write them in the course of their day-to-day lives. Ancient Egyptian and Sindarin don't fall into this category, no matter that I am an enthusiast of both. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Thu Mar 4 23:00:30 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:39 2004 Subject: Unicode conformance, short version References: <3.0.32.19990301212757.00a2e5b0@pop.intergate.bc.ca> <36DC0627.F2491FA8@locke.ccil.org> <36DC67FB.26E0@w3.org> <199903041310.WAA18593@sh.w3.mag.keio.ac.jp> <14046.38677.746315.899329@localhost.localdomain> Message-ID: <36DF1059.76F2CC4E@locke.ccil.org> Unicode folks have seen this, but XML folks haven't. 
Here's John's Own Version Of Unicode Conformance:

1) Unicode characters are 16 bits long; deal with it.
2) Byte order is only an issue in files.
3) If you don't have a clue, assume big-endian.
4) Loose surrogates don't mean jack.
5) Neither do U+FFFE and U+FFFF (a.k.a. the zigamorph).
6) Leave the unassigned codepoints alone.
7) It's OK to be ignorant about a character, but not plain wrong.
8) Subsets are strictly up to you.
9) Canonical equivalence matters.
10) Don't garble what you don't understand.

This is presented in the hope that it may be useful, but all warranties (including implicit warranties of merchantability or fitness for a particular purpose) are void. Freely reusable, except that John Cowan asserts the moral right to be known as author.

--
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From crism at oreilly.com Thu Mar 4 23:03:03 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun 7 17:09:39 2004
Subject: I wonder ...
In-Reply-To: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com> (jes@kuantech.com)
Message-ID: <199903042301.SAA01051@ruby.ora.com>

[Jeffrey E. Sussna]
> This works fine, but (at least in IE 5) only for a single
> level. That is, you can't have another entity reference inside
> "book.dtd". To me, this significantly limits its usefulness
> (imagine not allowing a #include inside a file that was
> #included).
IE 5 has its parsing errors, but this is not one of them. Error messages I've seen when parsing DocBook indicate that it is definitely following references to multiple levels. -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Mar 4 23:16:40 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:39 2004 Subject: Unicode conformance, short version Message-ID: <3.0.32.19990304151441.00c17b10@pop.intergate.bc.ca> At 05:59 PM 3/4/99 -0500, John Cowan wrote: >4) Loose surrogates don't mean jack. There's reason to believe they mean severe breakage upstream, and in mission-critical apps are probably grounds to halt and catch fire. Anyhow, if you're reading a character stream and one of 'em has a value between (decimal) 55296 and 57343 inclusive, it ain't XML any longer. (And I believe all the serious XML processors actually enforce this particular rule). -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pgrosso at arbortext.com Thu Mar 4 23:26:39 1999 From: pgrosso at arbortext.com (Paul Grosso) Date: Mon Jun 7 17:09:39 2004 Subject: XML MULTI-Fragment Interchange? 
Message-ID: <3.0.32.19990304172546.00f10224@pophost.arbortext.com>

At 14:06 1999 03 04 -0800, Don Park wrote:
>The first draft of the XML Fragment spec allows only one Fragbody. Could
>someone from the WG shed some light on why this constraint is important?

First, let me remind folks that only comments sent to the archived mail list set up for comments are "officially" considered. The WG cannot promise to honor all requests for responses to questions posted on xml-dev.

However, the answer to Don's question will probably address a lot of other questions, and the WG did consider it carefully, so I would like to answer that here.

One of the key principles in developing this version of the Fragment Interchange spec was to define and remain within a limited scope. The problem was (1) to define what fragment context information is, (2) to define a fragment context specification notation, and (3) to define at least one interoperable method for associating a fragment context specification with a fragment body.

Although we did decide to address point (3) by defining a simple "packaging" scheme, we were very careful to do the minimum necessary to address point (3). Specifically, we did not want to enlarge our scope to include packaging methods in general. It is expected that the XML Activity of the W3C will consider ways to address packaging in the near future, and the XML Fragment WG didn't want to do something that might later constrain a more general solution.

Packaging multiple entities in a single unit is likely to be a useful thing to do in general, of which packaging multiple fragment bodies is just one example. The WG didn't want to define a way to address multiple fragment bodies and then discover, when the more general problem is carefully considered, that our solution wasn't a subset of the solution to the more general problem.
In summary, the WG is aware of lots of improvements, enhancements, and extensions that could be made to an XML Fragment Interchange spec, but we ruthlessly kept ourselves to the "minimum needed to declare victory." We expect work on Schemas and Packaging and XLink and probably other areas will all contribute technology that would be useful in a version 2 XML Fragment Interchange spec someday, but we believe that implementation and user experience should prove the version 1 spec useful before we even think about a version 2.

Of course, if you seriously believe that the spec is useless unless it allows multiple fragment bodies per package, then that is a comment you should make and attempt to support. We don't want to come out with a spec folks think is useless, but we were trying to keep it as minimal as possible while still addressing the problem we defined as our scope.

paul

Paul Grosso
Chair, XML Fragment WG

From jtauber at jtauber.com Fri Mar 5 00:27:45 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:39 2004
Subject: XSL Pre-processing
Message-ID: <005901be669e$efbf2520$0300000a@othniel.cygnus.uwa.edu.au>

I use James Clark's XT.

For a more complete list see http://www.xmlsoftware.com/xsl/

For examples of XSL I use to produce the above site, see http://www.xmlsoftware.com/articles/xsl-by-example.html

James

-----Original Message-----
From: John Petit
>Is there any software out there that will allow me to do server side XSL
>preprocessing of XML documents into HTML for display? This is
>independent of the user's browser.
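
As a rough illustration of the server-side step asked about here, the sketch below turns an XML document into HTML before anything reaches the browser. It does not use XT; it assumes the JAXP javax.xml.transform API that ships with later Java releases, and the stylesheet uses the XSLT 1.0 namespace, which postdates the working draft current at the time of this thread. The class name ServerSideXsl and the sample document and stylesheet are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ServerSideXsl {
    // Hypothetical sample input, used only to exercise the sketch.
    static final String SAMPLE_XML = "<book><title>XML by Example</title></book>";

    // Hypothetical stylesheet in XSLT 1.0 syntax (an assumption, see lead-in).
    static final String SAMPLE_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/book'>"
      + "<html><body><h1><xsl:value-of select='title'/></h1></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the document on the server and returns the
    // serialized result, so the client only ever sees HTML.
    static String transform(String xml, String xsl) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSL));
    }
}
```

In a real deployment the returned string would be written to the HTTP response; how that is wired up depends on the server and is left out here.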
From donpark at quake.net Fri Mar 5 00:37:58 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:39 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID: <004001be66a0$5aff8bd0$2ee044c6@arcot-main>

Paul,

>First, let me remind folks that only comments sent to the archived
>mail list set up for comments are "officially" considered. The WG
>cannot promise to honor all requests for responses to questions posted
>on xml-dev.

Sorry about that. I couldn't find the e-mail address of the mailing list (W3C site was down) when I sent my message so had to punt into xml-dev.

>Of course, if you seriously believe that the spec is useless unless it
>allows multiple fragment bodies per package, then that is a comment you
>should make and attempt to support. We don't want to come out with a
>spec folks think is useless, but we were trying to keep it as minimal
>as possible while still addressing the problem we defined as our scope.

I found the spec very useful, timely, and clear. It was not my intention to delay, divert, or hamper the progress of the XML Fragment spec. It was also not my intention to imply that the WG overlooked something important. I withdraw my comment since it does not fall under the intended scope of the spec.

Best,

Don Park
Docuverse

From cadams at cascadecc.com Fri Mar 5 01:16:58 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
Message-ID: <000701be66a5$d03f5100$01010101@development.cascade>

Forgive me for the generic question, I'm to the point of betting the bank on XML, and I'm looking for a pat on the back, or a voice of warning....

We are starting from scratch on our next generation product, from what I've read and seen - xml seems to fit the bill (Content Management, mixed with WIDL RPC functionality seems right up our alley). I'm looking hard at ODBMS systems and laying out the DB via xml (storing xml directly). We have a wealth of in-house Java and COM/DCOM experience, but none with ODBMS or XML.

Do I understand it correctly that, at an item level, I can:
1. name it (URI)?
   a. possibly supply some security to it?
2. revision it?
3. meta-data it?
   a. can meta-data have meta-data?

Would I be foolish to base my whole object system storage on xml, or on ODBMS for that matter? Are they cooked, are they ready for real world apps?

Once again, I'm sorry for the generic question, I have read the FAQ's, the ODBMS webpages, several books etc. I'm looking for the advice of those in the trenches - Is it safe to make XML the foundation of my new product?

Should I grab a shovel, and jump in the trenches with you, or is this a deep dark hole?

Thanks in advance, for all who might reply.

Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com
Phone: 435-654-6304
fax: 435-654-1482

From tbray at textuality.com Fri Mar 5 01:28:11 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
Message-ID: <3.0.32.19990304172718.00ba7c80@pop.intergate.bc.ca>

At 06:16 PM 3/4/99 -0700, Chad Adams wrote:
>Forgive me for the generic question, I'm to the point of betting the bank on
>XML, and I'm looking for a pat on the back, or a voice of warning....

You might get more helpful help if you described the problem you're trying to solve.

On the other hand, anything that has XML and ODBMS and Java and COM/DCOM in it has to be A Good Thing; ask any analyst or prognosticator. You might have to hire some of those two-headed programmers, though.

-Tim

From MikeDacon at aol.com Fri Mar 5 01:53:52 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:39 2004
Subject: ModSax Suggestion
Message-ID:

Hi Everyone,

While SAX does a good job as an event-based interface to Parsers, it would be nice to add a few methods to receive a DOM representation back from a reference to an org.xml.sax.Parser.
Something like:

    org.w3c.dom.Document parse(InputSource is, boolean events) throws SAXException;
    org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws SAXException;
    /* the events boolean would be to turn on/off event calls. */

If a SAXDriver did not want to produce a DOM, it could either simply return null, or a method could be added like:

    boolean isDomCapable();

The above would let me use the ParserFactory to seamlessly switch between Parser implementations and get a DOM tree without building one myself. It is fruitless for me to build a DOM tree when almost all the parser implementations provide that ability. I just want a way to get at that functionality in a simple and standard way (thus SAX).

Thoughts?

- Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com

From jes at kuantech.com Fri Mar 5 01:58:51 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
In-Reply-To: <000701be66a5$d03f5100$01010101@development.cascade>
Message-ID: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>

I will not comment on the advisability of using an ODBMS, because 1) it's out of scope for this group, and 2) it's a highly religious topic. However, I will comment on the question of whether to store your data directly as XML, and confess that I don't understand the question.
XML is a great interchange language; i.e., a way to move data between systems. Generally speaking, however, each particular system has its own optimal internal representation. In an RDBMS, for example, it's tables. In a Java program it's objects, and so forth. There is not (AFAIK) yet any such thing as an XDBMS (though you could consider a file system of XML documents plus a web server to resolve URLs to those documents as such a thing). Anyway, my approach would be to store data in the most natural format for the given storage technology, and define translations to and from XML to move data between systems.

Jeff

From MikeDacon at aol.com Fri Mar 5 02:18:56 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
Message-ID: <8703ae19.36df3add@aol.com>

Hi Chad,

In a message dated 3/4/99 8:25:02 PM Eastern Standard Time, cadams@cascadecc.com writes:

> Forgive me for the generic question, I'm to the point of betting the bank on
> XML, and I'm looking for a pat on the back, or a voice of warning....

Before you bet the bank, you need to make sure you are not dependent on any part of the XML family of specifications that is not complete, or that lacks a variety of stable implementations from different vendors. XML will revolutionize the web ... but the key word there is "will". A small company cannot afford to wait for a market to mature.

As one who has been part of a small company that jumped on a technology too soon in the maturity curve (like Java 1.02), I would recommend caution.

Best wishes,

- Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com

From marcelo at mds.rmit.edu.au Fri Mar 5 03:31:37 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>
Message-ID: <36DF4CE1.7F4D3681@simdb.com>

"Jeffrey E. Sussna" wrote:

> There is not (AFAIK) yet any such thing as an XDBMS (though you could consider a file system of XML documents plus a web server to resolve URLs to those documents as such a thing).

I am continually surprised to hear remarks such as this. SIM _is_ an XDBMS (it is also an SGML, MARC, RTF, etc. database with structure and full content query capabilities). As an XDBMS it has weaknesses (it only supports predefined indexes and limited structure querying), but in some ways provides a model that is even richer than XML (it provides structure below element level, and has the concept of fields -- both of these features can be accessed through arbitrary expressions, which can be complete programs, for instance a field can contain every other word of paragraphs whose parent section has a "priority" attribute with a numerical value less than 5; it also provides arbitrary document fragmenting capabilities at the application level). And the weaknesses are not intrinsic to our model -- we have full structure queries slated for the near future (probably in the next six months).
SIM is just one of many XDBMS's available on the market, and is one of the fastest, if not _the_ fastest, and most scalable available (at the very least, it is a country mile ahead of (R|OO)DBMS's in terms of XML performance, contrary to the ever-popular notion that the latter are inherently faster than the former -- one client, after migrating their application from a popular RDBMS to SIM, removed the stop button from the query dialog because no-one ever got a chance to see it).

Anyway, enough shameless marketing: XDBMS's do exist today, and they do support high performance storage, querying and retrieval.

Cheers,
Marcelo Cantos
http://www.simdb.com/~marcelo/

From cadams at cascadecc.com Fri Mar 5 06:40:43 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun 7 17:09:40 2004
Subject: Opinions requested - more detail on what my thinking is.
Message-ID: <000101be66d3$172277a0$01010101@development.cascade>

What is AFAIK?

Maybe I've been confused by ODBMS product/sales documentation. At least three of them that I have looked at (Object Design, Ardent and Poet) seem to have fairly extensive XML API's as well as other tools that support xml storage in their databases. For example, Poet supplies a check-in/check-out utility that is used as a version control system for content management of the xml structure stored directly into the DB. They also supply a browser utility that directly accesses the DB, giving an xml tree navigation, and display - I assume they are using something like Microsoft's xml parser that renders to html and displays it.
I assume that they are providing a set of Java classes that model xml, which is then stored directly in the DB (feed it an xml file and it stores an xml object graph representing the document). I assume upon retrieval it simply streams (maybe as simple as toString()) its xml representation to the xml consumer, who parses/renders it per DTD or whatever - no conversion processing is needed in the path until the consumer, keeping speed optimal (pushing expensive parsing work to the client, relieving a busy server with time to dish up more).

Versus storing some Java non-xml object in the database: you then retrieve the object from the database and wrap the information of the object into xml, and then ship it to some xml consumer, who then parses/renders it back into the non-xml object's form. It also seems to me that if the objects that you are storing are not xml objects, you have lost the concept of Content Management or at least made it more complex to implement.

I think this is where "betting the bank" comes in.

To architect the system the second way would be to code a class per possible unique xml element. You would then need to write classes to pull these "atomic" elements together etc. Upon retrieval you would then create the xml for transport. This would isolate the DB storage and the client from xml because it would be your own animal, giving you extensibility via normal Java class programming.

To architect with xml from the bottom up puts XML Content Management at the very root of the design. You are dependent upon the xml protocol (not your own custom objects) to give you extensibility. Custom tags, meta-data, naming, versioning, whatever else xml gives you, must be versatile enough to emulate the Java class hierarchies of complex inheritance and aggregation graphs (as used in the option above).
This allows for the same authoring tools used to develop content to also develop navigation and other parameters that will be utilized at run time by the consumer of the xml. I'm assuming the big buy here is that code will only need to be written for the authoring tool and the xml consumer. All delivery from the DB to the client (even via complex n-tier systems) would require very little, or no, coding by us. Client code would parse out the displayable portions to html and display it. An applet would obtain the custom tags, meta-data etc. to make runtime decisions on what to do, based on things that could happen as the user interacts with the page.

Our need: Author, name, store, revision, reuse, retrieve - pieces of documents, that can then be combined with other documents, which can in turn be combined with others ... Documents are composed of text, video, audio, graphics ... All the goodies of style sheets etc. would be used. Custom xml tags would not be published at this time - interoperability with the world is not the driving requirement - ease of transporting documentation + special controls from an n-tier DB system running our code to a thin client running our code is.

Meta data and custom tags would be used for several reasons - for example: enhance search and selection algorithms for authoring, bury navigational control data/logic that could be used at run time to help select the next element to display, bury management hooks that would trigger WIDL RPC to other processes based on runtime states etc.

I'm also assuming that an ODBMS could deliver up complex linked, deeply nested xml documents faster than open/read/closing hundreds of possible files to assemble some document. Concurrent open file handles also present a problem ...

Have I missed the boat on what the ODBMS companies with XML Content Management Systems have to offer me?

Chad

Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com
Phone: 435-654-6304
fax: 435-654-1482

From wperry at fiduciary.com Fri Mar 5 07:23:19 1999
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun 7 17:09:40 2004
Subject: Opinions requested
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com>
Message-ID: <36DF864B.B458299D@fiduciary.com>

Marcelo Cantos wrote:

> "Jeffrey E. Sussna" wrote:
>
> > There is not (AFAIK) yet any such thing as an XDBMS
>
> I am continually surprised to hear remarks such as this. SIM _is_ an XDBMS (it is also an SGML, MARC, RTF, etc. database with structure and full content query capabilities). As an XDBMS it has weaknesses (it only supports predefined indexes and limited structure querying), but in some ways provides a model that is even richer than XML (it provides structure below element level, and has the concept of fields

In addition to this vision of an XML database, there has been much discussion of XML as a front end or a query-and-response framework for data stores, but I would argue that such applications of XML markup are not an XML database. A true XML database is shaped by the essential characteristics of XML itself: it should be freely eXtensible; it should be defined and manipulated by Markup; and it should be cast in a Document Structure within which Elements identify Data Constructs, and Attributes provide Data Characterization. Like XML itself, the XML database is fundamentally mismatched to the familiar storage and transmission frameworks of filesystem, relational table, object serialization or data stream.
In the first case, any item--document, data table, or executable--whether 'text' or binary--which is committed to storage in a filesystem is treated as a file: that is, as unitary and indivisible within the perspective and capabilities of the filesystem. A word processing program may, by opening a document, be able to identify and to manipulate as individual elements the sentences, paragraphs and chapters of that document. By contrast, the filesystem in which that document is stored reads, writes, renames, searches for or deletes the document as a whole. In XML terms, the filesystem sees the document as a single element--a root. Regardless of how many subelements we might mark up within that root, the filesystem, designed for a generic 'file-like' document, is capable of manipulating only one.

In a similar way, a relational table--and the database engine behind it--can store, index, or construct joins upon only those data records which correspond to the schema of the table. While it is possible to use SQL or proprietary database tools to rewrite an existing table to a different schema, that is substantially different from submitting to a database engine, as an entry to a particular table, a single record which follows a unique schema of its own.

In the terms of both filesystem and relational table, an XML document is effectively a BLOB, in that its specifically XML structure is outside the ability of either to discern or to make any use of. Just as, for example, with audio or video content more commonly recognized as BLOBs, the filesystem or relational database engine is obliged to invoke a particular, content-specific processor in order to understand, and then to implement, the structure conveyed by markup in every XML document. Yet this need for pre-defined, content-specific handlers obviates the benefits of XML as a general solution.
Indeed, it is not really XML at all if the markup possibilities are circumscribed by the need to conform to what a pre-defined handler can implement.

XML, by definition, is freely extensible. This fundamental characteristic trumps any hoped-for convenience in processing to be achieved by defining 'standard' tagsets, industry-wide 'domain' procedures, or normative namespace references. That this essential capability of XML is irreconcilably mismatched to conventional filesystems and relational databases means that if we are building true XML tools we are obliged to create new equivalents of the filesystem and the database which do conform to the extensible nature of XML.

'Internally', extensibility means that the structural definition of existing XML documents may be altered at any time by indicating, in a document instance, new subelements of the elements previously defined or, occasionally, consolidating--and eliminating--previously defined elements in favor of more general ones. This is not simple re-arrangement of the elements of an XML document, but a fundamental re-definition of its structure.

'Externally', the extensibility of XML means that documents, arriving from any number of (not necessarily well-known) sources, may claim recognition by our XML database engine and expect, for example, to be accepted as input data, solely because the document root element has a tag which matches one defined in our system. Of course, below that apparently familiar root element may lie subelements whose type we have not seen before, or which are structured in a different hierarchy than we expect, or whose tag names are unfamiliar variants of what we use 'internally'.

A true XML database engine must inherently and efficiently handle the demands of both this internal and external extensibility. Effectively this means that the data schema must (potentially) be rewritten with every new 'record' accepted, or altered, in the database.
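
One way to picture a schema that is rewritten with every new record is to keep nothing but the set of element paths seen so far and widen it whenever a well-formed document arrives. The sketch below does that with the JAXP DOM parser from later Java releases. The class name ImpliedSchema, the slash-separated path notation, and the sample trade documents are all assumptions made for illustration; none of this describes a product mentioned in the thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ImpliedSchema {
    // Every element path seen so far; this set is the only "schema" kept.
    private final Set<String> paths = new TreeSet<>();

    // Accepts any well-formed document (no DTD, no validation) and widens
    // the schema to fit it. Returns the paths this document introduced.
    Set<String> accept(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Element root = builder
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
            .getDocumentElement();
        Set<String> added = new TreeSet<>();
        walk(root, "", added);
        return added;
    }

    // Depth-first walk that records the path of each element.
    private void walk(Element e, String parentPath, Set<String> added) {
        String path = parentPath + "/" + e.getTagName();
        if (paths.add(path)) {
            added.add(path);
        }
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                walk((Element) c, path, added);
            }
        }
    }

    Set<String> paths() {
        return paths;
    }

    public static void main(String[] args) throws Exception {
        ImpliedSchema schema = new ImpliedSchema();
        // A coarse record, then the same record marked up to a finer granularity.
        System.out.println(schema.accept("<trade><price>10</price></trade>"));
        System.out.println(schema.accept(
            "<trade><price><amount>10</amount></price></trade>"));
        System.out.println(schema.paths());
    }
}
```

The second document is accepted even though it subdivides an element the first one treated as a leaf; the stored schema simply grows to cover it.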
That is, if we posit that those 'records' are XML documents then, as XML documents, they may be marked up at any time to a finer (or coarser) elemental granularity, and a true XML database engine must respond by reading, writing, querying, and generally processing them in sync with the markup. In the case of 'external' items--effectively data entry submitted to the XML database--the database engine must identify the schema with the data source. That is, it must understand that the markup of items originating from one source may be aliases of the markup in documents from another source and, again, may present a finer or coarser elemental granularity than analogous documents from a different source.

What is missing in this, of course, is the traditional role of the DTD for validation. It is omitted because XML 1.0 defines two very different markup and processing disciplines, distinguished by whether there is a DTD, and in order to build XML tools it is necessary to choose which of these definitions we are following. XML is routinely introduced as both of its very different selves. Newcomers are usually first lured in with the promise of unlimited markup: define your own tags which exactly suit your unique situation. Only after they have bitten for that bait are they told about the limitations imposed by the DTD. Yet the fact is that XML 1.0 defines one XML in which the DTD is omitted, and a simple and logical projection of that definition leads to an XML where markup is freely extensible and the data schema is what the sum of the markup in the system at any moment implies.

Respectfully,

Walter Perry

From cadams at cascadecc.com Fri Mar 5 08:50:40 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun 7 17:09:40 2004
Subject: Opinions requested
In-Reply-To: <36DF864B.B458299D@fiduciary.com>
Message-ID: <000301be66e5$33f1c860$01010101@development.cascade>

Walter,

Thanks for the reply. If I understand what you are saying, it does seem kind of weird that they would spec the DTD instead of just going with the schema, since that's what schema is for. Also, having taken the bait, my assumption was that any given XML document might be a mixture of both (i.e. several DTD schemes plus several free-floating custom tags with schema, all mixed into one happy root). If the consumer of the file knows what they are looking at (either DTD or custom tag wise), do it; otherwise ignore it. Is it not this simple?

Your paragraph on "XML, by definition, is freely extensible ..." as well as the following paragraph describes what I hope the XML Content Management classes supplied by the ODBMS manufacturers would do for me. I'm not sure if this is considered "overloading" the functionality of Content Management, but I believe it is one of the concepts of XML. I not only want the implied authoring flexibility of content management (arrange text, video, audio, graphics etc. into segments and sub-segments) on the data store side, but also to embed custom elements (in or around the displayable elements) that determine some runtime programmatic behavior of the consumer of the document.
As yet another overloading, but as a secondary functionality to the content management, I'm also hoping that XML can be used in what you have implied might be an impure use: that of a query-and-response mechanism. If I can avoid licensing yet another product to get mine to market (i.e. objectspace, weblogic), or coding to rmi or some other remoting technology, happy day!

Am I looking for the silver bullet that does not exist?

Chad

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of W. E. Perry
> Sent: Friday, March 05, 1999 12:23 AM
> To: xml-dev@ic.ac.uk
> Subject: Re: Opinions requested
>
> Marcelo Cantos wrote:
>
> > "Jeffrey E. Sussna" wrote:
> >
> > > There is not (AFAIK) yet any such thing as an XDBMS
> >
> > I am continually surprised to hear remarks such as this. SIM _is_ an XDBMS (it is also an SGML, MARC, RTF, etc. database with structure and full content query capabilities). As an XDBMS it has weaknesses (it only supports predefined indexes and limited structure querying), but in some ways provides a model that is even richer than XML (it provides structure below element level, and has the concept of fields
>
> In addition to this vision of an XML database, there has been much discussion of XML as a front end or a query-and-response framework for data stores, but I would argue that such applications of XML markup are not an XML database. A true XML database is shaped by the essential characteristics of XML itself: it should be freely eXtensible; it should be defined and manipulated by Markup; and it should be cast in a Document Structure within which Elements identify Data Constructs, and Attributes provide Data Characterization.
>
> Like XML itself, the XML database is fundamentally mismatched to the familiar storage and transmission frameworks of filesystem, relational table, object serialization or data stream.

From s861766 at mail86.yzu.edu.tw Fri Mar 5 09:31:53 1999
From: s861766 at mail86.yzu.edu.tw (Ephese Yang)
Date: Mon Jun 7 17:09:40 2004
Subject: A question about XSL/IE5...
Message-ID: <36DF9B9F.6A2087A4@mail86.yzu.edu.tw>

Hi:

I am new to XSL and I have some questions about XSL and IE5.

Does IE5 beta2 support the flow objects in the XSL spec?
    ex: fo:block

How can I display a figure in an XML file using XSL? Can somebody give me an example?

Thanks!

From santi at qsystems.es Fri Mar 5 09:59:10 1999
From: santi at qsystems.es (Santi)
Date: Mon Jun 7 17:09:40 2004
Subject: XML Tutorial.
Message-ID: <01BE66F7.1D57C840@Pc Santi.QSYSTEMS>

Hello,

I started with XML a few days ago.
Please, if somebody knows of any XML tutorial, or any other way to introduce me to XML, I will be grateful.

Thank you very much in advance.

Santi Rivas
santi@qsystems.es

From david.hitch at dial.pipex.com Fri Mar 5 10:21:55 1999
From: david.hitch at dial.pipex.com (David Hitchcock)
Date: Mon Jun 7 17:09:40 2004
Subject: XML tutorial
Message-ID: <01be66e9$3f1d23c0$0100007f@ketlux03>

Hi Santi

We have a number of resources, including links to tutorials, on the El.pub website at: http://www.pira.co.uk/IE . The XML material is on the standards page: http://www.pira.co.uk/IE/top011a.htm and there is also a comprehensive list of commercial and shareware products on the products page: http://www.pira.co.uk/IE/base09.htm#SGML

You may also wish to sign up for the free weekly information service, El.pub Weekly, which keeps you informed of updated news on the site. You can subscribe from the welcome page at: http://www.pira.co.uk/IE

The site is run by IESERV2, which supports the advanced electronic publishing research and development projects throughout Europe run by the Information Engineering sector of the European Commission's DG XIII/E under the Telematics Applications Programme.
Best
--> David

*********************************
David Hitchcock
IESERV2
tel: +44/ (0)181 255 7084
     +44/ (0)181 255 7085
email: david.hitch@dial.pipex.com
web: http://www.pira.co.uk/IE
*********************************
El.pub: http://www.pira.co.uk/IE
Interactive publishing - news and resources
**Join our developing community
subscribe to the *NEW* El.pub Weekly
a *free* text email update service which includes
the week's news items and associated URLs**

From jtauber at jtauber.com Fri Mar 5 12:44:20 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:40 2004
Subject: XML Tutorial.
Message-ID: <003b01be6705$d5e059a0$0300000a@othniel.cygnus.uwa.edu.au>

>I've started some days ago in XML.
>Please, if somebody knows the existence of any XML tutorial, or any other way to introduce me in XML I will be grateful.

See http://www.xmlinfo.com/newcomers/ for links introducing XML.

James

From robin at isogen.com Fri Mar 5 12:47:13 1999
From: robin at isogen.com (Robin Cover)
Date: Mon Jun 7 17:09:40 2004
Subject: XML Tutorial.
In-Reply-To: <01BE66F7.1D57C840@Pc Santi.QSYSTEMS>
Message-ID:

On Fri, 5 Mar 1999, Santi wrote:

> Hello,
>
> I've started some days ago in XML.
> Please, if somebody knows the existence of any XML tutorial, or any other way to introduce me in XML I will be grateful.

IBM has a nice XML tutorial at:

http://www.software.ibm.com/xml/education/tutorial-prog/writing.html

You may also find other useful introductions in the list at:

http://www.oasis-open.org/cover/xmlIntro.html

-robin

From mintert at irb.informatik.uni-dortmund.de Fri Mar 5 13:44:01 1999
From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert)
Date: Mon Jun 7 17:09:40 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
In-Reply-To: Your message of Wed, 03 Mar 1999 14:30:52 -0500.
Message-ID: <199903051343.OAA07560@brown.informatik.uni-dortmund.de>

> Add -c/tmp/sm/sp-1.3/pubtext/xml.soc to the command line so nsgmls reads the xml.soc catalog that tells it to use the SGML Declaration for XML, xml.dcl. That SGML Declaration tells nsgmls what hexadecimal character references look like. Without it, things like &#x2014; are being interpreted as per ISO 8879:1986, which isn't doing you or the parser any good.
>
> Regards,
>
> Tony Graham

Thanks to everybody who answered my question. Thanks to Tony. Yes, you're right, with -c... it works. I was a bit confused about that because I had 'Set the SGML_CATALOG_FILES environment variable to point to the file pubtext/xml.soc' as explained in http://www.jclark.com/sp/xml.htm. In fact I used my own old catalog file.
Now I checked the xml.dcl that I used and the one that is part of sp: unfortunately I used "ISO 8879:1986 (ENR)" instead of "ISO 8879:1986 (WWW)" :-(

Thanks for your help!

Bye,
Stefan.

+-----------------------------------------------------------+
Stefan Mintert
UniDo: mintert@irb.informatik.uni-dortmund.de
private: stefan@mintert.com
+-----------------------------------------------------------+
"let the music keep our spirits high..." (Jackson Browne)

From simonstl at simonstl.com Fri Mar 5 15:53:32 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:09:40 2004
Subject: XML MULTI-Fragment Interchange?
In-Reply-To: <004001be66a0$5aff8bd0$2ee044c6@arcot-main>
Message-ID: <199903051526.KAA17396@hesketh.net>

At 04:37 PM 3/4/99 -0800, Don Park wrote:
>>Of course, if you seriously believe that the spec is useless unless it allows multiple fragment bodies per package, then that is a comment you should make and attempt to support. We don't want to come out with a spec folks think is useless, but we were trying to keep it as minimal as possible while still addressing the problem we defined as our scope.
>
>I found the spec very useful, timely, and clear. It was not my intention to delay, divert, or hamper the progress of the XML Fragment spec. It was also not my intention to imply that the WG overlooked something important.
>
>I withdraw my comment since it does not fall under the intended scope of the spec.

While you may be withdrawing the comment because of the scope the XML Fragment group has set itself, we still need a way to represent multiple fragments, whether or not the W3C considers that appropriate to the scope of this particular working group.

Sounds like we need to get the XML streaming thread going again, and start working out ways to represent multiple documents/fragments. It seems like a real need.

Is anyone interested in this issue going to be at XTech next week? It'd be culture shock to actually talk, I know, but that might be a good place to get a spec for these streaming XML issues kickstarted.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From asmith at drumbeat.com Fri Mar 5 17:19:34 1999
From: asmith at drumbeat.com (Smith, Adrian)
Date: Mon Jun 7 17:09:41 2004
Subject: Opinions requested
Message-ID: <70B92603FC2CD21197D600609778A80D0AE64D@elemental2>

There actually is an XDBMS, and it predates XML: it dates back to around 1965/1966. The database was called "IMS", for Information Management System; it was created by IBM and used a hierarchical model for the data. It had all the same characteristics as XML, with almost the exact same set of constructs and shortcomings.

Thanks!
Adrian

Worthless.
-Sir George Biddell Airy, KCB, MA, LLD, DCL, FRS, FRAS (Astronomer Royal of Great Britain), estimating for the Chancellor of the Exchequer the potential value of the "analytical engine" invented by Charles Babbage, September 15, 1842.

> -----Original Message-----
> From: Jeffrey E. Sussna [SMTP:jes@kuantech.com]
> Sent: Thursday, March 04, 1999 5:57 PM
> To: 'Chad Adams'; xml-dev@ic.ac.uk
> Subject: RE: Opinions requested
>
> I will not comment on the advisability of using an ODBMS, because 1) it's out of scope for this group, and 2) it's a highly religious topic. However, I will comment on the question of whether to store your data directly as XML, and confess that I don't understand the question. XML is a great interchange language; i.e., a way to move data between systems. Generally speaking, however, each particular system has its own optimal internal representation. In an RDBMS, for example, it's tables. In a Java program it's objects, and so forth. There is not (AFAIK) yet any such thing as an XDBMS (though you could consider a file system of XML documents plus a web server to resolve URL's to those documents as such a thing). Anyway, my approach would be to store data in the most natural format for the given storage technology, and define translations to and from XML to move data between systems.
>
> Jeff
>
> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Chad Adams
> Sent: Thursday, March 04, 1999 5:17 PM
> To: xml-dev@ic.ac.uk
> Subject: Opinions requested
>
> Forgive me for the generic question, I'm to the point of betting the bank on XML, and I'm looking for a pat on the back, or a voice of warning....
>
> We are starting from scratch on our next generation product, from what I've read and seen - xml seems to fit the bill (Content Management, mixed with WIDL RPC functionality seems right up our alley). I'm looking hard at ODBMS systems and laying out the DB via xml (storing XML directly). We have a wealth of in-house Java and COM/DCOM experience, but none with ODBMS or XML.
>
> Do I understand it correctly that, at an item level, I can:
> 1. name it (URI)?
>    a. possibly supply some security to it?
> 2. revision it?
> 3. meta-data it?
>    a. can meta-data have meta-data?
>
> Would I be foolish to base my whole object system storage on xml, or on ODBMS for that matter? Are they cooked, are they ready for real world apps?
>
> Once again, I'm sorry for the generic question, I have read the FAQ's, the ODBMS webpages, several books etc. I'm looking for the advice of those in the trenches - Is it safe to make XML the foundation of my new product?
>
> Should I grab a shovel, and jump in the trenches with you, or is this a deep dark hole?
>
> Thanks in advance, for all who might reply.
>
> Chad Adams
> Payback Training Systems
> Email: cadams@cascadecc.com
> Phone: 435-654-6304
> fax: 435-654-1482

From Daniel.Veillard at w3.org Fri Mar 5 17:20:30 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun 7 17:09:41 2004
Subject: XML MULTI-Fragment Interchange?
In-Reply-To: <199903051526.KAA17396@hesketh.net>; from Simon St.Laurent on Fri, Mar 05, 1999 at 10:29:09AM -0500
References: <004001be66a0$5aff8bd0$2ee044c6@arcot-main> <199903051526.KAA17396@hesketh.net>
Message-ID: <19990305121926.E22737@w3.org>

On Fri, Mar 05, 1999 at 10:29:09AM -0500, Simon St.Laurent wrote:
> At 04:37 PM 3/4/99 -0800, Don Park wrote:
> >I withdraw my comment since it does not fall under the intended scope of the spec.
>
> While you may be withdrawing the comment because of the scope the XML Fragment group has set itself, we still need a way to represent multiple fragments, whether or not the W3C considers that appropriate to the scope of this particular working group.
>
> Sounds like we need to get the XML streaming thread going again, and start working out ways to represent multiple documents/fragments. It seems like a real need.

Hum, I have been following the streaming/fragment thread. However, I have the feeling that even multiple fragment body extensions would not solve the problem you were facing. If I didn't get the discussion wrong, it seems that you were rather trying to make one very big document (i.e. a stream) from multiple sources, while the scope of the fragment work was just the opposite, i.e. how to extract and ship a piece of a very big document.

> Is anyone interested in this issue going to be at XTech next week?
> It'd be culture shock to actually talk, I know, but that might be a good place to get a spec for these streaming XML issues kickstarted.

I will be around,

Daniel

--
[Yes, I have moved back to France !]
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes | Today's Bookmarks :
Tel : +33 476 615 257 | 655, avenue de l'Europe | Linux, WWW, rpmfind,
Fax : +33 476 615 207 | 38330 Montbonnot FRANCE | rpm2html, XML,
http://www.w3.org/People/W3Cpeople.html#Veillard | badminton, and Kaffe.

From crism at oreilly.com Fri Mar 5 17:26:09 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun 7 17:09:41 2004
Subject: A question about XSL/IE5...
In-Reply-To: <36DF9B9F.6A2087A4@mail86.yzu.edu.tw> (message from Ephese Yang on Fri, 05 Mar 1999 16:53:52 +0800)
Message-ID: <199903051532.KAA27149@ruby.ora.com>

[Ephese Yang]
> I am new in xsl and I have some question about xsl and IE5.
> Does IE5 beta2 support the flow object in xsl spec.?
>     ex: fo:block

IE5 does not support XSL formatting objects. Tell Microsoft you are interested that it do so. XSL questions are best discussed on the xsl-list.

> How can I display a figure in xml file using xsl?
> Can somebody give me an example?

Since MSIE can only display HTML, try creating an HTML element in your stylesheet.

-Chris
--
http://www.oreilly.com/people/staff/crism/
+1.617.499.7487
90 Sherman Street, Cambridge, MA 02140 USA

From jmcdonou at library.berkeley.edu Fri Mar 5 17:48:34 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun 7 17:09:41 2004
Subject: Opinions requested
In-Reply-To: <36DF4CE1.7F4D3681@simdb.com>
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>
Message-ID: <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>

At 02:17 PM 3/5/1999 +1100, Marcelo Cantos wrote:
>>"Jeffrey E. Sussna" wrote:
>>
>> There is not (AFAIK) yet any such thing as an XDBMS (though you could consider a file system of XML documents plus a web server to resolve URL's to those documents as such a thing).
>
>I am continually surprised to hear remarks such as this. SIM _is_ an XDBMS (it is also an SGML, MARC, RTF, etc. database with structure and full content query capabilities).

I think one of the reasons you hear these kinds of remarks is that the terminology surrounding these systems is used differently by different folks. For instance, from what I know of SIM, I wouldn't call it a DBMS of any kind, as I don't believe (I could be wrong) it supports referential integrity constraints, concurrency control, recoverable transactions, and other features I would expect out of a reasonable DBMS. Granted it has hooks that allow you to get it to work with a DBMS that can provide all that, but that doesn't make SIM itself a DBMS. I would instead class SIM as an information retrieval system, and a pretty damned good one at that.
However, SIM performs as well as it does in great part because it's not doing the extra work that a DBMS should do, and which adds greatly to retrieval time from database systems (as well as limiting their ability to handle complex data formats gracefully).

This isn't to knock SIM; anyone who needs a flexible information retrieval system should be taking a very serious look at it. The Z39.50 support alone puts it way ahead of the market as far as I'm concerned. But I don't think SIM is evidence that there are DBMS systems that handle SGML/XML well; I don't think they do. Oracle may very well be getting there with its latest release, but I suspect there's still a lot of work to be done there.

Jerome McDonough -- jmcdonou@library.Berkeley.EDU | (......)
Library Systems Office, 386 Doe, U.C. Berkeley | \ * * /
Berkeley, CA 94720-6000 (510) 642-5168 | \ <> /
"Well, it looks easy enough...." | \ -- /
SGNORMPF!!! -- From the Famous Last Words file | ||||

From tbray at textuality.com Fri Mar 5 21:20:14 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:41 2004
Subject: Tell the world about your new language
Message-ID: <3.0.32.19990305131959.00b65280@pop.intergate.bc.ca>

Check out: http://www.usenix.org/events/dsl99/

-Tim

From db at eng.sun.com Fri Mar 5 22:41:58 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:09:41 2004
Subject: ModSax Suggestion
References:
Message-ID: <36E05C67.607F4C27@eng.sun.com>

Interesting suggestion for a big hole in the parts of the Java API set that are more or less "standard" at this point -- SAX and DOM.

One comment though: I've found that it's important to be able to have options controlling how the DOM tree is built. For example, whether to discard ignorable spaces, or do namespace conformance enforcement, or try to get CDATA sections (comments, etc). Accordingly, I think being able to do a bit more than this will be important.

- Dave

MikeDacon@aol.com wrote:
>
> Hi Everyone,
>
> While SAX does a good job as an event-based interface to Parsers, it would be nice to add a few methods to receive a DOM representation back from a reference to an org.xml.sax.Parser.
>
> Something like:
>
> org.w3c.dom.Document parse(InputSource is, boolean events) throws SAXException;
> org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws SAXException;
> /* the events boolean would be to turn on/off event calls. */
>
> If a SAXDriver did not want to produce a DOM, it could either simply return a null or a method added like:
>
> boolean isDomCapable();
>
> The above would let me use the ParserFactory to seamlessly switch between Parser implementations and get a DOM tree without building one myself. It is fruitless for me to build a DOM tree when almost all the parser implementations provide that ability.
I just want a way to get > at that functionality in a simple and standard way (thus SAX). > > Thoughts? > > - Mike > ----------------------------------------------- > Michael C. Daconta > Author of Java 2 and JavaScript for C/C++ Programmers > Author of C++ Pointers and Dynamic Memory Management > Sun Certified Java Programmer and Developer > http://www.gosynergy.com From zmin at atpage.com Sat Mar 6 01:04:43 1999 From: zmin at atpage.com (min zheng) Date: Mon Jun 7 17:09:41 2004 Subject: Accessing DTD info. in IE5 References: <001701be65c7$febb5620$47be1990@mscardin-pc.us.oracle.com> Message-ID: <002d01be676d$eee8e850$f66f6f0a@atpage> Is DTD information accessible through the IE5 DOM? I took it for granted because I could do it with the old MSXML for Java used in IE4. However, when I really wanted to access DTD info in IE5, I couldn't find it anywhere. Is DTD information exposed in the IE5 DOM? Thanks, Min
From marcelo at mds.rmit.edu.au Sat Mar 6 06:45:35 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:41 2004 Subject: Opinions requested In-Reply-To: <36DF864B.B458299D@fiduciary.com>; from W. E. Perry on Fri, Mar 05, 1999 at 02:22:51AM -0500 References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <36DF864B.B458299D@fiduciary.com> Message-ID: <19990306153959.A22308@io.mds.rmit.edu.au> Thank you, Walter, for the erudite response. I am left in a bit of a quandary as to how or even whether to respond. This is in large part due to the fact that, while your post was in response to mine, it is not immediately clear to me whether you are addressing my comments specifically or rather the general theme of this thread. Having the vague impression (though no firm conviction) that it is in response to my claims that you waxed eloquent on the theme of what defines an XML database, I will proceed to provide commentary, and occasionally direct response/rebuttal, to a smattering of your points. My humble apologies, Walter, if I have in any way misconstrued your post. On Fri, Mar 05, 1999 at 02:22:51AM -0500, W. E. Perry wrote: > Marcelo Cantos wrote: > > > "Jeffrey E. Sussna" wrote: > > > > > There is not (AFAIK) yet any such thing as an XDBMS > > > > I am continually surprised to hear remarks such as this. SIM _is_ > > an XDBMS (it is also an SGML, MARC, RTF, etc. database with > > structure and full content query capabilities).
As an XDBMS it > > has weaknesses (it only supports predefined indexes and limited > > structure querying), but in some ways provides a model that is > > even richer than XML (it provides structure below element level, > > and has the concept of fields > > In addition to this vision of an XML database, there has been much > discussion of XML as a front end or a query-and-response framework > for data stores, but I would argue that such applications of XML > markup are not an XML database. A true XML database is shaped by the > essential characteristics of XML itself: it should be freely > eXtensible; it should be defined and manipulated by Markup; and it > should be cast in a Document Structure within which Elements > identify Data Constructs, and Attributes provide Data > Characterization. It seems here that I may have provided an incorrect characterisation of what we do, and hence given Walter cause to provide some qualifiers on anyone wishing to define themselves as an XML database. On this point, I must make it quite clear that SIM is _not_ an XML front end to a data store. It is an XML (etc.) document repository. One additional, crucial point is that SIM _is_ extensible (though I will qualify this presently). It can be defined to accept markup to any degree of strictness or laxity (within the bounds of well-formedness or validity, of course). It can be setup to accept any and all markup and do _something_ intelligent with it. It can also be configured to make stringent demands (well in excess of the DTD, both with respect to strictness and complexity of constraints) of its inputs. This quality of SIM renders the product amenable to both of the major application streams of XML: data and documents. It can provide strict data validation as well as extensibility. Now, by way of qualification, SIM does not provide free-form runtime extensibility (runtime from the administrator's perspective, not ours). 
Rather it provides the application developer with the requisite tools to define, at design time, what structures will be supported. For instance, you cannot, with SIM, perform queries such as, "find me all sections containing subsections with an attribute of security="public" and at least one paragraph with fewer than four words in it" The semantic complexity of such a query is beyond the scope of our product. However, if one were to know in advance that queries about the minimum paragraph length in public subsections will be commonplace in the particular application one is developing, then SIM could, at design time, be told to create an appropriate index and then the above query could, indeed, be performed. In short, SIM _is_ extensible, but the extensibility is bound somewhat earlier than runtime. In practice, clients never complain about this quality. In fact, it is usually a benefit rather than a hindrance, for the same reason that compile time type checking is a good thing to have in a programming language. I also take issue with Walter's remark that an XML database should be manipulated by and defined through the medium of XML. This sounds analogous to suggesting that relational databases should be defined and manipulated by markup. Now, it is true that relational schema are, themselves, typically stored as relations (one will, for example, find a ".TABLES" table, a ".FIELDS" table, a ".INDEXES" table, etc. inside a database). However, it seems to me patently absurd to suggest that SQL (whether DML or DDL) be expressed in terms of tuples and relations. Now, while it does not seem likewise absurd to suggest that XML queries and data definition constructs be defined as XML, the truth of such a suggestion is anything but self-evident. Why should one not use an SQL-like language to define and query XML databases? There may or may not be merit in such an approach, but it seems no more or less appropriate than a query/data definition language cast in XML. 
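For illustration, the example query above (sections containing a public subsection with at least one paragraph of fewer than four words) can be sketched in a few lines of Python, together with the kind of precomputed, design-time index that would answer it without scanning every paragraph. This is only a sketch using the standard library's ElementTree; it says nothing about how SIM itself works, and the element names (`section`, `subsection`, `para`), the `id` attribute, and the sample document are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element and attribute names are assumptions.
SAMPLE = """
<doc>
  <section id="s1">
    <subsection security="public">
      <para>Too short here</para>
      <para>This paragraph has plenty of words in it</para>
    </subsection>
  </section>
  <section id="s2">
    <subsection security="public">
      <para>Every paragraph here is long enough</para>
    </subsection>
  </section>
  <section id="s3">
    <subsection security="secret">
      <para>Short one</para>
    </subsection>
  </section>
</doc>
"""

def word_count(para):
    # Count whitespace-separated words in all text beneath the paragraph.
    return len("".join(para.itertext()).split())

def matching_sections(root, max_words=4):
    """Ad hoc query: walk the tree and test every paragraph at query time."""
    hits = []
    for section in root.iter("section"):
        for sub in section.findall("subsection[@security='public']"):
            if any(word_count(p) < max_words for p in sub.iter("para")):
                hits.append(section.get("id"))
                break
    return hits

def build_index(root):
    """Design-time index: section id -> shortest paragraph in its public subsections."""
    index = {}
    for section in root.iter("section"):
        lengths = [word_count(p)
                   for sub in section.findall("subsection[@security='public']")
                   for p in sub.iter("para")]
        if lengths:
            index[section.get("id")] = min(lengths)
    return index

root = ET.fromstring(SAMPLE)
index = build_index(root)
print(matching_sections(root))
print([sid for sid, shortest in index.items() if shortest < 4])
```

Both approaches give the same answer; the difference is only in when the work is done, which is the trade-off between runtime and design-time extensibility discussed above.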
Indeed, many of the query language position papers at W3C do not use XML syntax. Data definition and query languages are meta-constructs. They are not part of the data, but rather operate on the data and structures. This suggests that while it may be possible to fold the system in on itself by expressing meta-structure as data, it would be unwise to proceed down this path in _a priori_ fashion. (Now, have I completely missed Walter's point here? I'm not sure.) > Like XML itself, the XML database is fundamentally mismatched to the > familiar storage and transmission frameworks of filesystem, > relational table, object serialization or data stream. In the first > case, any item--document, data table, or executable--whether 'text' > or binary--which is committed to storage in a filesystem is treated > as a file: that is, as unitary and indivisible within the > perspective and capabilities of the filesystem. A word processing > program may, by opening a document, be able to identify and to > manipulate as individual elements the sentences, paragraphs and > chapters of that document. By contrast, the filesystem in which > that document is stored reads, writes, renames, searches for or > deletes the document as a whole. In XML terms, the filesystem sees > the document as a single element--a root. Regardless of how many > subelements we might mark up within that root, the > filesystem--designed for a generic 'file-like' document, is capable > of manipulating only one. One must be careful, here, to discriminate between interfaces and implementations. I basically agree with all of Walter's points in the above paragraph, but would add that many systems store conceptual XML documents as files. Our system uses a highly tuned variable length record manager (unsurprisingly named the VLRM) to store documents and fragments of any size in a highly efficient manner (both in terms of size and speed). Consequently, we store entire documents for the most part.
If parsing time starts to weigh heavily due to retrieval of excessively large documents (the entire Australian Tax Legislation, say, or a complete Boeing Aircraft Maintenance Manual), then we fragment the documents to a level where parsing is no longer a bottleneck. In all of this, however, SIM can always treat the XML as XML. The developer always sees trees, not files or BLOB's. It doesn't matter how it is stored in the background; that is an implementation issue. The one caveat with our product is that fragmented documents cannot be treated as a conceptual whole without physically rejoining the parts. This is one thing which OODBMS's do better than we do at present, though we are looking at ways to provide that additional level of abstraction (we are also considering the usefulness of doing so, since fragments are more commonly the unit of interest, rather than the entire document). > In the terms of both filesystem and relational table, an XML > document is effectively a BLOB, in that its specifically XML > structure is outside the ability of either to discern or to make any > use of. Just as, for example, with audio or video content more > commonly recognized as BLOBs, the filesystem or relational database > engine is obliged to invoke a particular, content-specific processor > in order to understand, and then to implement, the structure > conveyed by markup in every XML document. Yet this need for > pre-defined, content-specific handlers obviates the benefits of XML > as a general solution. Indeed, it is not really XML at all if the > markup possibilities are circumscribed by the need to conform to > what a pre-defined handler can implement. I disagree with the last sentence above. Not from the pedagogical perspective (which seems quite evident in Walter's prose, and with which I largely sympathise), but from the pragmatic perspective.
Yes, the purist will rightly decry the notion of predefinition of structure in an ostensibly XML-friendly environment, but the end-user comes along and not only accepts, but vociferously demands that his environment be constrained. The user doesn't want flexibility to store anything; she wants the flexibility only to store what she wants to store. The serious user of XML does not have a heterogeneous collection of vaguely defined documents with a motley crew of DTD's and well-formed markup. Most users have a well defined data set for which they want to define efficient structures for storage and retrieval (if they aren't interested in efficiency then their problem isn't particularly interesting -- any tool will do). In the few cases where they do have arbitrary structure to deal with, more often than not they are only interested in the content and are likely to throw the structure away. After all, what is the use of structure if you don't know, say, whether the prolog element contains an abstract element, or whether "date" attributes refer to creation time, last modification time, or effectivity (or, worse still, whether they are in U.S., Australian or international format)? In the real world, I suspect that cases where structure is arbitrary but important will be few and far between. This is borne out by the almost complete absence of demand for arbitrary structure querying capability from our clients or potential clients. It just never seems to be an issue. A qualifier is also in order for the above remarks, lest there be a misunderstanding. XML tools, in general, must be extensible and accept any and all valid and/or well-formed inputs. My comments specifically address the issue of repositories (DBMS's). XML may be extensible, but it, too, expresses the notion of constraint through the concept of DTD's. Databases, likewise, not only can, but should constrain the inputs, both for simplicity and efficiency.
Perhaps this is, after all, what Walter meant when repudiating the idea of predefined handlers. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ From marcelo at mds.rmit.edu.au Sat Mar 6 08:46:24 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:41 2004 Subject: Opinions requested In-Reply-To: <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>; from Jerome McDonough on Fri, Mar 05, 1999 at 09:37:29AM -0800 References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu> Message-ID: <19990306154022.B22308@io.mds.rmit.edu.au> On Fri, Mar 05, 1999 at 09:37:29AM -0800, Jerome McDonough wrote: > At 02:17 PM 3/5/1999 +1100, Marcelo Cantos wrote: > >>"Jeffrey E. Sussna" wrote: > >> > >> There is not (AFAIK) yet any such thing as an XDBMS (though you > >> could consider a file system of XML documements plus a web server > >> to resolve URL's to those documents as such a thing). > > > >I am continually surprised to hear remarks such as this. SIM _is_ > >an XDBMS (it is also an SGML, MARC, RTF, etc. database with > >structure and full content query capabilities). > > I think one of the reasons you hear these kinds of remarks is that > the terminology surrounding these systems is used differently by > different folks.
For instance, from what I know of SIM, I wouldn't > call it a DBMS system of any kind, as I don't believe (I could be > wrong) it supports referential integrity constraints, concurrency > control, recoverable transactions, and other features I would expect > out of a reasonable DBMS. Granted it has hooks that allow you to > get it to work with a DBMS that can provide all that, but that > doesn't make SIM itself a DBMS. I would instead class SIM as an > information retrieval system, and a pretty damned good one at that. > However, SIM performs as well as it does in great part because it's > not doing the extra work that a DBMS should do, and which add > greatly to retrieval time from database systems (as well as limiting > their ability to handle complex data formats gracefully). Thank you, Jerome, for the candid and quite fair assessment of SIM. On the point of referential integrity, you are quite right: there is no built-in support. With our new event hook mechanism (similar to the triggers found in most relational systems), however, one will be able to attach event handlers to various update operations and prevent them from completing in the event of a referential integrity violation. This probably wouldn't work together with concurrency controls (though this will be moot when transaction support comes in). However, in one particular project, we have put in referential integrity control using a single query per reference as part of the check-in mechanism. Another project generates references only dynamically, at query time, effectively with a single reverse-reference index lookup. The problem with referential integrity checking is that sometimes you need to be able to manage broken data, and this is more often the case with documents than with the more typical applications of RDBMS technology (financial transactions, etc.).
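To make the idea of a check-in hook concrete, here is a minimal Python sketch of a repository that runs hooks before an update completes, with one existence lookup per reference. It is purely illustrative and is not SIM's event hook API: `Repository`, `integrity_hook`, `BrokenReference`, and the `ref` attribute are hypothetical names invented for the example. The `strict=False` mode reflects the point above that document collections sometimes have to tolerate broken references rather than reject them.

```python
import xml.etree.ElementTree as ET

class BrokenReference(Exception):
    """Raised by a hook to stop a check-in from completing."""

class Repository:
    def __init__(self):
        self.documents = {}   # document id -> XML text
        self.hooks = []       # callables run before a check-in completes
        self.broken = []      # (source id, missing target id) pairs that were tolerated

    def check_in(self, doc_id, text):
        for hook in self.hooks:
            hook(self, doc_id, text)   # a hook vetoes the update by raising
        self.documents[doc_id] = text

def integrity_hook(strict=True):
    """Build a hook that performs one existence lookup per reference."""
    def hook(repo, doc_id, text):
        for element in ET.fromstring(text).iter():
            target = element.get("ref")   # hypothetical reference attribute
            if target is None or target == doc_id or target in repo.documents:
                continue
            if strict:
                raise BrokenReference(f"{doc_id} refers to missing document {target}")
            repo.broken.append((doc_id, target))   # keep the broken data, but record it
    return hook

repo = Repository()
repo.hooks.append(integrity_hook(strict=True))
repo.check_in("a", "<doc/>")
repo.check_in("b", '<doc><link ref="a"/></doc>')
try:
    repo.check_in("c", '<doc><link ref="missing"/></doc>')
except BrokenReference as err:
    print("rejected:", err)
```

With `integrity_hook(strict=False)` the third check-in would succeed and the dangling reference would simply be logged in `repo.broken` for later repair.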
Of course when you store whole documents instead of unnaturally breaking them up into millions of tiny pieces, you don't have nearly the same referential integrity problems in the first place. With respect to concurrency control you are mistaken. We support short term locks, which prevent individual records, at least, from ever entering an undefined state under concurrent loads. These locks can be held as long as desired, but cannot persist beyond the lifetime of a session. Long term locks (which outlive the session) are in the offing, and stand a good chance of getting into release 3.0 (scheduled for mid-year, I think -- it could be earlier). Transactions we most definitely do not support. We do, however, provide recovery through log files, which record server activity and can be played back in a batch load operation. It's a little crude (you make the server read-only, back it up, and start a new log file. When you crash, restore the last backup and replay the log) but it is safe and effective. More important than any specifics, however, is the issue of what you call a DBMS. To me, a DBMS is a database management system (seems painfully obvious, but I think it bears repeating). You may argue that a product is not a DBMS if it does not support feature X, and I don't entirely disagree. When one talks of a DBMS one is conjuring up a certain image in the mind of the listener, and that image may well include feature X. To be fair to SIM, however, the essence of a DBMS is that it manages a collection of data. If it doesn't support transactions, this does not entail that it does not manage data. Rather it simply has limits on the way the data is managed (i.e. it doesn't manage data as well as one would like). You clearly believe that transaction support is part of the essence of what makes a DBMS. I disagree, indeed, I profoundly disagree. There is nothing in the concept of a database that mandates any such requirement. 
Rather I would say that transaction support is an important issue for any _good_ DBMS. Likewise for referential integrity and concurrency (and, for that matter, support for declarative queries, use of indexes, a rich set of fundamental data types, etc.). If I recall correctly, dBase III was generally acknowledged to be a DBMS though it lacked most of these requirements, and could barely even call itself relational! Now, don't get me wrong here. I am not trying to defend SIM by deprecating the features you demand. They are very important and highly desirable features in a DBMS (the fact that they are amazingly difficult to do well is of no concern to the user). Their absence in SIM is of ongoing concern to us. Furthermore it is far from satisfying to be able to insist that SIM fits into a strict, minimalist definition of a DBMS if it lacks features that are typically associated with DBMS's. One of the primary reasons they are not in at this stage is that, as you pointed out so well, the primary focus of SIM has always been performance and scalability; and all of the aforementioned features can have a significant impact on performance if implemented naively (transaction support, in particular, is an onerous requirement, though by no means untenable). SIM is not a full-featured DBMS. But it is not a mere information retrieval system either. It does support recovery (though not full transaction support), it does support concurrency, and it can be coerced to support referential integrity. It also bears mentioning that you don't have to talk out to an RDBMS to do any of these things. In fact the only use I have heard of for our ODBC capability is one client who wanted to access a personnel database for authentication purposes (it had nothing to do with the database server per se). I guess this all boils down to what's in a name. At the end of the day, it is far more important to know what a product does and does not do than what you call it.
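The backup-and-replay recovery procedure described earlier in this message (make the server read-only, back it up, start a new log; after a crash, restore the last backup and replay the log) can be sketched as follows. This is a toy in-memory model written for illustration only, not a description of SIM's log format or server: the `Server` class, the JSON-lines log, and the key/value records are all assumptions.

```python
import copy
import json

class Server:
    def __init__(self):
        self.records = {}
        self.log = []            # one JSON line per update since the last backup
        self.read_only = False

    def update(self, key, value):
        if self.read_only:
            raise RuntimeError("server is read-only")
        # Write the log entry before applying the change, so the log never lags the data.
        self.log.append(json.dumps({"key": key, "value": value}))
        self.records[key] = value

    def backup(self):
        """Freeze updates, snapshot the data, and start a fresh log."""
        self.read_only = True
        snapshot = copy.deepcopy(self.records)
        self.log = []
        self.read_only = False
        return snapshot

def recover(snapshot, log_lines):
    """Restore the last backup, then replay the logged updates in order."""
    server = Server()
    server.records = copy.deepcopy(snapshot)
    for line in log_lines:
        entry = json.loads(line)
        server.update(entry["key"], entry["value"])
    return server

server = Server()
server.update("doc1", "draft")
snapshot = server.backup()
server.update("doc2", "new")
server.update("doc1", "final")
surviving_log = list(server.log)   # stands in for the log file left on disk after a crash
recovered = recover(snapshot, surviving_log)
print(recovered.records)
```

Note what this does and does not give you: every logged update is recoverable, but there is no grouping of several updates into an all-or-nothing unit, which is the distinction between recovery and transaction support drawn above.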
> This isn't to knock SIM; anyone who needs a flexible information > retrieval system should be taking a very serious look at it. The > Z39.50 support alone puts it way ahead of the market as far as I'm > concerned. But I don't think SIM is evidence that there are DBMS > systems that handle SGML/XML well; I don't think they do. Oracle > may very well be getting there with its latest release, but I > suspect there's still a lot of work to be done there. I am sceptical that any RDBMS vendor can come to the party in terms of performance. Past attempts to try to force text into a relational, table or object based paradigm have not reaped great success (Oracle's ConText comes to mind as an example of how forcing a square peg into a round hole requires sacrificing the edges of performance). I would be surprised if any of the major database vendors would be prepared to venture away from their core competency (the relational model) to address the performance issues. But why parse XML to split it up into tables when you can store the XML directly? Why build thousands of index entries to system generated element ID's so that you can do join's to build up an XML fragment, when you can build a single index and pull the fragment in its entirety out of the document from which it comes? Why use inferior content indexing technology taking up to 10 to 20 times the size of the data being indexed when you can use compressed inverted files which take between 15% (document level index) and 50% (multi-level word position index) the size of the data? And all this with faster update speed than many standard text retrieval systems. There is an additional overhead in the relational paradigm which has nothing to do with transactions, concurrency control, or referential integrity checking. That cost is that relational tables do not map cleanly onto hierarchical documents (or data collections to pick up on another thread). 
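As a rough illustration of why inverted files compress well, the sketch below builds a document-level inverted index in Python and stores each posting list as gaps between ascending document ids, encoded as variable-length bytes. Gap encoding with varints is a common textbook technique chosen here for the example; the message does not say which compression scheme SIM uses, and this sketch makes no attempt to reproduce the size figures quoted above. The function names and the sample documents are hypothetical.

```python
from collections import defaultdict

def encode_varint(n):
    """Encode a non-negative integer in 7-bit groups, low group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varints(data):
    """Decode a concatenation of varints back into a list of integers."""
    values, current, shift = [], 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            values.append(current)
            current, shift = 0, 0
    return values

def build_inverted_index(docs):
    """Map each word to its gap-encoded list of document ids (document-level index)."""
    postings = defaultdict(list)
    for doc_id in sorted(docs):   # ascending ids keep the gaps small and positive
        for word in set(docs[doc_id].lower().split()):
            postings[word].append(doc_id)
    index = {}
    for word, ids in postings.items():
        gaps = [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]
        index[word] = b"".join(encode_varint(g) for g in gaps)
    return index

def lookup(index, word):
    """Return the document ids containing word by summing the stored gaps."""
    ids, total = [], 0
    for gap in decode_varints(index.get(word.lower(), b"")):
        total += gap
        ids.append(total)
    return ids

docs = {1: "xml database", 5: "xml parser", 300: "relational database"}
index = build_inverted_index(docs)
print(lookup(index, "xml"), lookup(index, "database"))
```

A word-position index would store, for each document, a second gap-encoded list of positions within the document, which is why it costs more space than the document-level form.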
Every fragment you insert, update, or remove has to be taken apart to map it onto some underlying representation, modified piece by piece, and then reassembled to be delivered. I strongly disagree that SIM doesn't handle SGML/XML well. In the five years of successfully selling SIM, no customer has ever replaced SIM with another product. In fact none of them have even mentioned to us that they ever considered replacing SIM. This in itself is remarkable given that, because our customers use SIM to store their SGML/XML natively, they can get the data out of SIM much more easily than if it were mapped onto some proprietary internal database format. People buy SIM because it is flexible enough to do whatever they need to do with their XML/SGML. It doesn't force them to adopt a non-XML/SGML approach. It doesn't force them to translate their data into some proprietary format in order to interact with the data. It deals directly with the XML. Precisely what the original post was asking for, in fact. Cheers, Marcelo P.S.: Some thanks go to my colleague, Tim Arnold-Moore, for providing some of the content (including the closing) for this article. -- http://www.simdb.com/~marcelo/
From david at megginson.com Sat Mar 6 11:31:29 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:42 2004 Subject: Opinions requested In-Reply-To: <19990306154022.B22308@io.mds.rmit.edu.au> References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu> <19990306154022.B22308@io.mds.rmit.edu.au> Message-ID: <14049.4226.895273.99370@localhost.localdomain> Marcelo Cantos writes: > More important than any specifics, however, is the issue of what you > call a DBMS. To me, a DBMS is a database management system (seems > painfully obvious, but I think it bears repeating). You may argue > that a product is not a DBMS if it does not support feature X [...] A DBMS is something that manages data *and* passes the ACID test (Atomicity, Consistency, Isolation and Durability). This isn't a question of "I want feature X" -- the ACID test is what distinguishes a DBMS from, say, the Unix file system (which can also manage data). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From wperry at fiduciary.com Sat Mar 6 15:09:50 1999 From: wperry at fiduciary.com (W. E.
Perry) Date: Mon Jun 7 17:09:42 2004 Subject: Opinions requested References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu> <19990306154022.B22308@io.mds.rmit.edu.au> <14049.4226.895273.99370@localhost.localdomain> Message-ID: <36E14530.10423DC8@fiduciary.com> David Megginson wrote: > A DBMS is something that manages data *and* passes the ACID test > (Atomicity, Consistency, Isolation and Durability). This isn't a > question of "I want feature X" -- the ACID test is what distinguishes > a DBMS from, say, the Unix file system (which can also manage data). I am going to be the old fogey here, with experience of databases going back to IMS and R: ACID is (one possible) test of a transaction processor, not of a database. It was precisely the misguided emphasis upon ACID qualities which bloated the relational model into the transaction-oriented behemoths sold today. For at least ten years we have tried to undo that direction by re-imagining the original relational concept as the data warehouse and, when that too became too bloated, the data mart. There is an opportunity with a true XML database to describe, and implement, transactions without surrendering to the siren song of two-phase commit. The key is understanding that there is no obvious or natural boundary to a transaction. Because of the inherent differences in the perspective of every participant to a transaction, each of them will describe a different set of elements to the transaction and different specific relationships among them. In the data world there is no omniscience which sees the transaction whole: to imagine it as a single, identifiably boundable unit is to deprecate the central task of each participant--to construct a transaction which is understandable to and processable by his own system. That is an ongoing implementational task, not just a conceptual one.
In the real world it resolves to this: how do I get what I have to become what you need? What I have and what you need are both structures, and the two of them will incorporate some set of similar or analogous elements, which gives them the common terms on which they can define and communicate the transaction which they are attempting to execute. The definition and the maintenance of each of these structures is the role of the database. Yet each of those structures is peculiarly unique, and both are ephemeral in the specific terms of the transaction which they facilitate. Yes, the transaction, once executed, endures. But the terms in which that durability is communicated--indeed the very substance as which it is preserved--may be utterly different in the systems (and, I would hope, in the databases) of each of the participants. Precisely what each of those systems, or databases, does not exhibit are the ACID qualities through which some would hope to define the identity, uniqueness and permanence of that transaction. Respectfully, Walter Perry From wperry at fiduciary.com Sat Mar 6 17:19:49 1999 From: wperry at fiduciary.com (W. E. Perry) Date: Mon Jun 7 17:09:42 2004 Subject: Opinions requested References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <36DF864B.B458299D@fiduciary.com> <19990306153959.A22308@io.mds.rmit.edu.au> Message-ID: <36E1639D.FDA85E9C@fiduciary.com> Marcelo Cantos wrote: > Thank you, Walter for the erudite response. I am left in a bit of > quandary as to how or even whether to respond.
This is in large part > due to the fact that, while your post was in response to mine, it is > not immediately clear to me whether you are addressing my comments > specifically or rather the general theme of this thread. Thank you for your kind words. I will confess that much of my post was addressed to the general theme of the thread. > On this point, I must make it quite clear that SIM is _not_ an XML > front end to a data store. It is an XML (etc.) document repository. My naive reading of the SIM materials on your website leads me to this conclusion. I am glad to have your confirmation of it. As a document repository SIM may more nearly compete with the 'grove minder' paradigm than with what I characterize as an XML database. > One additional, crucial point is that SIM _is_ extensible (though I > will qualify this presently). It can be defined to accept markup to > any degree of strictness or laxity (within the bounds of > well-formedness or validity, of course). It can be setup to accept > any and all markup and do _something_ intelligent with it. It can > also be configured to make stringent demands (well in excess of the > DTD, both with respect to strictness and complexity of constraints) of > its inputs. Granted. It is simply that I (perhaps perversely) have defined an XML database engine as one which implements XML markup. My XML database engine is driven by the markup and must rework the effective schema and re-cast its processing behavior in sync with changes to the document instance markup. > Now, by way of qualification, SIM does not provide free-form runtime > extensibility (runtime from the administrator's perspective, not > ours). Rather it provides the application developer with the > requisite tools to define, at design time, what structures will be > supported. 
For instance, you cannot, with SIM, perform queries such > as, "find me all sections containing subsections with an attribute of > security="public" and at least one paragraph with fewer than four > words in it" The semantic complexity of such a query is beyond the > scope of our product. However, if one were to know in advance that > queries about the minimum paragraph length in public subsections will > be commonplace in the particular application one is developing, then > SIM could, at design time, be told to create an appropriate index and > then the above query could, indeed, be performed. > > In short, SIM _is_ extensible, but the extensibility is bound somewhat > earlier than runtime. In practice, clients never complain about this > quality. In fact, it is usually a benefit rather than a hindrance, > for the same reason that compile time type checking is a good thing to have in a programming > language. All of these are commendable design decisions. They are not, IMHO, realizations of the unique qualities and potential of XML. On that, reasonable people may differ. > I also take issue with Walter's remark that an XML database should be > manipulated by and defined through the medium of XML. This sounds > analogous to suggesting that relational databases should be defined > and manipulated by markup. No, by relational schema, as you acknowledge in the next line. > Now, it is true that relational schema > are, themselves, typically stored as relations (one will, for example, > find a ".TABLES" table, a ".FIELDS" table, a ".INDEXES" table, etc. > inside a database). However, it seems to me patently absurd to > suggest that SQL (whether DML or DDL) be expressed in terms of tuples > and relations. Now, while it does not seem likewise absurd to suggest > that XML queries and data definition constructs be defined as XML, the > truth of such a suggestion is anything but self-evident. Why should > one not use an SQL-like language to define and query XML databases? 
> There may or may not be merit in such an approach, but it seems no > more or less appropriate than a query/data definition language cast in > XML. Indeed, many of the query language position papers at W3C do not > use XML syntax. Data definition and query languages are > meta-constructs. They are not part of the data, but rather operate on > the data and structures. This suggests that while it may be possible > to fold the system in on itself by expressing meta-structure as data, > it would be unwise to proceed down this path in _a priori_ fashion By following the path indicated by just such an a priori judgment I arrived at the conclusions which I have shared with you. I am implementing the resulting design and, I suppose, the almighty market will render the final verdict. > The serious user of XML does not have a heterogeneous collection of > vaguely defined documents with a motley crew of DTD's and well-formed > markup. That is exactly what I (and my customers, once we re-state their documents in various legacy forms as XML) have to deal with. We process settlements of cross-border trades and the regulatory reporting required by multiple overlapping legal jurisdictions. If I have advice of a trade execution in the customary form used in, say, Djakarta, and the interested parties to whom I must report it are a UK fiduciary, a Swiss depot bank, a US money manager and a Hong Kong broker, as well as the various regulators which the involvement of each of those parties entails, I must (in my opinion) drive the entire process off of a properly marked up document which succinctly expresses the facts of the transaction reported. That document, received by each of the interested parties, must be instantiated in the system--and I would hope the database--of each in a form which may well require re-writing the schema upon which it will be realized. 
> Most users have a well defined data set for which they want > to define efficient structures for storage and retrieval (if they > aren't interested in efficiency then their problem isn't particularly > interesting -- any tool will do). In the few cases where they do have > arbitrary structure to deal with, more often than not they are only > interested in the content and are likely to throw the structure away. As I hope the use case fragment above illustrates, users may have very well defined structures, well-suited to their specific needs. Those structures, however, may not accommodate the instance documents which they receive as input data and which, in the real-world examples I am familiar with, may exhibit differences of data structure on each occasion. Respectfully, Walter Perry From b.laforge at jxml.com Sat Mar 6 20:47:45 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:42 2004 Subject: ModSax Suggestion Message-ID: <003b01be6811$cc5974e0$c9a8a8c0@thing2> Seems like a good fit for filters--drop what you don't want, transform the rest as needed. Bill -----Original Message----- From: David Brownell To: MikeDacon@aol.com Cc: xml-dev@ic.ac.uk Date: Friday, March 05, 1999 5:58 PM Subject: Re: ModSax Suggestion >Interesting suggestion for a big hole in the parts of >the Java API set that are more or less "standard" at >this point -- SAX and DOM. > >One comment though: I've found that it's important to >be able to have options controlling how the DOM tree is >built.
For example, whether to discard ignorable spaces, >or do namespace conformance enforcement, or try to get >CDATA sections (comments, etc). > >Accordingly, I think being able to do a bit more than >this will be important. > >- Dave > > > >MikeDacon@aol.com wrote: >> >> Hi Everyone, >> >> While SAX does a good job as an event-based interface >> to Parsers, it would be nice to add a few methods to >> receive a DOM representation back from a reference to an org.xml.sax.Parser. >> >> Something like: >> >> org.w3c.dom.Document parse(InputSource is, boolean events) throws >> SAXException; >> org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws >> SAXException; >> /* the events boolean would be to turn on/off event calls. */ >> >> If a SAXDriver did not want to produce a DOM, it could either simply >> return a null or a method added like: >> >> boolean isDomCapable(); >> >> The above would let me use the ParserFactory to seamlessly switch >> between Parser implementations and get a DOM tree without building >> one myself. It is fruitless for me to build a DOM tree when almost all >> the parser implementations provide that ability. I just want a way to get >> at that functionality in a simple and standard way (thus SAX). >> >> Thoughts? >> >> - Mike >> ----------------------------------------------- >> Michael C. Daconta >> Author of Java 2 and JavaScript for C/C++ Programmers >> Author of C++ Pointers and Dynamic Memory Management >> Sun Certified Java Programmer and Developer >> http://www.gosynergy.com >> >> xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk >> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >> To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >> (un)subscribe xml-dev >> To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >> subscribe xml-dev-digest >> List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Sat Mar 6 20:57:14 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:42 2004 Subject: XML MULTI-Fragment Interchange? Message-ID: <004801be6813$22d36820$c9a8a8c0@thing2> From: Daniel Veillard > Hum, I have been following the streaming/fragment thread. However I have >the feeling that even multiple fragment body extensions would not solve >the problem you were facing. If I didn't get the discussion wrong, it seems >that you rather tried to make one very big (i.e. stream) document from >multiple sources while the scope of the fragment work was just the opposite, >i.e. how to extract and ship a piece of a very big document. Actually, it sounds to me like the separation of physical and logical layers.
On the one hand, I have some data to move. Multiple documents, multiple fragments, whatever. (logical) On the other hand, I have a stream. It can pass any number of documents or fragments. (physical) The fragments in the stream could be all from one document or from different queries on different documents or from one query applied to a set of documents. It shouldn't matter. And how one might reassemble fragments back into a large document is another problem, though the stream should provide sufficient information to do so. Bill From jjc at jclark.com Sun Mar 7 10:38:43 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:09:42 2004 Subject: New expat test release and FAQ Message-ID: <36E253C5.E4301749@jclark.com> A new expat test release is available at: ftp://ftp.jclark.com/pub/test/expat.zip This adds handlers for namespace declarations; when namespace processing is enabled these provide information about xmlns attributes. This release also fixes a few bugs. I've also started an expat FAQ at: http://www.jclark.com/xml/expatfaq.html Suggestions for additions are welcome. James xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Sun Mar 7 13:02:43 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:42 2004 Subject: ModSax Suggestion Message-ID: Hi Dave, In a message dated 3/5/99 5:40:50 PM Eastern Standard Time, db@eng.sun.com writes: > Interesting suggestion for a big hole in the parts of > the Java API set that are more or less "standard" at > this point -- SAX and DOM. > > One comment though: I've found that it's important to > be able to have options controlling how the DOM tree is > built. For example, whether to discard ignorable spaces, > or do namespace conformance enforcement, or try to get > CDATA sections (comments, etc). > I agree with that. I think all that is possible while still retaining a minimalist design philosophy. Something like: void setDOMFeature(String feature, boolean val); boolean getDOMFeature(String feature); That way, via an extensible common set of text properties, we can add properties as the need arises without expanding the API. Looking forward to progress on the Java XML API. BTW, Dave, are you going to do a "Birds of a Feather" session on XML at this year's JavaOne? I think that could be valuable. Best wishes, - Mike ----------------------------------------------- Michael C. Daconta Author of Java 2 and JavaScript for C/C++ Programmers Author of C++ Pointers and Dynamic Memory Management Sun Certified Java Programmer and Developer http://www.gosynergy.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Sun Mar 7 13:15:25 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:42 2004 Subject: ModSax Suggestion Message-ID: <4d147c5b.36e27b77@aol.com> In a message dated 3/6/99 4:03:25 PM Eastern Standard Time, b.laforge@jxml.com writes: > Seems like a good fit for filters--drop what you don't > want, transform the rest as needed. > I think Bill has brought up an excellent point. In fact, I like that suggestion better than my setFeature() method. It seems to me that the central tension of API design is whether to expand the API or relegate functionality to be handled by a higher-level layer of software. In my original suggestion, getting access to a DOM seems appropriate as part of SAX (a low-level layer), while transforming the resultant tree is relegated to a higher-level layer. While I have certainly written gobs of enterprise-level software, my experience with formal APIs is limited -- does this track with those of you with more API-building experience? Best wishes, - Mike From Mark.Birbeck at iedigital.net Sun Mar 7 16:10:02 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:42 2004 Subject: XML MULTI-Fragment Interchange?
Message-ID: Bill wrote: > From: Daniel Veillard > > Hum, I have been following the streaming/fragment thread. > > However I have the feeling that even multiple fragment body > > extensions would not solve the problem you were facing. If > > I didn't get the discussion wrong, it seems that you rather > > tried to make one very big (i.e. stream) document from > > multiple sources while the scope of the fragment work was > > just the opposite, i.e. how to extract and ship a piece of > > a very big document. > [snip] > And how one might reassemble fragments back into a large > document is another > problem, though the stream should provide sufficient > information to do so. I think Daniel's point is simply that in many situations you may not want to reconstruct the 'large' document. The fragment work seems to relate to providing context to a fragment, such that reasonable work can be done on it. That's not the same - although related - as shipping one great big document in a number of packages. On the theme of multi-fragments, I think the simplest increment from where we are now is to allow for the result set of a query that spans different levels of a tree. I was previously exporting from queries using a simple wrapper, but when I saw the fragment group's work decided to use it with a very slight modification. The change is an obvious one - and I think someone else suggested it on this list the other day - but I wonder if anyone can see any pitfalls. I've enclosed four sets of query results for those who might be interested in approving/criticising my approach. The queries are:
http://[server]/documents/ysArticle[author=Ruth]
http://[server]/documents/ysArticle[author=Ruth]/ArticleText
http://[server]/documents/ysArticle[author=Ruth]/ArticleText/ysText
http://[server]/documents/ysArticle[author=Ruth]/ArticleText/ysText[ID=1]
[Ignore non-quoted stuff, etc., it's still work in progress!]
Although the first few actually return pretty much the same information, they differ in where the division between context and requested data is. The first will return all articles by Ruth in their entirety, and so only needs one 'fragbody' element. The second returns the same data, but the articles themselves are now provided only as context, and the containers of the text become the top level of the fragments. This therefore requires two 'fragbody' elements, since there are two articles by Ruth. (Actually it could be one, but because there's an article between the two that is *not* by Ruth, even though it's not getting returned it messes up my merging code!) The third query is not much different from number two, but pushes one more level of data up into the 'context' information. The final query is the one I'm most interested in getting feedback on, in particular on whether I have the context information right. I think the fragment document is a little ambiguous on what level of detail to put in. Some examples in the doc. do what I have done - put in all siblings of any element that is an ancestor of the ones we're interested in - but one of them doesn't. Of course it is partly application-dependent so I'm not that bothered. Comments? Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net -------------- next part -------------- A non-text attachment was scrubbed... Name: q1.xml Type: application/octet-stream Size: 1565 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q1.obj -------------- next part -------------- A non-text attachment was scrubbed... 
Name: q2.xml Type: application/octet-stream Size: 956 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q2.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: q3.xml Type: application/octet-stream Size: 1277 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q3.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: q4.xml Type: application/octet-stream Size: 3920 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q4.obj
From david at megginson.com Sun Mar 7 23:57:41 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:42 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <14051.3215.196642.22571@localhost.localdomain>

What: Four proposed predefined features for ModSAX
Action: Please read and comment (especially to propose core features I've missed)

Last month, I posted a proposal [1] for a backwards-compatible SAX layer called ModSAX, which will allow parser and filter writers to extend SAX and application writers to discover what extensions exist, all in a well-defined and predictable way.

The relevant part of that interface for this posting is the following method in ModParser (which extends org.xml.sax.Parser):

    public abstract void setFeature (String featureID, boolean state)
        throws SAXNotSupportedException;

The value of featureID will in some way piggyback on DNS, either by using URIs or by using names similar to Java packages. Although people will be allowed (and encouraged) to invent their own features, I'd like to predefine a core set of features for the next SAX release. Here's what I've thought of so far:

1. http://xml.org/sax/features/validation
   True means validate, false means don't validate.

2. http://xml.org/sax/features/external-entities
   True means expand external text entities, false means don't expand external text entities.

3. http://xml.org/sax/features/namespaces
   True means perform namespace processing -- munge element and attribute names and remove namespace declaration attributes -- and false means don't perform namespace processing.

4. http://xml.org/sax/features/unbuffered-input
   True means ensure that the parser does not buffer input from a Reader or InputStream supplied by the application (actually, one-character look-ahead will usually be required); false means do not ensure that the parser does not buffer input. This feature might be useful for reading multiple documents from a single stream.

No SAX parsers will be *required* to support any of these -- they can simply throw a SAXNotSupportedException for any request (as they should for any other unrecognised feature request). The earliest ModSAX parser will probably be a general-purpose SAX 1.0 Parser adapter, and that will certainly not be able to do anything useful with these. Unlike parsers, filters will ordinarily pass unrecognised feature requests on up the chain of responsibility.

Examples
--------

If an application wants to ensure that the SAX parser is performing validation, it can use

    try {
      parser.setFeature("http://xml.org/sax/features/validation", true);
    } catch (SAXNotSupportedException e) {
      // ...
    }

The parser may throw an exception for either of two reasons:

1. it cannot validate; or

2. it does not recognise the property.

If the application wants to determine which of the two is the case, then it can try the following:

    try {
      parser.setFeature("http://xml.org/sax/features/validation", false);
    } catch (SAXNotSupportedException e) {
      // ...
    }

If the parser throws an exception again, then it does not recognise the property name (in other words, it may or may not perform validation, and the application has no way to tell); if the parser does not throw an exception, then it simply does not support validation.

[1] http://www.lists.ic.ac.uk/archives/xml-dev/9902/0627.html
From jjc at jclark.com Mon Mar 8 02:03:56 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features References: <14051.3215.196642.22571@localhost.localdomain> Message-ID: <36E32900.BBDF43C0@jclark.com> David Megginson wrote: > 2. http://xml.org/sax/features/external-entities > True means expand external text entities, false means don't expand > external text entities. I would suggest distinguishing the expansion of external parameter entities (which would include the external DTD subset) from the expansion of external general entities. I can easily imagine wanting to expand external general entities declared in the internal subset, but not wanting to read an external DTD. James xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Mon Mar 8 02:04:22 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features References: <14051.3215.196642.22571@localhost.localdomain> Message-ID: <36E329F5.76A50E09@jclark.com> David Megginson wrote: > The parser may throw an exception for either of two reasons: > > 1. it cannot validate; or > > 2. it does not recognise the property. > > If the application wants to determine which of the two is the case, > then it can try the following: > > try { > parser.setFeature("http://xml.org/sax/features/validation", false); > } catch (SAXNotSupportedException e) { > // ... > } > > If the parser throws an exception again, then it does not recognise > the property name (in other words, it may or may not perform > validation, and the application has no way to tell); if the parser > does not throw an exception, then it simply does not support > validation. Wouldn't it be simpler to throw a different type of exception in these two cases? You could have a SAXNotRecognizedException that extends SAXNotSupportedException, and say that parsers should throw SAXNotRecognizedException when the reason they don't support a feature is that they do not recognize the feature. James xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Mon Mar 8 02:32:27 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <128b4bc2.36e3366b@aol.com> Hi Dave, Before responding to your specific proposal ... I do not understand why you are creating a new interface like ModParser instead of just evolving the Parser interface itself. Personally, while I know full well what it would mean to implement Parser -- a "ModParser" is just plain confusing. Five years from now, someone should not have to know the history of SAX to understand the terminology. Now to the Predefined features... In a message dated 3/7/99 7:13:19 PM Eastern Standard Time, david@megginson.com writes: > What: Four proposed predefined features for ModSAX > Action: Please read and comment (especially to propose core features > I've missed) > > Last month, I posted a proposal [1] for a backwards-compatible SAX > layer called ModSAX, which will allow parser and filter writers to > extend SAX and application writers to discover what extensions exist, > all in a well-defined and predictable way. I like the idea of SAX filters but still feel that you should allow access to a DOM Document if the implementing Parser can supply one. I won't restate the suggestion here as it was covered in a previous email. However; that could greatly simplify a filter-writer's job. 
> > The relevant part of that interface for this posting is the following > method in ModParser (which extends org.xml.sax.Parser): > > public abstract void setFeature (String featureID, boolean state) > throws SAXNotSupportedException; > > The value of featureID will in some way piggyback on DNS, either by > using URIs or by using names similar to Java packages. Although > people will be allowed (and encouraged) to invent their own features, > I'd like to predefine a core set of features for the next SAX > release. Here's what I've thought of so far: Since some finite set of SAX features will not approach a global naming problem, I strongly urge not to use a URI. If a package name scheme is to be used, something like "sax.feature.validation". It would also be nice to provide one word String constants for the standard features. Best wishes, - Mike Daconta (mdaconta@aol.com) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Mon Mar 8 02:56:51 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <001e01be690e$8913fe00$c9a8a8c0@thing2> From: MikeDacon@aol.com >I like the idea of SAX filters but still feel that you should allow >access to a DOM Document if the implementing Parser can supply one. >I won't restate the suggestion here as it was covered in a previous email. >However; that could greatly simplify a filter-writer's job. Well, that might depend on the job of the filter. You may want to use a filter to prune out the parts of the document you are not interested in BEFORE the DOM is built. 
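As a rough illustration of the pruning idea, here is a minimal sketch of a filter that drops a named element and everything inside it before anything downstream (a DOM builder, say) sees the events. It assumes the SAX2 XMLFilterImpl helper and JAXP's SAXParserFactory, both of which postdate this thread; the class name PruneFilter, the helper survivingElements, and the element names are invented for the example. Only element and character events are suppressed here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: drops every element named `pruned`, plus its subtree. */
class PruneFilter extends XMLFilterImpl {
    private final String pruned;
    private int skipDepth = 0; // > 0 while inside a pruned subtree

    PruneFilter(XMLReader parent, String pruned) {
        super(parent);
        this.pruned = pruned;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (skipDepth > 0 || qName.equals(pruned)) {
            skipDepth++; // swallow the event and remember how deep we are
            return;
        }
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (skipDepth > 0) {
            skipDepth--; // closing tag of something we swallowed
            return;
        }
        super.endElement(uri, local, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) {
            super.characters(ch, start, length);
        }
    }

    /** Parses xml through the filter and returns the element names that survive. */
    static List<String> survivingElements(String xml, String pruned) {
        List<String> seen = new ArrayList<>();
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            PruneFilter filter = new PruneFilter(parser, pruned);
            // Stand-in for the real consumer (an application or a DOM builder).
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String u, String l, String qName, Attributes a) {
                    seen.add(qName);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(survivingElements(
                "<doc><keep/><secret><keep/></secret></doc>", "secret"));
    }
}
```

The consumer never learns that the pruned subtree existed, which is the property that matters when the next stage is building a tree in memory.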
In general, I see several places where you might want to use a filter: o Transform events from a parser into something to be output. o Transform events from a parser before being accessed by an application. o Between a parser and the DOM. o Transform events from a DOM walker into something to be output. Note that in the last case, if the DOM walker shares its internal state (position in the DOM tree) with the filters that come after it (using something like MDSAX), we get a lot of XSL-like capabilities. Bill xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Mon Mar 8 04:39:57 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <004b01be691c$f348fc40$c9a8a8c0@thing2> David, I am very much inclined to agree with you that the conservative approach taken in implementing SAX was necessary to its broad acceptance at that time. However, broad acceptance of a SAX upgrade may require a different approach. For one thing, the very success of SAX has itself changed things. The primary requirement is backward compatibility for both parsers and applications. A second requirement is that the upgrade not be conservative, but that it be a significant enhancement from a wide range of perspectives. The upgrade needs to be worth doing, but for more than one reason. Feature negotiation alone is not quite enough. I'm sure you know the kinds of things I'm looking for: o Event objects for one. o A way to specify a filter to a DOM-building-parser is another. o Better integration with the DOM in general. 
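For the second wish in the list above, a filter sitting in front of a DOM-building parser, one possible arrangement (an assumption here, not something proposed in this thread) is to hand the filter to JAXP's identity Transformer as a SAXSource and collect the tree in a DOMResult. The sketch below relies on SAX2 and JAXP classes that postdate this discussion. WhitespaceFilter and parseToDom are hypothetical names, and the filter is deliberately crude: it drops any characters() chunk that is entirely whitespace, which is only reasonable for data-style documents without mixed content.

```java
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: swallows character chunks that contain only whitespace. */
class WhitespaceFilter extends XMLFilterImpl {
    WhitespaceFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        for (int i = start; i < start + length; i++) {
            if (!Character.isWhitespace(ch[i])) {
                super.characters(ch, start, length); // real text: pass it on unchanged
                return;
            }
        }
        // Whitespace-only chunk: do not forward it, so it never reaches the tree.
    }

    /** Parses xml through the filter and returns the DOM built from what is left. */
    static Document parseToDom(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader parser = factory.newSAXParser().getXMLReader();

            // A Transformer created without a stylesheet copies its input to its output.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            DOMResult result = new DOMResult();
            identity.transform(
                    new SAXSource(new WhitespaceFilter(parser),
                            new InputSource(new StringReader(xml))),
                    result);
            return (Document) result.getNode();
        } catch (ParserConfigurationException | SAXException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseToDom("<doc>\n  <item>x</item>\n</doc>");
        // Number of child nodes of <doc> after the indentation has been filtered out.
        System.out.println(doc.getDocumentElement().getChildNodes().getLength());
    }
}
```

Because the filter is just another XMLReader as far as the tree builder is concerned, several filters can be chained without the builder knowing about any of them.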
I'm sure others have their own feature list. We need to define a collection of new capabilities that have wide appeal, together with an implementation strategy which provides full backward compatibility. And for this group, it needs to be something that can be implemented cleanly. I still feel like a newbie here. I wasn't here when SAX was done. But I would hate to see the initiative lost to the traditional standards bodies. As I see it, there are two advantages to doing the work on this list: 1. It is open to individuals. The cost to participate is measured only in the time it takes. 2. This is the world's toughest bunch of critics. The folks here plan to implement the proposals themselves. And any proposal that isn't clean is going to be revised until it can be easily implemented. And as much as the first point is what allows me to participate, it is the second point that is the real winner. A standards body whose participants are largely from large companies has more to gain from a spec that is difficult to implement--it limits the competition. So that's why I'm butting in here. I think an open standards process is important for individuals and small companies. We need to do what we can to keep the ball rolling here. Bill From: David Megginson >What: Four proposed predefined features for ModSAX >Action: Please read and comment (especially to propose core features > I've missed) From shinichiro.hamada at toshiba.co.jp Mon Mar 8 06:57:59 1999 From: shinichiro.hamada at toshiba.co.jp (Shinichiro HAMADA) Date: Mon Jun 7 17:09:43 2004 Subject: Accessing DTD info.
in IE5 Message-ID: <007301be6930$daa8e100$85247385@pv189.ssel.toshiba.co.jp> Hello. >Is DTD information accessable through IE5 DOM? I took is as granted because >I could do it with old MSXML for java used in IE4. However, when I really >wanted to access DTD info in IE5, I couldn't find it from anywhere. Is DTD >information exposed in IE5 DOM? I wonder if what you want to know is IXMLDOMDocument::get_doctype: http://www.microsoft.com/workshop/xml/xmldom/reference/DOMDocument_doctype.a sp or I've misunderstood your question? -- Shinichiro HAMADA xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From johnh at erin.gov.au Mon Mar 8 06:58:56 1999 From: johnh at erin.gov.au (John Hockaday) Date: Mon Jun 7 17:09:43 2004 Subject: Mapping elements in architectural forms Message-ID: <199903080655.RAA21026@eos.erin.gov.au> Hi, I am using architectural forms to map elements from a client document instance of a client DTD to a base document of a base DTD using the SP software by James Clark. The problem is that the structure of the elements and sub-elements in the client document do not exactly match the base DTDs elements and sub-elements and I don't know how to relate this in the mapping DTD. For example, sub-elements "b" and "c" occur in element "a" in the client DTD but in the base DTD sub-elements "b" occur in element "a" but sub-element "c" occurs in element "d". Client Base ====== ==== If I map "a" to "a", "b" to "b" and "c" to "c" in the mapping DTD the parser gives an error that "a" has not been finished and that "c" should not occur here in the base document. 
Does anyone know how I can map the client elements to the base elements in the mapping DTD to fix this problem?

___________________________________________________________________________
John Hockaday - Systems Officer
GPO Box 787          email: johnh@erin.gov.au
Canberra ACT 2601    phone: +61 2 6274 1173
                     fax:   +61 2 6274 1333
Australia            URL:http://www.environment.gov.au/
ERIN     Environmental Resources Information Network     ERIN
___________________________________________________________________________

From wendy.cameron at qr.com.au Mon Mar 8 07:25:52 1999
From: wendy.cameron at qr.com.au (Wendy Cameron)
Date: Mon Jun 7 17:09:43 2004
Subject: XSL Problem
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <00f101be6930$a123feb0$c62b580a@qrail.com.au>

Ok I have

I am trying to select all 3 nodes and order by att1 but display different information depending on what type of node it is.

Does anyone have any idea how I would do this? I have tried

.....

But this doesn't test if the current node is of type nodeType1.

Help!!!

Regards
Wendy

From zmin at atpage.com Mon Mar 8 07:28:47 1999
From: zmin at atpage.com (min zheng)
Date: Mon Jun 7 17:09:43 2004
Subject: Accessing DTD info. in IE5
References: <007301be6930$daa8e100$85247385@pv189.ssel.toshiba.co.jp>
Message-ID: <002d01be6935$e73a66a0$f66f6f0a@atpage>

What I want is the DTD (or Schema) rules telling me what nodes are allowed in an element. The get_doctype method only gives the doctype declaration. There is no way (as far as I know) to access element rules from there.

Thanks anyway,
Min

----- Original Message -----
From: Shinichiro HAMADA
To:
Sent: Sunday, March 07, 1999 10:56 PM
Subject: RE: Accessing DTD info. in IE5

> Hello.
>
> >Is DTD information accessible through IE5 DOM? I took it for granted because
> >I could do it with old MSXML for java used in IE4. However, when I really
> >wanted to access DTD info in IE5, I couldn't find it from anywhere. Is DTD
> >information exposed in IE5 DOM?
>
> I wonder if what you want to know is IXMLDOMDocument::get_doctype:
>
> http://www.microsoft.com/workshop/xml/xmldom/reference/DOMDocument_doctype.asp
>
> or have I misunderstood your question?
>
> --
> Shinichiro HAMADA

From larsga at ifi.uio.no Mon Mar 8 10:29:39 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <14051.3215.196642.22571@localhost.localdomain>
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID:

* David Megginson
|
| The value of featureID will in some way piggyback on DNS, either by
| using URIs or by using names similar to Java packages.

I think we should use package-like names. Using protocol prefixes seems to me both potentially confusing and slightly obfuscating, and I don't see the merit in it over a package-like scheme. I much prefer org.xml.sax.features.validation over http://xml.org/sax/features/validation.

| 2. http://xml.org/sax/features/external-entities

I agree with James that separating general entities and parameter entities is a good idea.

| 4. http://xml.org/sax/features/unbuffered-input

I'm not sure I see the merit of this. Maybe we should skip this?

A suggestion of my own:

  org.xml.sax.features.catalog

True means read the default catalog file, whether that is located via an environment variable, a Java property or something else. OpenXML, XML Parser for Java (xml4j) and xmlproc already support catalogs, and might find this useful. xmlproc certainly will.

| No SAX parsers will be *required* to support any of these -- they
| can simply throw a SAXNotSupportedException for any request

I also agree with James that a separate unrecognized-exception is a good idea.

| Unlike parsers, filters will ordinarily pass unrecognised feature
| requests on up the chain of responsibility.
Good point. This implies that filters need references in both directions, that is, both to the event source and to the event receiver, thus resolving a question that was previously discussed here.

| [1] http://www.lists.ic.ac.uk/archives/xml-dev/9902/0627.html

Hmmm. Wouldn't this reference be more correct?

--Lars M.

From larsga at ifi.uio.no Mon Mar 8 10:40:23 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <004b01be691c$f348fc40$c9a8a8c0@thing2>
References: <004b01be691c$f348fc40$c9a8a8c0@thing2>
Message-ID:

* Bill la Forge
|
| The upgrade needs to be worth doing, but for more than one reason.

I agree that it needs to be worth doing, but to me what has been proposed here certainly sounds like it is enough. (Remember, parameter setting, handler extensibility, filters, namespaces, lexical information and DTD information are probably all in the pipeline.)

| I'm sure you know the kinds of things I'm looking for:
| o Event objects for one.

On this point I agree with what David will probably say: this belongs on a higher level. If you want this functionality, make a value-adding layer on top of SAX 1.1. There's no loss in that, since you can implement this once for all SAX-aware parsers with hardly any performance penalties. (This is why I agree with David: this is the kind of benefit that being ultra low-level buys us.)

| o A way to specify a filter to a DOM-building-parser is another.

We certainly need this, but I don't see how this can usefully be part of SAX. SAX is at a lower level than the DOM and so should certainly be designed for a DOM layer to fit nicely on top, but there should be no dependencies, I think. In other words, this is something that either the DOM or the parsers will have to deal with in a sensible fashion. Taking a ModParser as an argument to DOM building would perhaps be the best way to do this.

However, I don't see the harm in someone sitting down to write a recommendation to DOM parser writers for how to do this and why it's useful.

| So that's why I'm butting in here. I think an open standards process
| is important for individuals and small companies. We need to do what
| we can to keep the ball rolling here.

We are certainly in heartfelt agreement here. :)

--Lars M.

From david at megginson.com Mon Mar 8 11:30:32 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <128b4bc2.36e3366b@aol.com>
References: <128b4bc2.36e3366b@aol.com>
Message-ID: <14051.45935.800104.922834@localhost.localdomain>

MikeDacon@aol.com writes:

> I like the idea of SAX filters but still feel that you should allow
> access to a DOM Document if the implementing Parser can supply one.
> I won't restate the suggestion here as it was covered in a previous
> email. However; that could greatly simplify a filter-writer's job.

I have an idea for how we can handle that (and other, similar problems), but I'll cover it in a separate posting (it's still brewing a bit).

> Since some finite set of SAX features will not approach a global naming
> problem, I strongly urge not to use a URI.

I disagree here -- if third parties want to be able to define feature names, they need a way to avoid collision (i.e. we want to make certain that both Oracle and Sun can define properties like 'normalize' without blowing up the whole system).

That said, the Java package naming scheme also provides DNS-based uniqueness, as in 'org.xml.sax.features.validation'. It's simply a matter of taste:

- org.xml.sax.features.validation is more of a Java flavour.

- http://xml.org/sax/features/validation is more of an XML/Namespaces flavour

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Mon Mar 8 11:34:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <004b01be691c$f348fc40$c9a8a8c0@thing2>
References: <004b01be691c$f348fc40$c9a8a8c0@thing2>
Message-ID: <14051.46235.905949.308401@localhost.localdomain>

Bill la Forge writes:

> The upgrade needs to be worth doing, but for more than one
> reason. Feature negotiation alone is not quite enough.

Yes, but my original proposal was not limited to feature negotiation -- it also included the ability to add and negotiate new handler types at runtime. People will upgrade because they want to use the new handlers that are implemented with ModSAX, not because of any elegance or inelegance in ModSAX itself.
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From Michael.Kay at icl.com Mon Mar 8 11:41:43 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <93CB64052F94D211BC5D0010A80013310EB35F@wwmessd3.bra01.icl.co.uk>

> What: Four proposed predefined features for ModSAX
> Action: Please read and comment (especially to propose core features
>         I've missed)
>

Could I add a plea for another optional feature:

  http://xml.org/sax/features/normalisePCDATA

whose effect is to ensure that successive calls to supply character data are combined into a single call. The reason for this is that it's very common for applications to assume the parser won't split character data, an incorrect assumption but one that will survive most testing.

Actually I think using "http://" names for things that have nothing to do with HTTP protocol is very bad form. (Apart from anything else, my mail client encourages me to click on them to see what's there.) "org.xml.sax.features.normalisePCDATA" is much more sensible. If you want a URN, choose a protocol name other than http.

Another rather trivial convenience feature I'd like added to SAX is the ability for InputSource to accept a File (as well as a URL, etc), though the need for this has declined with Java 2.

Mike Kay

From david at megginson.com Mon Mar 8 11:42:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E32900.BBDF43C0@jclark.com>
References: <14051.3215.196642.22571@localhost.localdomain> <36E32900.BBDF43C0@jclark.com>
Message-ID: <14051.46547.366706.485764@localhost.localdomain>

James Clark writes:

> I would suggest distinguishing the expansion of external parameter
> entities (which would include the external DTD subset) from the
> expansion of external general entities. I can easily imagine
> wanting to expand external general entities declared in the
> internal subset, but not wanting to read an external DTD.

I agree. Here's the new core feature list:

  http://xml.org/sax/features/validation
  http://xml.org/sax/features/external-general-entities
  http://xml.org/sax/features/external-parameter-entities
  http://xml.org/sax/features/namespaces
  http://xml.org/sax/features/unbuffered-input

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Mon Mar 8 11:43:26 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E329F5.76A50E09@jclark.com>
References: <14051.3215.196642.22571@localhost.localdomain> <36E329F5.76A50E09@jclark.com>
Message-ID: <14051.46946.186235.431488@localhost.localdomain>

James Clark writes:

> Wouldn't it be simpler to throw different type of exception in these two
> cases? You could have a SAXNotRecognizedException that extends
> SAXNotSupportedException, and say that parsers should throw
> SAXNotRecognizedException when the reason they don't support a feature
> is that they do not recognize the feature.

Yes, I agree.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Mon Mar 8 11:51:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <14051.46670.687235.664451@localhost.localdomain>

What: Additions to ModParser interface

I'm proposing a couple of additions to the ModParser interface:

  public interface ModParser extends Parser
  {
    public abstract void setFeature (String featureID, boolean state)
      throws SAXNotSupportedException;
    public abstract void setHandler (String handlerID, ModHandler handler)
      throws SAXNotSupportedException;
    public abstract void set (String infoID, Object prop)
      throws SAXNotSupportedException;
    public abstract Object get (String infoID)
      throws SAXNotSupportedException;
  }

These allow you to do interesting things like

  parser.set("http://www.foo.com/props/textfilter", filter);

or

  try {
    // get() returns Object, so a cast is needed here
    Node node = (Node) parser.get("http://xml.org/sax/props/dom-node");
  } catch (SAXNotRecognizedException e1) {
    // doesn't know about DOM processing...
  } catch (SAXNotSupportedException e2) {
    // knows about DOM processing, but not doing it...
  }

Again, it's a little sloppy as an interface, but it's beautifully extensible and it supports filters nicely (if there are other filters between the DOM iterator and the application, it will still work).

Note that strictly speaking, now, setHandler() and setFeature() are no longer primitives, since they could both be implemented in terms of set(), but I think that the extra type checking is worthwhile in those cases.
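
The relationship between the generic set()/get() pair and a type-checked setFeature() can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names (PropertySketch and the two exception stand-ins) are hypothetical and are not the org.xml.sax classes, and the choice of which IDs are recognised or supported is invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy "parser": one writable boolean feature, one property it
// recognises but cannot supply, and everything else unrecognised.
class PropertySketch {
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String DOM_NODE = "http://xml.org/sax/props/dom-node";

    private final Map<String, Object> values = new HashMap<>();

    // Generic setter: accepts any Object, so it must check IDs and types itself.
    void set(String infoID, Object prop) throws NotSupportedException {
        if (DOM_NODE.equals(infoID)) {
            throw new NotSupportedException("recognised but read-only: " + infoID);
        }
        if (!VALIDATION.equals(infoID)) {
            throw new NotRecognizedException(infoID);
        }
        if (!(prop instanceof Boolean)) {
            throw new NotSupportedException("expected a Boolean for " + infoID);
        }
        values.put(infoID, prop);
    }

    // Generic getter: the caller has to cast the result.
    Object get(String infoID) throws NotSupportedException {
        if (VALIDATION.equals(infoID)) {
            return values.getOrDefault(infoID, Boolean.FALSE);
        }
        if (DOM_NODE.equals(infoID)) {
            throw new NotSupportedException("no DOM available: " + infoID);
        }
        throw new NotRecognizedException(infoID);
    }

    // Typed convenience layered on set(): the compiler enforces the boolean.
    void setFeature(String featureID, boolean state) throws NotSupportedException {
        set(featureID, Boolean.valueOf(state));
    }

    // Catch the more specific exception first, as in the example above.
    static String probe(PropertySketch parser, String infoID) {
        try {
            return String.valueOf(parser.get(infoID));
        } catch (NotRecognizedException e) {
            return "not recognized";
        } catch (NotSupportedException e) {
            return "not supported";
        }
    }

    static List<String> demo() {
        PropertySketch parser = new PropertySketch();
        try {
            parser.setFeature(VALIDATION, true);
        } catch (NotSupportedException e) {
            throw new IllegalStateException(e);
        }
        List<String> out = new ArrayList<>();
        out.add(probe(parser, VALIDATION));
        out.add(probe(parser, DOM_NODE));
        out.add(probe(parser, "http://www.foo.com/props/textfilter"));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Stand-in for the "knows the ID but cannot honour it" case.
class NotSupportedException extends Exception {
    NotSupportedException(String message) { super(message); }
}

// Stand-in for the "does not know the ID at all" case, modelled as a subclass.
class NotRecognizedException extends NotSupportedException {
    NotRecognizedException(String message) { super(message); }
}
```

Because the unrecognised case is a subclass, a caller that does not care about the distinction can catch only the broader exception, while one that does care lists the narrower catch first.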
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Mon Mar 8 11:54:40 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB35F@wwmessd3.bra01.icl.co.uk>
References: <93CB64052F94D211BC5D0010A80013310EB35F@wwmessd3.bra01.icl.co.uk>
Message-ID: <14051.47534.65569.354415@localhost.localdomain>

Kay Michael writes:

> Could I add a plea for another optional feature:
> http://xml.org/sax/features/normalisePCDATA

Yes, this is especially useful for building a DOM as well. I've added it to the list of core features:

  http://xml.org/sax/features/validation
  http://xml.org/sax/features/external-general-entities
  http://xml.org/sax/features/external-parameter-entities
  http://xml.org/sax/features/namespaces
  http://xml.org/sax/features/unbuffered-input
  http://xml.org/sax/features/normalize-text

Remember that parsers will not be required to support any of these.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From tug at wilson.co.uk Mon Mar 8 12:33:14 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <073f01be695f$d8dc04e0$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: David Megginson
To: XML Developers' List
Sent: 07 March 1999 23:56
Subject: SAX RFD: ModSAX Predefined Features

>What: Four proposed predefined features for ModSAX
>Action: Please read and comment (especially to propose core features
>        I've missed)
>
>Last month, I posted a proposal [1] for a backwards-compatible SAX
>layer called ModSAX, which will allow parser and filter writers to
>extend SAX and application writers to discover what extensions exist,
>all in a well-defined and predictable way.

It seems to me that there are two kinds of parser extensions:

1/ those that are static (i.e. must be established before the parser is used)

2/ those that are dynamic (i.e. they can be changed on the fly)

An example of a static extension would be buffering. If the parser is buffering input then it is infeasible to change to unbuffered input in the middle of parsing the text. Switching from non-validating to validating is problematic; insisting that a parser be able to do this would probably add unacceptable overhead to the non-validating mode.

I would suggest that the bulk of the extensions should be specified to the parserFactory and only a *very* limited number (if any at all) be specified to the instance of Parser.

I would very much like a getFeature function which returns a value telling me if the feature is set or not.
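
One way to picture the static case is a factory that collects features before any parser exists and hands out parsers whose settings can only be queried. This is a minimal sketch, assuming hypothetical names (FeatureFactorySketch, ConfiguredParser, with()); nothing here is part of SAX or of any existing parser.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical factory: static features are fixed at construction time.
class FeatureFactorySketch {
    private final Map<String, Boolean> features = new HashMap<>();

    // Record a feature request; returns this so calls can be chained.
    FeatureFactorySketch with(String featureID, boolean state) {
        features.put(featureID, state);
        return this;
    }

    // Each parser gets its own snapshot of the settings.
    ConfiguredParser newParser() {
        return new ConfiguredParser(new HashMap<>(features));
    }

    // Hypothetical parser stand-in: features can be read but never changed.
    static final class ConfiguredParser {
        private final Map<String, Boolean> features;

        ConfiguredParser(Map<String, Boolean> features) {
            this.features = Collections.unmodifiableMap(features);
        }

        // TRUE or FALSE for a feature that was configured, null for an unknown one.
        Boolean getFeature(String featureID) {
            return features.get(featureID);
        }
    }

    public static void main(String[] args) {
        String validation = "http://xml.org/sax/features/validation";
        FeatureFactorySketch factory = new FeatureFactorySketch().with(validation, true);
        ConfiguredParser parser = factory.newParser();
        factory.with(validation, false); // affects later parsers only
        System.out.println(parser.getFeature(validation));
        System.out.println(factory.newParser().getFeature(validation));
        System.out.println(parser.getFeature("http://www.foo.com/props/textfilter"));
    }
}
```

The snapshot copy is the design point: once a parser has been handed out, nothing done to the factory can switch its features mid-parse.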
I'm also not very keen on the use of strings to specify the features. How about using instances of classes:

in org.xml.sax

  public abstract class Feature {
    public Feature(boolean state) {
      this.state = state;
    }

    // public so that callers outside org.xml.sax can read it
    public final boolean state;
  }

  public final class Validation extends Feature {
    public Validation(boolean state) {
      super(state);
    }
  }

Individual parser implementations would then be free to add their own extensions defined by classes that subclass org.xml.sax.Feature - they could also contain parameters.

setFeature would then take a single Feature parameter:

  xxx.setFeature(new org.xml.sax.Validation(true));

getFeature would take a Class parameter and return an instance of the class or null if the feature was unrecognised.

  org.xml.sax.Feature f = xxx.getFeature(org.xml.sax.Validation.class);

  if (f == null) {
    // not supported
  } else if (f.state) {
    // supported and switched on.
  }

Non-Java implementations would probably have to use a string instead of the Class parameter.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From Daniel.Brickley at bristol.ac.uk Mon Mar 8 13:13:13 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <14051.45935.800104.922834@localhost.localdomain>
Message-ID:

On Mon, 8 Mar 1999, David Megginson wrote:

> MikeDacon@aol.com writes:
> > Since some finite set of SAX features will not approach a global naming
> > problem, I strongly urge not to use a URI.
>
> I disagree here -- if third parties want to be able to define feature
> names, they need a way to avoid collision (i.e. we want to make
> certain that both Oracle and Sun can define properties like
> 'normalize' without blowing up the whole system).
>
> That said, the Java package naming scheme also provides DNS-based
> uniqueness, as in 'org.xml.sax.features.validation'. It's simply a
> matter of taste:
>
> - org.xml.sax.features.validation is more of a Java flavour.

Yep... but might not feel so natural for developers working with versions of SAX translated for Perl, Python and so on.

> - http://xml.org/sax/features/validation is more of an XML/Namespaces
> flavour

...and RDF [1]. Giving interesting entities URIs makes them more fully a part of the Web, and means we can take advantage of URI-oriented metadata. E.g. you might search a software database for resources that were of type 'Perl Module' and that implemented the feature known as 'http://xml.org/sax/features/validation'. (There's already a Linux Packages Database[2] along similar lines...).
I'm not claiming that this would be impossible using the Java naming scheme, just that a Web oriented approach might make it easier to do certain things...

Dan

[1] http://www.w3.org/TR/REC-rdf-syntax
[2] http://rpmfind.net/linux/rpmfind/

--
Daniel.Brickley@bristol.ac.uk
Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/
University of Bristol, Bristol BS8 1TN, UK. phone:+44(0)117-9288478

From b.laforge at jxml.com Mon Mar 8 13:39:47 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:44 2004
Subject: ModSAX addition, general property query
Message-ID: <008c01be6967$da059720$c9a8a8c0@thing2>

From: David Megginson
> public abstract void set (String infoID, Object prop)
>   throws SAXNotSupportedException;
>
> public abstract Object get (String infoID)
>   throws SAXNotSupportedException;

David,

OK, this is more like it! You have now defined an interface which is broad enough to fit all of MDSAX under.

Remember that filters also implement the parser interface. And so do DOMWalkers.

Bill

From MikeDacon at aol.com Mon Mar 8 13:43:01 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID:

Hi Bill,

In a message dated 3/7/99 10:03:55 PM Eastern Standard Time, b.laforge@jxml.com writes:

> Well, that might depend on the job of the filter. You may want to use a
> filter
> to prune out the parts of the document you are not interested in BEFORE
> the DOM is built.

I agree with you. I was not saying that access to the DOM was the only way to write a filter. Just that filters can be based on walking a DOM Document tree as you state below.

>
> In general, I see several places where you might want to use a filter:
>
> o Transform events from a parser into something to be output.
>
> o Transform events from a parser before being accessed by an
> application.
>
> o Between a parser and the DOM.
>
> o Transform events from a DOM walker into something to be output.
>

Best wishes,

- Mike (mdaconta@aol.com)

From MikeDacon at aol.com Mon Mar 8 14:54:21 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <18b603b2.36e3e337@aol.com>

Hi David,

In a message dated 3/8/99 9:10:40 AM Eastern Standard Time, david@megginson.com writes:

> What: Additions to ModParser interface
>
> I'm proposing a couple of additions to the ModParser interface:
>
> public interface ModParser extends Parser
> {
>   public abstract void setFeature (String featureID, boolean state)
>     throws SAXNotSupportedException;
>   public abstract void setHandler (String handlerID, ModHandler handler)
>     throws SAXNotSupportedException;
>   public abstract void set (String infoID, Object prop)
>     throws SAXNotSupportedException;
>   public abstract Object get (String infoID)
>     throws SAXNotSupportedException;
> }
>
> These allow you to do interesting things like
>
>   parser.set("http://www.foo.com/props/textfilter", filter);
>
> or
>
>   try {
>     Node node = (Node) parser.get("http://xml.org/sax/props/dom-node");
>   } catch (SAXNotRecognizedException e1) {
>     // doesn't know about DOM processing...
>   } catch (SAXNotSupportedException e2) {
>     // knows about DOM processing, but not doing it...
>   }
>

I think the success of a general set() and get() capability will be based on the creation of a good initial set of descriptors (what you called infoID) to get or set. So, in that vein, I have 2 comments:

1. I still strongly urge not to use a URI for a feature or infoID. These are not resource locations; they are just descriptive strings.
In fact, I bet that most parsers just implement your initial recommended set.

2. I'd recommend that constants be defined in the interface for the initial set of standard features and infoIDs. Something like:

  public static final String VALIDATE = "sax.feature.validation";
  public static final String DOCUMENT = "sax.dom.Document";

Then I can do this:

  try {
    parser.setFeature(ModParser.VALIDATE, true);
  } catch (SAXNotRecognizedException e1) {
    // doesn't know about validation
  } catch (SAXNotSupportedException e2) {
    // Does not support validation
  }

Best wishes,

- Mike (mdaconta@aol.com)

From cowan at locke.ccil.org Mon Mar 8 15:05:47 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <36E3E712.D5556233@locke.ccil.org>

David Megginson wrote:

> public abstract void setFeature (String featureID, boolean state)
>   throws SAXNotSupportedException;

I want to propose a restriction and an extension:

1) This method cannot be called after any other parser method has been invoked.

2) This method is allowed to throw a SAXNewParserException, which encapsulates a replacement parser. The application should use the parser inside the exception in place of the original parser.

This allows parsers to push filters on top of themselves, which complements the ability of applications to push them.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo?
Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Mon Mar 8 15:09:34 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
 <36E32900.BBDF43C0@jclark.com>
Message-ID: <36E3E80B.C3E55F16@locke.ccil.org>

James Clark scripsit:

> I can easily imagine wanting to
> expand external general entities declared in the internal subset, but
> not wanting to read an external DTD.

Or, indeed, the converse: I might want to get the whole DTD but make
my own decisions about loading external general entities.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo?
Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
From david at megginson.com Mon Mar 8 15:15:47 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <18b603b2.36e3e337@aol.com>
References: <18b603b2.36e3e337@aol.com>
Message-ID: <14051.59370.316671.640337@localhost.localdomain>

MikeDacon@aol.com writes:

 > 1. I still strongly urge not to use a URI for a feature or infoID.
 > These are not resource locations they are just a descriptive
 > string. In fact, I bet that most parsers just implement your
 > initial recommended set.

Yes, but what about filters that perform specialised actions? And
what about adding support (stable or experimental) for new XML-related
features like schemas, datatyping, and linking as they become
available?

The problem with SAX 1.0 is that it froze the XML status quo of about
a year ago, and many interesting things have happened since then; with
ModSAX, I'd like to leave the API open for two reasons:

1. so that we can extend it without breaking existing
   implementations; and

2. so that people can experiment with different ways of supporting
   new features within the SAX framework.

As I wrote before, it doesn't much matter whether we use Java property
names incorporating domain names (like
'org.xml.sax.features.validation') or URIs (like
'http://xml.org/sax/features/validation'), as long as we have the
ability for people to create new names without fear of collision.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
From cowan at locke.ccil.org Mon Mar 8 15:18:11 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <004b01be691c$f348fc40$c9a8a8c0@thing2>
Message-ID: <36E3E967.F0D6690B@locke.ccil.org>

Bill la Forge wrote:

> o Event objects for one.

But event objects are very easy to build on top of the existing SAX.
Just do it!

> o A way to specify a filter to a DOM-building-parser is another.
> o Better integration with the DOM in general.

The chief problem here is that SAX doesn't provide all the information
that a DOM builder needs, notably the default value of attributes.

> I'm sure others have their own feature list.

If we can standardize feature control, then feature lists can be
implemented in parsers or parser filters.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo?
Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
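
The remark that event objects are easy to build on top of the existing
SAX can be illustrated with a small adapter. This is only a sketch under
stated assumptions: SaxEvent and EventCollector are hypothetical names
that appear nowhere in the thread, and the adapter is written against the
SAX 1.0 HandlerBase class as shipped (deprecated) in current JDKs. It
records element and character callbacks only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;

// Hypothetical adapter: records each SAX 1.0 callback as a SaxEvent object.
class EventCollector extends HandlerBase {
    final List<SaxEvent> events = new ArrayList<>();

    @Override
    public void startElement(String name, AttributeList attributes) {
        events.add(new SaxEvent(SaxEvent.Kind.START_ELEMENT, name));
    }

    @Override
    public void endElement(String name) {
        events.add(new SaxEvent(SaxEvent.Kind.END_ELEMENT, name));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        events.add(new SaxEvent(SaxEvent.Kind.CHARACTERS, new String(ch, start, length)));
    }

    // Parses a document held in a string and returns the recorded events.
    static List<SaxEvent> collect(String xml) {
        EventCollector collector = new EventCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), collector);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return collector.events;
    }

    public static void main(String[] args) {
        System.out.println(collect("<a><b/></a>"));
    }
}

// Hypothetical event type; SAX 1.0 itself defines callbacks, not event objects.
final class SaxEvent {
    enum Kind { START_ELEMENT, END_ELEMENT, CHARACTERS }

    final Kind kind;
    final String value; // element name, or the text for CHARACTERS

    SaxEvent(Kind kind, String value) {
        this.kind = kind;
        this.value = value;
    }

    @Override
    public String toString() {
        return kind + "(" + value + ")";
    }
}
```

Because the adapter is an ordinary document handler, it needs no change
to the parser, which is the sense in which event objects sit on top of
SAX and do not have to be part of it.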
From jtauber at jtauber.com Mon Mar 8 15:26:24 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:44 2004
Subject: URIs for features (was Re: SAX RFD: ModSAX Predefined Features)
Message-ID: <01c501be6977$3a60ce00$0300000a@othniel.cygnus.uwa.edu.au>

>...and RDF [1]. Giving interesting entities URIs makes them more
>fully a part of the Web, and means we can take advantage of
>URI-oriented metadata. Eg. you might search a software database for
>resources that were of type 'Perl Module' and that implemented the
>feature known as 'http://xml.org/sax/features/validation'. (There's
>already a Linux Packages Database[2] along similar lines...). I'm not
>claiming that this would be impossible using the Java naming scheme,
>just that a Web oriented approach might make it easier to do certain
>things...

I wonder if this could be extended to more general features of XML
software, not just SAX parsers. I wouldn't mind trying this out with
XMLSOFTWARE.COM (http://www.xmlsoftware.com/).

One of the problems I have is finding a canonical form for feature
values of XML software, such as platform. URIs might provide just the
solution. A Java 2 XSL processor conforming to the WD-xsl from 16th
December 1998 might be specified in terms of

  http://java.sun.com/products/jdk/1.2/

and

  http://www.w3.org/TR/1998/WD-xsl-19981216

Actually, now that I think of it, we already have namespaces for
content. They are called notations. There seems to be some link here.

James
From costello at mitre.org Mon Mar 8 15:56:06 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun 7 17:09:44 2004
Subject: Architectural Forms Questions
References:
Message-ID: <36E3F30C.F6D6DB51@mitre.org>

Hi Folks,

We have some beginner's questions on Architectural Forms. The
motivation for this message is our interest in creation, discovery,
sharing and reuse of mappings.

- How powerful is the correspondence that you can express with
  Architectural Forms? Is it essentially limited to renaming and
  omission?

- In addition to using Architectural Forms to express correspondences
  that are known a priori, could you use them to document mappings
  that are discovered "on-the-fly" by modifying a document or DTD
  after a mapping is discovered?

- It appears to be the case that the correspondence between A and B
  must be documented in a way that keeps the mapping tightly coupled
  to either A or B. Are there any plans to represent the
  correspondence so that it is not tightly coupled to either A or B?

- Is it a correct interpretation to say that Architectural Forms
  represent correspondence by overloading existing language
  constructs?

- Given that subtyping and inheritance have been part of the primary
  XML "schema" proposals, is it likely that XML Architectural Forms
  will be overtaken by advances in the XML schema area?

Thanks.
/Roger
From jmcdonou at library.berkeley.edu Mon Mar 8 17:09:35 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun 7 17:09:45 2004
Subject: Opinions requested
In-Reply-To: <19990306154022.B22308@io.mds.rmit.edu.au>
References: <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>
 <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>
 <36DF4CE1.7F4D3681@simdb.com>
 <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>
Message-ID: <3.0.5.32.19990308090238.00c74c90@library.berkeley.edu>

Thanks for the update on SIM. It's definitely more advanced in its
development than I thought. A few additional comments, and a
clarification:

At 03:40 PM 3/6/1999 +1100, Marcelo Cantos wrote:
>More important than any specifics, however, is the issue of what you
>call a DBMS. To me, a DBMS is a database management system (seems
>painfully obvious, but I think it bears repeating). You may argue
>that a product is not a DBMS if it does not support feature X, and I
>don't entirely disagree. When one talks of a DBMS one is conjuring up
>a certain image in the mind of the listener, and that image may well
>include feature X. To be fair to SIM, however, the essence of a DBMS
>is that it manages a collection of data. If it doesn't support
>transactions, this does not entail that it does not manage data.
>Rather it simply has limits on the way the data is managed (i.e. it
>doesn't manage data as well as one would like).
>
>You clearly believe that transaction support is part of the essence of
>what makes a DBMS. I disagree, indeed, I profoundly disagree.
>There is nothing in the concept of a database that mandates any such
>requirement. Rather I would say that transaction support is an
>important issue for any _good_ DBMS. Likewise for referential
>integrity and concurrency (and, for that matter, support for
>declarative queries, use of indexes, a rich set of fundamental data
>types, etc.). If I recall correctly, dBase III was generally
>acknowledged to be a DBMS though it lacked most of these requirements,
>and could barely even call itself relational!

I agree with all of the above, and I didn't mean to particularly
single out transaction support. In addition to the point you raise
that a DBMS calls to mind a particular set of features (not all of
which need to be present to qualify a system as a DBMS), I'd add that
particular systems are developed based on previous work within a
particular paradigm (oh man, referencing Kuhn before I've even had
coffee -- been a grad student too long) and I see SIM as much more
following in the lineage of IR systems than DBMS systems. I'll grant
there's overlap, and SIM is obviously moving towards a graceful
integration of the two areas, but I'd characterize it as moving from
an IR engine towards a combined IR/DBMS system.

>I guess this all boils down to what's in a name. At the end of the
>day, it is far more important to know what a product does and does not
>do than what you call it.
>

Agreed, but as you mentioned, particular names invoke an understanding
of what a system does/what features it may be expected to support,
etc. While these understandings may overlap from one person to the
next, often they don't, and I think DBMS are an example of an area
where they can mean quite different things to different people.
Hence, the frequency of people saying 'DBMS don't handle SGML/XML'
occurring side by side with people saying 'what, are you crazy? Of
course they do.'

>I am sceptical that any RDBMS vendor can come to the party in terms of
>performance.
>Past attempts to try to force text into a relational,
>table or object based paradigm have not reaped great success (Oracle's
>ConText comes to mind as an example of how forcing a square peg into a
>round hole requires sacrificing the edges of performance). I would be
>surprised if any of the major database vendors would be prepared to
>venture away from their core competency (the relational model) to
>address the performance issues.
>

I share your skepticism, but we can hope. If nothing else, there
appears to be at least the dawnings of an understanding among the
major DBMS vendors that there's a huge market for text
management/retrieval products. Some of the approaches taken by the
object-oriented database folks, like Informix's data blades, struck me
as having promise.

>I strongly disagree that SIM doesn't handle SGML/XML well.

Ah, now here, I'm afraid you're putting words in my mouth. To
clarify, I think SIM handles SGML/XML very well indeed; one of the
best I've seen, in fact. I said I don't think any DBMS handles
SGML/XML well, but I also excluded SIM from the DBMS category. Sorry,
I should have been clearer about that.

From what you've said, though, SIM does appear to be shaping up as a
very interesting IR/DBMS hybrid. The referential integrity hooks are
a very nice plus. I have one piece of advice: promote yourselves
more! :) I looked over the SIM web site before my post, and didn't
see any discussion of the new features you're working on. A few words
about future directions you're exploring for your product would be a
good thing.

Jerome McDonough -- jmcdonou@library.Berkeley.EDU  |  (......)
Library Systems Office, 386 Doe, U.C. Berkeley     |  \ * * /
Berkeley, CA 94720-6000 (510) 642-5168             |  \ <> /
"Well, it looks easy enough...."                   |  \ -- /
SGNORMPF!!! -- From the Famous Last Words file     |   ||||
From elharo at metalab.unc.edu Mon Mar 8 17:17:40 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:09:45 2004
Subject: Java Specification Request for XML
In-Reply-To: <36DD9EA1.2CEE7CEA@eng.sun.com>
Message-ID:

At 12:42 PM -0800 3/3/99, David Brownell wrote:

>The Java Community Process is an open, inclusive process and we
>look forward to the active participation of all interested parties.
>

The process, and its relative openness, is a little more obvious if
you remove the passive voice. Compare this:

>The process goes forward in several steps:
>
>[1] The JSR is presented for comment (as you've seen)
>[2] The JSR is approved (we hope)
>[3] An expert group is formed to write the specification; this
>    begins with a "Call for Experts" (CAFE) to participate.
>[4] The expert group writes a first draft of the specification
>[5] The draft is circulated to all Java technology licensees and
>    Participants in the Java Community Process.
>[6] Comments are collected, read, and responded to by the expert
>    group, resulting in an improved specification.
>[7] The refined specification is then released to the public for
>    comment.
>[8] Comments from the public are collected, read, and responded
>    to by the expert group, resulting in more refinements.
>[9] The final specification is produced by the expert group, along
>    with a reference implementation and compatibility tests.
>

to this:

[1] Sun presents the JSR for comment (as you've seen)
[2] Sun's Process Management Office approves the JSR.
[3] Sun forms an expert group to write the specification; this
    begins with a "Call for Experts" (CAFE) to participate.
    [Sun chooses the leader of the group, who then chooses the
    remainder of the experts.]
[4] The expert group writes a first draft of the specification
[5] Sun circulates the draft to all Java technology licensees and
    Participants in the Java Community Process. [that is, companies
    who have paid Sun thousands of dollars to do this]
[6] The expert group collects, reads, and responds to comments,
    resulting in an improved specification.
[7] Sun releases the refined specification to the public for comment.
[8] The expert group collects, reads, and responds to comments,
    resulting in more refinements.
[9] The expert group produces the final specification, along with
    a reference implementation and compatibility tests.

>The key point is that everyone with internet access will get a
>chance to review and comment on the emerging specification.
>

They can review and comment. There's no promise that anyone will even
listen to their comments, much less act on them.

There are a number of aspects of this "open" process that aren't
mentioned here.

1. It costs between $2,000 (educational) and $5,000 (commercial) to
   participate as an expert.

2. Sun owns the copyright and other intellectual property rights
   related to the spec. As owner, they will not allow derivative
   works they decide are incompatible.

3. Participants in the expert group can't talk about the ongoing
   work with outsiders.

4. Only company employees are allowed to be experts. Freelancers
   like many of those who participated in the development of SAX and
   XML are excluded. This is similar to W3C procedures, but the W3C
   allows exceptions for recognized experts. Sun does not.

To me these alone make it pretty clear that this process is open in
name only. If you're still not convinced, ask yourself these
questions:

1. Can anyone tell Sun No?
   Can anyone keep Sun from putting something into the spec that it
   wants to put in? Or put something in that Sun wants to keep out?

2. Can Sun's enemies (i.e. Microsoft, HP, etc.) participate in this
   process on an equal footing with Sun? Can they even participate
   at all?

Bottom line: The openness of this process is PR, pure and simple.
When you actually read the fine print, all Sun does is agree to let
other companies contribute their time, money, and knowledge to help
Sun do what it wants to do anyway. That may be intelligent business,
but it's not an open, community based process for developing
standards.

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML: Extensible Markup Language (IDG Books 1998)          |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/  |
|  Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/     |
+----------------------------------+---------------------------------+
From david at megginson.com Mon Mar 8 19:52:05 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E3E712.D5556233@locke.ccil.org>
References: <14051.3215.196642.22571@localhost.localdomain>
 <36E3E712.D5556233@locke.ccil.org>
Message-ID: <14052.10627.837114.651600@localhost.localdomain>

John Cowan writes:

 > David Megginson wrote:
 >
 > > public abstract void setFeature (String featureID, boolean state)
 > >   throws SAXNotSupportedException;
 >
 > I want to propose a restriction and an extension:
 >
 > 1) This method cannot be called after any other parser method
 > has been invoked.

Wouldn't it be better to allow the parser/filter to make that
decision? If the user attempts to change something during a parse
that should *not* be changed during a parse, the parser/filter can
throw a SAXNotSupportedException.

 > 2) This method is allowed to throw a SAXNewParserException, which
 > encapsulates a replacement parser. The application should use
 > the parser inside the exception in place of the original parser.
 > This allows parsers to push filters on top of themselves, which
 > complements the ability of applications to push them.

I think that this could be layered on top of SAX, simply by
subclassing SAXNotSupportedException.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
From tomh at thinlink.com Mon Mar 8 22:02:37 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
References: <18b603b2.36e3e337@aol.com>
 <14051.59370.316671.640337@localhost.localdomain>
Message-ID: <36E44898.CB8E18C4@thinlink.com>

David Megginson wrote:

> As I wrote before, it doesn't much matter whether we use Java property
> names incorporating domain names (like
> 'org.xml.sax.features.validation') or URIs (like
> 'http://xml.org/sax/features/validation'), as long as we have the
> ability for people to create new names without fear of collision.

I would also urge against using an http: URI since it is not meant
that a resource actually be retrieved using the http protocol.
From cowan at locke.ccil.org Mon Mar 8 22:12:55 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
 <36E3E712.D5556233@locke.ccil.org>
 <14052.10627.837114.651600@localhost.localdomain>
Message-ID: <36E44B40.4303A066@locke.ccil.org>

David Megginson wrote:

> Wouldn't it be better to allow the parser/filter to make that decision?

Yes.

> > 2) This method is allowed to throw a SAXNewParserException, which
> > encapsulates a replacement parser. The application should use
> > the parser inside the exception in place of the original parser.
> > This allows parsers to push filters on top of themselves, which
> > complements the ability of applications to push them.
>
> I think that this could be layered on top of SAX, simply by
> subclassing SAXNotSupportedException.

Yes, but by making it part of the core SAX protocol for setting
features, we guarantee universal support for it. A parser that knows
itself to be naive about namespaces can load the NamespaceFilter and
push it on top of itself, almost transparently to the application.
Otherwise, every application that wants namespace support needs
specialized knowledge about how to recover from
SAXNotSupportedException.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo?
Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
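
The replacement-parser idea debated above can be sketched roughly as
follows. Everything here other than SAXNotSupportedException is an
assumption made for illustration: FeatureParser stands in for the
proposed ModParser, SAXNewParserException is the exception as proposed
(it was only a suggestion on the list), and NaiveParser, NamespaceFilter,
ParserSetup and the feature ID string are hypothetical. The sketch
subclasses SAXNotSupportedException, as suggested in the reply, so an
application that knows nothing about the new exception still sees an
ordinary "not supported" failure.

```java
import org.xml.sax.SAXNotSupportedException;

// Hypothetical application-side helper: keep using whichever parser results.
class ParserSetup {
    // Illustrative feature ID, assumed for this sketch.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    static FeatureParser configure(FeatureParser parser, String featureID, boolean state)
            throws SAXNotSupportedException {
        try {
            parser.setFeature(featureID, state);
            return parser;
        } catch (SAXNewParserException e) {
            // The parser pushed a filter on top of itself; switch to it.
            return e.getParser();
        }
    }

    // Convenience wrapper for the demonstration below.
    static String demo() {
        try {
            return configure(new NaiveParser(), NAMESPACES, true).describe();
        } catch (SAXNotSupportedException e) {
            return "not supported: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Hypothetical minimal stand-in for the proposed ModParser interface.
interface FeatureParser {
    void setFeature(String featureID, boolean state) throws SAXNotSupportedException;
    String describe();
}

// The proposed exception, layered on SAXNotSupportedException.
class SAXNewParserException extends SAXNotSupportedException {
    private final transient FeatureParser replacement;

    SAXNewParserException(String message, FeatureParser replacement) {
        super(message);
        this.replacement = replacement;
    }

    FeatureParser getParser() {
        return replacement;
    }
}

// A parser that is naive about namespaces and wraps itself in a filter.
class NaiveParser implements FeatureParser {
    public void setFeature(String featureID, boolean state) throws SAXNotSupportedException {
        if (ParserSetup.NAMESPACES.equals(featureID) && state) {
            throw new SAXNewParserException("use the replacement parser", new NamespaceFilter(this));
        }
        throw new SAXNotSupportedException(featureID);
    }

    public String describe() {
        return "naive";
    }
}

// Filter that handles the namespace feature and delegates everything else.
class NamespaceFilter implements FeatureParser {
    private final FeatureParser inner;

    NamespaceFilter(FeatureParser inner) {
        this.inner = inner;
    }

    public void setFeature(String featureID, boolean state) throws SAXNotSupportedException {
        if (!ParserSetup.NAMESPACES.equals(featureID)) {
            inner.setFeature(featureID, state);
        }
    }

    public String describe() {
        return "namespace-filter(" + inner.describe() + ")";
    }
}
```

The trade-off argued in the thread shows up in configure(): if the
recovery step is not part of the core protocol, every application has to
carry its own copy of that catch block.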
From MikeDacon at aol.com Mon Mar 8 22:27:50 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
Message-ID:

Hi David,

In a message dated 3/8/99 12:19:24 PM Eastern Standard Time,
david@megginson.com writes:

> Yes, but what about filters that perform specialised actions? And
> what about adding support (stable or experimental) for new XML-related
> features like schemas, datatyping, and linking as they become
> available?

You are absolutely right that extensibility is important. And, as you
also stated, both naming schemes provide that ability.

> As I wrote before, it doesn't much matter whether we use Java property
> names incorporating domain names (like
> 'org.xml.sax.features.validation') or URIs (like
> 'http://xml.org/sax/features/validation'), as long as we have the
> ability for people to create new names without fear of collision.

Why do you need a domain name in there? I think one Parser/Filter
implementor would be loath to implement another company's feature name
if it had sun.com or microsoft.com in it. That was the chief problem
that developers had with Sun naming the Swing package com.sun.swing.

I thought your features would have a single root tree like:

  sax.feature

So that all features would be:

  sax.feature.whatever.myfeature

as well as

  sax.props (for properties)

Now, I understand the domain name being in there is a piggyback off
of DNS. But, I still believe that functional features (of both Parser
and Filters) are a finite domain -- whereas the web is not.
That is why I don't see the correlation between this feature set and
XML namespaces. If you agree that features and props are a finite
domain (and in the whole scheme of things a rather small one), then a
single naming tree should suffice.

Also, Daniel Brickley mentioned a Java bias. I can understand his
concern; heck, let's separate them with the delimiter of your choice
(hyphens, underscore, etc.). While we are on the subject of bias: a
URI has a resource/file system bias. To me, that bias was just
confusing (and overkill) for something that I felt was best expressed
with one-word String constants (if you added the initial default set
to the interface).

Lastly, I would like to say that I do like your idea for the general
property query and am glad you proposed it. The naming concerns I
express here I deem as minor issues.

Best wishes,

- Mike (mdaconta@aol.com)

From david at megginson.com Mon Mar 8 22:32:38 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <36E44898.CB8E18C4@thinlink.com>
References: <18b603b2.36e3e337@aol.com>
 <14051.59370.316671.640337@localhost.localdomain>
 <36E44898.CB8E18C4@thinlink.com>
Message-ID: <14052.19853.887104.987727@localhost.localdomain>

Tom Harding writes:

 > David Megginson wrote:
 >
 > > As I wrote before, it doesn't much matter whether we use Java property
 > > names incorporating domain names (like
 > > 'org.xml.sax.features.validation') or URIs (like
 > > 'http://xml.org/sax/features/validation'), as long as we have the
 > > ability
 > > for people to create new names without fear of collision.
 >
 > I would also urge against using an http: URI since it is not meant
 > that a resource actually be retrieved using the http protocol.

I've been thinking about this issue, and I'm fairly convinced that the
URI is the right choice. Think of the URI as a statement of ownership.

Assume that my ISP is host.net, and that I've been allocated 5MB of
web space at http://host.net/foo/. I am the only one who has the
right to make a resource available at http://host.net/foo/, so I am
the one who has the (moral) right to construct feature IDs based on
http://host.net/foo/.

It is not sufficient simply to use the domain name "host.net", because
I don't own the domain (someone else could construct the same feature
ID), and it is not sufficient to use something starting with
net.host.foo, because I *don't* have the right to make something
available at, say, ftp://host.net/foo/ -- host.net has made foo
available to me only through the HTTP protocol. Perhaps Foo
enterprises has a download directory at ftp://host.net/foo/, and they
might want to construct their own property ID based on it.

Namespaces seems to have got it right.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
From david at megginson.com Mon Mar 8 22:42:49 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To:
References:
Message-ID: <14052.20696.226477.386853@localhost.localdomain>

MikeDacon@aol.com writes:

 > Why do you need a domain name in there? I think one Parser/Filter
 > implementor would be loath to implement another company's feature
 > name if it had sun.com or microsoft.com in it. That was the chief
 > problem that developers had with Sun naming the Swing package
 > com.sun.swing.

A neutral .org domain usually provides a nice way around that problem.

 > Now, I understand the domain name being in there is a piggyback off
 > of DNS. But, I still believe that functional features (of both
 > Parser and Filters) are a finite domain -- whereas the web is not.
 > That is why I don't see the correlation between this feature set
 > and XML namespaces. If you agree that features and props are a
 > finite domain (and in the whole scheme of things a rather small
 > one), then a single naming tree should suffice.

I expect the number of features to grow slowly, but I do not think
that it is clearly bounded, especially not with all the XML-related
work going on right now. A couple of years from now we could have
data-typing, digital signing, and who knows what else.

Furthermore, I do not want to have to set up my own registration
authority, and I do not want developers to have to wait for anyone to
approve their feature names before they can ship.
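
A small sketch of how ownership-based names avoid both collisions and a
registration authority, combined with the named-constant suggestion made
earlier in the thread. The class FeatureTable, the vendor IDs under
http://host.net/foo/ and the constant names are all hypothetical; only
the validation ID is quoted from the discussion. The two exception types
are the real org.xml.sax classes, used here to separate "never heard of
this name" from "know the name but cannot do it".

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

// Hypothetical feature table for a parser or filter.
class FeatureTable {
    // ID quoted in the thread.
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    // Hypothetical vendor IDs, minted without asking any registry.
    static final String FOO_VALIDATION = "http://host.net/foo/features/validation";
    static final String FOO_TEXTFILTER = "http://host.net/foo/features/textfilter";

    private final Set<String> recognized;
    private final Set<String> supported;
    private final Map<String, Boolean> state = new HashMap<>();

    FeatureTable(Set<String> recognized, Set<String> supported) {
        this.recognized = recognized;
        this.supported = supported;
    }

    void setFeature(String featureID, boolean on)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        if (!recognized.contains(featureID)) {
            throw new SAXNotRecognizedException(featureID);
        }
        if (!supported.contains(featureID)) {
            throw new SAXNotSupportedException(featureID);
        }
        state.put(featureID, on);
    }

    // Reports the outcome as text so the three cases are easy to compare.
    String trySet(String featureID) {
        try {
            setFeature(featureID, true);
            return "set";
        } catch (SAXNotRecognizedException e) {
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            return "not supported";
        }
    }

    // A table that knows two names and implements one of them.
    static FeatureTable example() {
        return new FeatureTable(
            new HashSet<>(Arrays.asList(VALIDATION, FOO_TEXTFILTER)),
            new HashSet<>(Arrays.asList(VALIDATION)));
    }

    public static void main(String[] args) {
        FeatureTable table = example();
        System.out.println(table.trySet(VALIDATION));
        System.out.println(table.trySet(FOO_TEXTFILTER));
        System.out.println(table.trySet(FOO_VALIDATION));
    }
}
```

Both "validation" features share a final path segment yet remain
distinct keys, and a parser that has never seen the vendor's name simply
reports that it does not recognize it.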

Thanks, and all the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From Daniel.Brickley at bristol.ac.uk Mon Mar 8 22:58:18 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun 7 17:09:45 2004
Subject: Naming ModSAX features: good use for the 'java:' URI scheme?
In-Reply-To: <36E44898.CB8E18C4@thinlink.com>
Message-ID:

On Mon, 8 Mar 1999, Tom Harding wrote:

> David Megginson wrote:
>
> > As I wrote before, it doesn't much matter whether we use Java property
> > names incorporating domain names (like
> > 'org.xml.sax.features.validation') or URIs (like
> > 'http://xml.org/sax/features/validation'), as long as we have the
> > ability for people to create new names without fear of collision.
>
> I would also urge against using an http: URI since it is not meant
> that a resource actually be retrieved using the http protocol.

I think I've found a compromise of sorts that'll let us use the Java
naming scheme (for those uncomfortable with naming conceptual entities
in the http namespace), whilst still using URIs.

From http://www.w3.org/Addressing/schemes.html

    Addressing Schemes

    This is (an attempt at) an exhaustive list of URI schemes. I try
    to list them all, whether they're standard or not.

Under 'J' we find a useful looking entry...

    java:
        identifies java classes (@@spec?)
    javascript:

There's also a reference to a JavaRMI: URI scheme invented by Bill
Jansen, which would be interesting to track down. But anyway...

So...
here's the proposal: Naming ModSAX Features ModSAX is intended to be easily extensible, and is designed to anticipate future independently developed extensions ('features'). For ModSAX-aware software to cope with the decentralised evolution of new features, it is important to have a controlled mechanism for naming these features unambiguously. For this we adopt the Uniform Resource Identifier (URI) system defined in RFC 2396[URI]. Each (version of a) ModSAX feature should be assigned a unique URI. It should not be assumed that these identifiers can always be dereferenced to acquire further information about the feature they name. For example, the 'http:' and 'java:' schemes can be used. 'http://purl.org/net/sax/MyFeature' and 'java:org.desire.sax.MyFeature' are both legitimate names for SAX features. 'phone:+44-117-9287493' would not be an appropriate name, since the 'phone:' URI namespace can only be used for telephone numbers. This way, people who manage http: URI names and want to use them to name SAX features are free to do so. Others can piggyback on the DNS via the java: scheme instead. Both work through the same overarching approach. So... It would be nice to have a reference to some spec defining the 'java:' URI scheme mentioned at http://www.w3.org/Addressing/schemes.html Maybe somebody from Sun has a pointer to this...? BTW as a side effect of having a URI scheme for Java classes and interfaces, we can exchange (aggregate, search, reason over) RDF metadata about those resources. This would be handy in Sun's JINI amongst other places.... Here's a quick and dull example of metadata keyed off a java: URI... Dan Brickley and Larry Franklin This applet is an attempt at a metadata browsing tree control But I'm sidetracking again. I'm really just saying one thing: the existence of a URI schema for Java classes (and packages) means we don't need to choose between Java and URI naming formalisms. We can have the best of both worlds...
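A minimal sketch of the naming rule proposed above, in Java. Everything in it is an assumption made for illustration: the class name FeatureNames, the method isAcceptable, and the allow-list of the two schemes the proposal uses as examples. None of it is part of ModSAX. Note that the name is only parsed and compared, never dereferenced.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of ModSAX: checks that a feature name
// is an absolute URI whose scheme is on an assumed allow-list.
final class FeatureNames {
    // Assumption: only the two schemes named in the proposal are accepted.
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "java");

    private FeatureNames() {}

    static boolean isAcceptable(String name) {
        try {
            String scheme = new URI(name).getScheme();
            // A null scheme means a relative reference, such as a bare
            // Java-style property name with no "scheme:" prefix.
            return scheme != null
                && ALLOWED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false; // not a syntactically valid URI at all
        }
    }
}

class FeatureNamesDemo {
    public static void main(String[] args) {
        String[] names = {
            "http://purl.org/net/sax/MyFeature",
            "java:org.desire.sax.MyFeature",
            "phone:+44-117-9287493",
            "org.xml.sax.features.validation"
        };
        for (String name : names) {
            System.out.println(name + " -> " + FeatureNames.isAcceptable(name));
        }
    }
}
```

Under these assumptions the two names from the proposal are accepted, while the 'phone:' name and the bare Java property name are rejected; a real registry-free scheme might well accept more schemes than this allow-list does.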
Dan [URI] Uniform Resource Identifiers (URI): Generic Syntax; Berners-Lee, Fielding, Masinter, Internet Draft Standard August, 1998; RFC2396. http://www.isi.edu/in-notes/rfc2396.txt From larsga at ifi.uio.no Mon Mar 8 23:01:23 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:09:45 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: References: Message-ID: * David Megginson | | - org.xml.sax.features.validation is more of a Java flavour. * Dan Brickley | | Yep... but might not feel so natural for developers working with | versions of SAX translated for Perl, Python and so on. I'll be translating this into Python and I see absolutely no problems with this from that point of view. It's a natural way to use the DNS as a basis for a naming system and Java just happens to use it. --Lars M.
From MikeDacon at aol.com Mon Mar 8 23:03:34 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:45 2004 Subject: SAX: ModSAX addition, general property query Message-ID: <8246f301.36e4560e@aol.com> Hi David, In a message dated 3/8/99 5:38:55 PM Eastern Standard Time, david@megginson.com writes: > Think of the URI as a statement of ownership. Assume that my ISP is > host.net, and that I've been allocated 5MB of web space at > http://host.net/foo/. > This is the primary reason I disagree with using a URI. A feature is not a resource. Also, a standard interface to a set of features is not the place to invoke ownership privileges. You can't own a feature that you expect others to implement. Unless I am not getting your idea of a feature, your logic seems incorrect. Interesting discussion and process (well worth it), - Mike (mdaconta@aol.com)
From Daniel.Brickley at bristol.ac.uk Mon Mar 8 23:09:36 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:09:45 2004 Subject: SAX: ModSAX addition, general property query In-Reply-To: <14052.19853.887104.987727@localhost.localdomain> Message-ID: On Mon, 8 Mar 1999, David Megginson wrote: > Tom Harding writes: > > David Megginson wrote: > > > > > As I wrote before, it doesn't much matter whether we use Java property > > > names incorporating domain names (like > > > 'org.xml.sax.features.validation') or URIs (like > > > 'http://xml.org/sax/features/validation'), as long as we have the > > > ability for people to create new names without fear of collision. > > > > I would also urge against using an http: URI since it is not meant > > that a resource actually be retrieved using the http protocol. > > I've been thinking about this issue, and I'm fairly convinced that the > URI is the right choice. > > Think of the URI as a statement of ownership. Assume that my ISP is > host.net, and that I've been allocated 5MB of web space at > http://host.net/foo/. > [...] Just to head off one possible objection... that of the persistence (or lack of it) w.r.t. http URLs. The PURL folks (Persistent URLs) make a credible case when they argue that URLs can be managed just as responsibly as URNs, and that persistence of http naming is a social issue not a technical one.
PURL servers are available to help here -- e.g. XML-DEV's own XSchema (now DDML) pages have been available from several different http servers, but have always had the same URI: http://purl.oclc.org/NET/xschema The PURL server at that address sends an HTTP redirect message if you try to dereference it. So we could, e.g., use PURLs to name software features, with reassurance that PURL.ORG have committed to do their best to manage http://purl.org/* names responsibly. > Namespaces seems to have got it right. Yep. Dan From b.laforge at jxml.com Mon Mar 8 23:13:49 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:45 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <01ab01be69b8$39f1cf00$c9a8a8c0@thing2> From: John Cowan >> > 2) This method is allowed to throw a SAXNewParserException, which >> > encapsulates a replacement parser. The application should use >> > the parser inside the exception in place of the original parser. >> > This allows parsers to push filters on top of themselves, which >> > complements the ability of applications to push them. >> >> I think that this could be layered on top of SAX, simply by >> subclassing SAXNotSupportedException. > >Yes, but by making it part of the core SAX protocol for setting >features, we guarantee universal support for it. A parser that knows >itself to be naive about namespaces can load the NamespaceFilter and >push it on top of itself, almost transparently to the application.
>Otherwise, every application that wants namespace support needs >specialized knowledge about how to recover from SAXNotSupportedExn. There are really three approaches here: 1. An application pushes a filter "on top of" a parser. In this case, the application starts with a parser and chooses to augment it with a filter. 2. The application requests a feature of the parser and the parser elects to wrap itself in a filter. For efficiency reasons(?), it asks the application to now use the filter in place of itself. 3. An application works with a pseudo-parser. It asks for various features and the pseudo-parser selects a parser and a set of filters which together can deliver the requested capabilities. I do like David's proposal--it's pretty open ended. The method get(infoID) will even serve as a front-end for aggregation! But I see a problem in trying to go too far on the feature selection path. The assumption seems to be that we are dealing here with a completely orthogonal set of features which are just selected or not as needed. There is no sense of structure or architecture here. I'm not sure that this is a useful model. Frankly, I much prefer Simon's layered approach: http://www.simonstl.com/articles/layering/layered.htm Again, I'm happy with the interface, but this idea of creating filter structures based on feature selection seems a bit lame. Bill
From Daniel.Brickley at bristol.ac.uk Mon Mar 8 23:18:18 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:09:46 2004 Subject: SAX: ModSAX addition, general property query In-Reply-To: <8246f301.36e4560e@aol.com> Message-ID: On Mon, 8 Mar 1999 MikeDacon@aol.com wrote: > Hi David, > > In a message dated 3/8/99 5:38:55 PM Eastern Standard Time, > david@megginson.com writes: > > Think of the URI as a statement of ownership. Assume that my ISP is > > host.net, and that I've been allocated 5MB of web space at > > http://host.net/foo/. > > > > This is the primary reason I disagree with using a URI. > A feature is not a resource. Software features aren't files, nor are they HTML pages, but they are 'resources' as defined in RFC2396 and as used in the XML Namespaces and RDF recommendations from W3C. I'm getting *really* boring on this topic... ;-) From RFC2396 (online at http://www.isi.edu/in-notes/rfc2396.txt) A Uniform Resource Identifier (URI) is a compact string of characters for identifying an abstract or physical resource. [...] Resource A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources. The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time.
> Also, a standard interface to a set > of features is not the place to invoke ownership privileges. You can own (or manage) the name for the feature though. Javasoft own all the URIs beginning 'java:java.lang.*'; I own the URIs beginning 'java:org.desire.rudolf.rdf.*'. These can name classes or interfaces others might implement. Dan > You can't own a feature that you expect others to implement. > > Unless I am not getting your idea of a feature, your logic seems > incorrect. > > Interesting discussion and process (well worth it), > > - Mike (mdaconta@aol.com) From b.laforge at jxml.com Tue Mar 9 00:19:02 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:46 2004 Subject: SAX: ModSAX addition, general property query Message-ID: <021a01be69c1$a49538c0$c9a8a8c0@thing2> From: David Megginson >I expect the number of features to grow slowly I suspect otherwise. Especially since the interface would also be used by filters and DOMWalkers. Think of the get and set methods as ways of accessing the properties on filters which are part of some larger filter structure (a stack being the simplest case). In addition to parse events moving from parser-kernel to application via a series of filters and event routers, the get and set "events" move from the application through the filters and down to the parser-kernel. Think of the parser and the filters together as a large aggregate of components. The get, set, setFeature, and setHandler may well be intercepted by any component in that aggregate which recognizes the featureID, handlerID, or infoID.
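The interception described above can be sketched roughly as follows. This is an illustration only: PropertyTarget, ParserKernel, PropertyFilter, UnrecognizedIdException and the example.org IDs are all hypothetical names, and the exception is unchecked purely to keep the sketch short (the proposed ModParser methods throw SAXNotSupportedException instead).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the get/set half of the proposed ModParser
// interface; it is not the real interface.
interface PropertyTarget {
    Object get(String infoID);
    void set(String infoID, Object value);
}

// Assumption: unchecked, only to keep this sketch short.
class UnrecognizedIdException extends RuntimeException {
    UnrecognizedIdException(String infoID) {
        super("unrecognized ID: " + infoID);
    }
}

// The parser-kernel sits at the bottom of the chain and rejects unknown IDs.
class ParserKernel implements PropertyTarget {
    private final Map<String, Object> props = new HashMap<>();

    ParserKernel() {
        props.put("http://example.org/features/validation", Boolean.FALSE);
    }

    public Object get(String infoID) {
        if (!props.containsKey(infoID)) throw new UnrecognizedIdException(infoID);
        return props.get(infoID);
    }

    public void set(String infoID, Object value) {
        if (!props.containsKey(infoID)) throw new UnrecognizedIdException(infoID);
        props.put(infoID, value);
    }
}

// A filter intercepts the one ID it recognizes and passes all others down.
class PropertyFilter implements PropertyTarget {
    private final String ownId;
    private Object ownValue;
    private final PropertyTarget next;

    PropertyFilter(String ownId, Object initialValue, PropertyTarget next) {
        this.ownId = ownId;
        this.ownValue = initialValue;
        this.next = next;
    }

    public Object get(String infoID) {
        return ownId.equals(infoID) ? ownValue : next.get(infoID);
    }

    public void set(String infoID, Object value) {
        if (ownId.equals(infoID)) {
            ownValue = value;
        } else {
            next.set(infoID, value);
        }
    }
}

class FilterChainDemo {
    public static void main(String[] args) {
        PropertyTarget chain = new PropertyFilter(
            "http://example.org/props/text-filter", "trim", new ParserKernel());
        // Passed down to the kernel, because the filter does not recognize it.
        chain.set("http://example.org/features/validation", Boolean.TRUE);
        // Answered by the filter itself.
        System.out.println(chain.get("http://example.org/props/text-filter"));
        System.out.println(chain.get("http://example.org/features/validation"));
    }
}
```

The application only ever talks to the top of the chain, so adding another filter means adding another recognized ID without changing the interface, which is the growth in IDs being predicted here.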
I see the ModParser interface as currently defined as being very important for filters, with the number of featureIDs growing with the popularity of such filters. Bill From tbray at textuality.com Tue Mar 9 00:34:35 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:46 2004 Subject: Opinions requested Message-ID: <3.0.32.19990308103203.00e7d2cc@pop.intergate.bc.ca> At 09:02 AM 3/8/99 -0800, Jerome McDonough wrote: >I share your skepticism, but we can hope. If nothing else, there appears >to be at least the dawnings of an understanding among the major DBMS >vendors that there's a huge market for text management/retrieval products. >Some of the approaches taken by the object-oriented database folks, like >Informix's data blades, struck me as having promise. There's the rub. *Is* there really a huge market for text management/retrieval? The history of software is littered with the corpses of companies who tried to make a go of it in that area; I know from personal experience that up to and through the year 1996, there was *not* any such huge market. Will XML change that? It would be nice to think so. -Tim
From elharo at metalab.unc.edu Tue Mar 9 00:52:11 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:09:46 2004 Subject: Namespaces and DTDs Message-ID: <36E49A4D.413D71F3@metalab.unc.edu> Situation: I have several DTDs with conflicting definitions of certain elements. (e.g. one defines a HEAD as a TITLE followed by a META and another defines a HEAD as #PCDATA). I need to use all the DTDs and associated markup languages for a single document. To an extent I can disambiguate them with namespaces. However, is there any way I can do this while still validating against the original DTDs? That is, without rewriting the DTDs to use the qualified names instead of the original names that are in the DTDs? I've been trying to work with default values for xmlns attributes, and the like; but that doesn't seem to get me quite all the way to where I need to go. Am I going to have to break down and just rewrite the DTDs to use the qualified names? -- Elliotte Rusty Harold
From dent at oofile.com.au Tue Mar 9 02:03:11 1999 From: dent at oofile.com.au (Andy Dent) Date: Mon Jun 7 17:09:46 2004 Subject: Expat API In-Reply-To: References: <49092BAEAC84D2119B0600805FD40F9F120DBD@MDYNYCMSX1> Message-ID: >My question is: where is the documentation on how to use the expat >API? I downloaded version 1.0.2 and ported the code to run the sample >program on my Macintosh, but I'm pretty much dead in the water. I >tried sending email to the author (James Clark) twice in the last few >days, but I have so far failed to receive a response. The comments in >the header files do not seem to be sufficient. Dave We have a c++ wrapper on expat running under CodeWarrior as part of a much bigger project to make our report writer interchange data with XML. You're welcome to a copy. It makes the expat API a LOT easier to use if you are a c++ programmer as it presents a virtual method interface to expat - you inherit from our object and override the methods (eg: startElement) that you want to use. When it's a bit more cleaned up with better samples I'll be submitting it back to James. Andy Dent BSc MACS AACM, Software Designer, A.D. Software, Western Australia OOFILE - Database, Reports, Graphs, GUI for c++ on Mac, Unix & Windows PP2MFC - PowerPlant->MFC portability http://www.highway1.com.au/adsoftware/crossplatform.html
From avirr at LanMinds.Com Tue Mar 9 05:31:02 1999 From: avirr at LanMinds.Com (Avi Rappoport) Date: Mon Jun 7 17:09:46 2004 Subject: Opinions requested In-Reply-To: <3.0.32.19990308103203.00e7d2cc@pop.intergate.bc.ca> Message-ID: At 4:37 PM -0800 3/8/1999, Tim Bray wrote: > At 09:02 AM 3/8/99 -0800, Jerome McDonough wrote: >>I share your skepticism, but we can hope. If nothing else, there appears >>to be at least the dawnings of an understanding among the major DBMS >>vendors that there's a huge market for text management/retrieval products. >>Some of the approaches taken by the object-oriented database folks, like >>Informix's data blades, struck me as having promise. > > There's the rub. *Is* there really a huge market for text > management/retrieval? The history of software is littered with the > corpses of companies who tried to make a go of it in that area; I > know from personal experience that up to and through the year 1996, > there was *not* any such huge market. Will XML change that? It > would be nice to think so. -Tim The Web has certainly raised the profile for text retrieval, and the amount of text online is larger than it's ever been. A lot of text-management turns out to be going on in relational databases, and those are pretty big business. But the large content-management companies -- Verity, Open Text, Fulcrum (bought by PCDOCS, itself recently bought by someone else) -- seem to be going through wild stock price variations recently. I've no idea what the future market will be: I find it all mystifying!
BTW, Lisa Rein has written a report on the Query Language '98 workshop at W3C last year: http://www.xml.com/xml/pub/1999/03/quest/index.html It looks quite comprehensive to me, and all the position papers indicate that the topic is a hot one. Avi ________________________________________________________________ Avi Rappoport, Search Tools Maven: Guide to Site Indexing and Local Search Engines: From db at eng.sun.com Tue Mar 9 05:36:44 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:46 2004 Subject: ModSax Suggestion References: Message-ID: <36E4B1FA.E482164@eng.sun.com> > > Interesting suggestion for a big hole in the parts of > > the Java API set that are more or less "standard" at > > this point -- SAX and DOM. > > > > One comment though: I've found that it's important to > > be able to have options controlling how the DOM tree is > > built. For example, whether to discard ignorable spaces, > > or do namespace conformance enforcement, or try to get > > CDATA sections (comments, etc). > > > > I agree with that. I think all that is possible while still retaining > a minimalist design philosophy. [deletia] > > That way via an extensible common set of text properties we > can add properties as the need arises without expanding the API. I've always liked the idea of filters in the SAX event chain. As Bill la Forge (and you) noted, that's a fine way to address that general issue. One can overdo layers, of course, and pay for it in performance.
But filters are a good architectural notion, and there's been lots of discussion about how to use them well with SAX and DOM. That does imply keeping DOM out of the basic parser API, which I still think is the best way to go. An event generator (say, a SAX parser, or something walking a DOM tree) can have its events filtered, and delivered to a component building a DOM tree. > Looking forward to progress on the Java XML API. BTW, Dave, > are you going to do a "Birds of a Feather" session on XML at this year's > JavaOne? I think that could be valuable. I may be signed up for more than that this time... A BOF on XML -- an XML-DEV BOF! -- would be lots of fun. Some of the folk here have never met in person. I think there will be lots of interesting applications to talk about ... and probably some interesting frameworks. - Dave From db at eng.sun.com Tue Mar 9 05:55:30 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:46 2004 Subject: SAX: ModSAX addition, general property query References: <14051.46670.687235.664451@localhost.localdomain> Message-ID: <36E4B666.411E15C2@eng.sun.com> OK, I'll pick this thread rather than the longer one to read first... XML-DEV really generates lots of traffic lately!! - I agree re using URIs, like Namespaces do. Anyone can get a URI nowadays, for virtually no cost, but that's not true of reversed domain names (as used in Java properties and package names). - There will need to be some strong policies for how the "things" to which an {info,handler,feature}ID map are documented.
I think that leadership by example can play a strong role here ... :-) Related point, that policy should specify the status of the "thing". For example, "stable", "beta", "experimental", "private", to pick an order where folk should be progressively less willing to use or implement the "thing" in a parser. - I'd like a "getHandler" API ... or perhaps, eliminate the notion of 'feature' and 'handler' IDs and just use "infoID" values that map to the appropriate handlers. I've found it important to be able to do things like, say, "use the error handler everyone else is using". (Where's getFeature? One can return a Boolean from a "get" ...) Re that last point, I might have missed some e-mail and will try to catch up. It's not clear why there's a need for more than a single general get/set API for this. - Dave David Megginson wrote:
>
> What: Additions to ModParser interface
>
> I'm proposing a couple of additions to the ModParser interface:
>
> public interface ModParser extends Parser
> {
>   public abstract void setFeature (String featureID, boolean state)
>     throws SAXNotSupportedException;
>
>   public abstract void setHandler (String handlerID, ModHandler handler)
>     throws SAXNotSupportedException;
>
>   public abstract void set (String infoID, Object prop)
>     throws SAXNotSupportedException;
>
>   public abstract Object get (String infoID)
>     throws SAXNotSupportedException;
> }
>
> These allow you to do interesting things like
>
>   parser.set("http://www.foo.com/props/textfilter", filter);
>
> or
>
>   try {
>     Node node = (Node) parser.get("http://xml.org/sax/props/dom-node");
>   } catch (SAXNotRecognizedException e1) {
>     // doesn't know about DOM processing...
>   } catch (SAXNotSupportedException e2) {
>     // knows about DOM processing, but not doing it...
>   }
>
> Again, it's a little sloppy as an interface, but it's beautifully
> extensible and it supports filters nicely (if there are other filters
> between the DOM iterator and the application, it will still work).
>
> Note that strictly speaking, now, setHandler() and setFeature() are no
> longer primitives, since they could both be implemented in terms of
> set(), but I think that the extra type checking is worthwhile in those
> cases.
>
> All the best,
>
> David
>
> --
> David Megginson david@megginson.com
> http://www.megginson.com/
From db at eng.sun.com Tue Mar 9 06:27:39 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:46 2004 Subject: SAX RFD: ModSAX Predefined Features References: <004b01be691c$f348fc40$c9a8a8c0@thing2> Message-ID: <36E4BDC9.DB06F185@eng.sun.com> Lars Marius Garshol wrote: > > * Bill la Forge > > | So that's why I'm butting in here. I think an open standards process > | is important for individuals and small companies. We need to do what > | we can to keep the ball rolling here. > > We are certainly in heartfelt agreement here. :) Gee, as a wage-slave working for a big company, I hope that I'm not _too_ excluded from the discussions ... :-) Seriously: my personal model is a lot more akin to the original IETF style "running code and working consensus" model than most existing standards bodies.
I'm a lot happier with standards that come from such a process than from ones that involve fat specs that can't be implemented. Writing code is generally more fun than specs -- though an elegant spec is also a work of art! - Dave From db at eng.sun.com Tue Mar 9 06:57:28 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:46 2004 Subject: SAX RFD: ModSAX Predefined Features References: <14051.3215.196642.22571@localhost.localdomain> Message-ID: <36E4C4E6.B51DDFF3@eng.sun.com> Again, I think that unifying these under the generic get/set API (with Boolean.TRUE and Boolean.FALSE objects as values for features that are really boolean) could be useful. Documentation for each feature should specify whether it's changeable mid-parse ... I'd suggest "no" as the default answer! Mike Dacon commented about the "API archaeology" aspect of this name; perhaps the "Parser2" style naming convention can avoid losing technical context (i.e. this is still a parser, even if it's parsing a DOM or a stream of SAX events :-). > 1. http://xml.org/sax/features/validation Good. (I'm curious if folks prefer one parser, which can have this feature toggled, vs two, where the parser comes with at least an initial value.) > 2A. http://xml.org/sax/features/external-general-entities > 2B. http://xml.org/sax/features/external-parameter-entities Right, two kinds of parsed entities, two control knobs. Validating parsers must refuse to change these knobs. (OK, _five_ kinds of parser -- validating, and four kinds of nonvalidating parser! ;-) > 3.
http://xml.org/sax/features/namespaces I'd rather have this just kick in modified XML syntax rules (e.g. entity names may never be scoped, and scoped names may have only one interior colon). With that, one can layer the rest of namespace processing on top in any of several fashions. A DOM can be built which exposes namespace declarations; or a filter can munge names and strip out the declarations. The "munge" feature could get its own namespace URI. > 4. http://xml.org/sax/features/unbuffered-input > True means ensure that the parser does not buffer input from a > Reader or InputStream supplied by the application (actually, > one-character look-ahead will usually be required); false means do > not ensure that the parser does not buffer input. This feature might > be useful for reading multiple documents from a single stream. I'm not sure this is a common enough feature to need to be predefined ... support for "XML Islands" within HTML may become important, but much of this can be done (at least in Java) by requiring pushback to be done at appropriate points. > http://xml.org/sax/features/normalize-text This is a good filter feature, I think. Lars suggested a "Catalog" feature. There are different sorts of catalog, and they need configuration, so the value of this could be a URI for the catalog, not just a boolean. Plus, this would seem to be up to the "EntityResolver" to handle ... yes? It'd perhaps suggest that one could ask the next filter in the stream for the resolver it was using ... :-) Good discussion, gang! - Dave
From lucio.piccoli at one2one.co.uk Tue Mar 9 08:58:11 1999 From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI) Date: Mon Jun 7 17:09:46 2004 Subject: version within XML Message-ID: <3601a91c.090299@smtpgate1.ONE2ONE.CO.UK> Hi all, I am seeking info on versioning XML documents. I have seen it done in a few different ways. Specifically, what are the issues in ensuring backward compatibility between versions? Any help is appreciated. adios -lucio --------------------------------------------------------------------- One2One LUCIO.PICCOLI@one2one.co.uk Elstree Tower tel : +44 181 214 3847 Elstree Way Borehamwood fax :+44 181 214 2325 LONDON WD6 1DT __________ http://www.one2one.co.uk _____________ From rbourret at ito.tu-darmstadt.de Tue Mar 9 09:43:35 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:46 2004 Subject: Namespaces and DTDs Message-ID: <01BE6A19.882B5360@grappa.ito.tu-darmstadt.de> Elliotte Rusty Harold wrote: > I have several DTDs with conflicting definitions of certain elements. > (e.g. one defines a HEAD as a TITLE followed by a META and another > defines a HEAD as #PCDATA). I need to use all the DTDs and associated > markup languages for a single document.
> > To an extent I can disambiguate them with namespaces. However, is there > any way I can do this while still validating against the orignal DTDs? > That is without rewriting the DTDs to use the qualified names instead of > the orignal names that are in the DTDs? I've been trying to work with > default values for xmlns attributes, and the like; but that doesn't seem > to get me quite all the way to where I need to go. Am I going to have to > break down and just rewrite the DTDs to use the qualified names? If you want to use a namespace-unaware parser, I don't see how you can avoid rewriting the DTDs. Unless the names in the DTDs are qualified, you will have two elements with the same name (e.g. "HEAD"), which is a validation error. And even assuming that this isn't immediately flagged, I can see no way for a namespace-unaware parser to figure out which content model to validate against when it encounters one of the duplicated element names: If prefixes are used, the name won't match any of the DTD names; if prefixes are not used (due to use of defaults), the name will match multiple DTD names. Note that this problem is not limited just to validation. At the very least, it applies to retrieving default attribute values as well. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.Kay at icl.com Tue Mar 9 10:00:50 1999 From: Michael.Kay at icl.com (Kay Michael) Date: Mon Jun 7 17:09:46 2004 Subject: SAX: ModSAX addition, general property query Message-ID: <93CB64052F94D211BC5D0010A80013310EB364@wwmessd3.bra01.icl.co.uk> > I've been thinking about this issue, and I'm fairly convinced > that the URI is the right choice. > > Think of the URI a statement of ownership. Assume that my ISP is > host.net, and that I've been allocated 5MB of web space at > http://host.net/foo/. > I don't often disagree with David, but I think this is quite misguided. If we're only after a unique identifier we could use the longitude and latitude of the house where I live. In fact that would be better, because it identifies a unique place, whereas the "http:" idea also says you can get there by bus and the buses are run by the host.net bus company: in fact it invites you to "click here" to jump on the bus. But if you get on the bus and ask for the destination the driver will tell you "Never heard of it, guv." And of course it ignores the fact that you can have two buses going to the same place from different directions. Just because Namespaces made this mistake (and confused all newbies by doing so) doesn't mean we have to as well. Mike Kay xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Tue Mar 9 10:25:49 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:09:46 2004 Subject: SAX: ModSAX addition, general property query In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB364@wwmessd3.bra01.icl.co.uk> Message-ID: On Tue, 9 Mar 1999, Kay Michael wrote: > > I've been thinking about this issue, and I'm fairly convinced > > that the URI is the right choice. > > > > Think of the URI a statement of ownership. Assume that my ISP is > > host.net, and that I've been allocated 5MB of web space at > > http://host.net/foo/. > > > I don't often disagree with David, but I think this is quite misguided. > > If we're only after a unique identifier we could use the longitude and > latitude of the house where I live. Great. Why not propose a URI scheme for it? (although this would also confuse people as a place is something you'd look up on a map, not a software feature.) In fact that would be better, because it > identifies a unique place, whereas the "http:" idea also says you can get > there by bus and the buses are run by the host.net bus company: in fact it > invites you to "click here" to jump on the bus. But if you get on the bus > and ask for the destination the driver will tell you "Never heard of it, > guv." > > And of course it ignores the fact that you can have two buses going to the > same place from different directions. The URI spec very clearly does not ignore this point. >From RFC 2396 again... (http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt) 1.2. 
URI, URL, and URN A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URI that identify resources via a representation of their primary access mechanism (e.g., their network "location"), rather than identifying the resource by name or by some other attribute(s) of that resource. [...] Although many URL schemes are named after protocols, this does not imply that the only way to access the URL's resource is via the named protocol. Gateways, proxies, caches, and name resolution services might be used to access some resources, independent of the protocol of their origin, and the resolution of some URL may require the use of more than one protocol (e.g., both DNS and HTTP are typically used to access an "http" URL's resource when it can't be found in a local cache). > Just because Namespaces made this mistake (and confused all newbies by doing > so) doesn't mean we have to as well. Making the same mistake as the rest of the world has its benefits though: if we use URIs for ModSAX features, we get for free any progress on better naming infrastructure (URNs, metadata, resolution infrastructure layered over the Web caching network etc). If we invent another a nameless, specless naming system, we're on our own. Dan -- Daniel.Brickley@bristol.ac.uk Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/ University of Bristol, Bristol BS8 1TN, UK. phone:+44(0)117-9288478 xml-dev: A list for W3C XML Developers. 
From tug at wilson.co.uk Tue Mar 9 10:43:27 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <083401be6a19$94195f50$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: Kay Michael
To: XML Developers' List
Sent: 09 March 1999 09:54
Subject: RE: SAX: ModSAX addition, general property query

>> I've been thinking about this issue, and I'm fairly convinced
>> that the URI is the right choice.
>>
>> Think of the URI a statement of ownership. Assume that my ISP is
>> host.net, and that I've been allocated 5MB of web space at
>> http://host.net/foo/.
>>
>I don't often disagree with David, but I think this is quite misguided.

I agree - I don't actually see the benefit of using a string
identifier at all:

I don't think that it's unreasonable to insist that objects
representing a Feature, Handler or Property should either implement a
distinct interface or subclass a distinct class. If this is so the
Parser can tell what Feature, Handler or Property is being set by
enquiring of the type of the object. (I favour insisting that they
subclass distinct classes because (in Java) that naturally imposes the
restriction that a single object can only represent a single
Property.)

The get() member function could take a Class parameter.

The advantage of this approach is that it relies only on the type
naming scheme of Java and there are already well established
mechanisms that ensure that different implementers create distinct
types.
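
A minimal Java sketch of the type-keyed lookup described above may help
make the idea concrete. Every name in it (Property, LexicalInfo,
CatalogLocation, TypedPropertyStore) is invented for illustration and is
not part of SAX or of any ModSAX draft; the store merely stands in for
whatever the parser would keep internally.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class: each property subclasses this, so one object
// can only ever represent one property.
abstract class Property {}

// Two illustrative properties; the names are assumptions for this sketch.
final class LexicalInfo extends Property {
    final String encoding;
    LexicalInfo(String encoding) { this.encoding = encoding; }
}

final class CatalogLocation extends Property {
    final String uri;
    CatalogLocation(String uri) { this.uri = uri; }
}

// Stand-in for the parser's property store, keyed by runtime type
// instead of by a string identifier.
class TypedPropertyStore {
    private final Map<Class<? extends Property>, Property> properties = new HashMap<>();

    // The store learns which property is being set by asking the object for its class.
    void set(Property p) {
        properties.put(p.getClass(), p);
    }

    // get() takes a Class parameter; it returns null when nothing of that type was set.
    <T extends Property> T get(Class<T> type) {
        return type.cast(properties.get(type));
    }
}

class TypedPropertyDemo {
    public static void main(String[] args) {
        TypedPropertyStore store = new TypedPropertyStore();
        store.set(new LexicalInfo("UTF-8"));
        LexicalInfo info = store.get(LexicalInfo.class);
        System.out.println(info.encoding);
    }
}
```

Uniqueness of the key then rests on Java's package naming rather than on
a separate string registry, which is the advantage claimed for the
approach.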
I am by no means an expert in the other languages that are supported
by SAX - would this approach cause dreadful problems in other
languages?

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From db at eng.sun.com Tue Mar 9 11:09:31 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:09:47 2004
Subject: Java Specification Request for XML
Message-ID: <36E4FF9B.8CA35A47@eng.sun.com>

Elliotte Rusty Harold wrote:
>
> >The Java Community Process is an open, inclusive process and we
> >look forward to the active participation of all interested parties.
>
> The process, and its relative openness, is a little more obvious if you
> remove the passive voice. compare this:

When you change it to what you wrote, it is no longer correct.
Some key points:

- No, Sun doesn't need to submit all JSRs. Any Participant
  can do so. We did for this one, to help jumpstart the
  process; many people want to see a Java Platform API for XML.

- Yes, Sun's Program Management Office (vs. say Ken Starr)
  approves or rejects submitted JSRs.

- No, the leader of the expert group doesn't need to be from
  Sun. The group formed by that leader, from the pool of
  volunteer experts and from external invited experts, is
  supposed to be a diverse cross section. This is auditable.

- Re cost to be a "Participant", I had the same comment. The
  fee can be waived for invited experts. And note that the
  fee is less than an expert's time will cost -- much less!
Sun is working with this process in good faith, though you seem to fear otherwise. Re other processes ... I don't think anyone's quite figured out how to make the "open source" processes drive established software companies. Like many leading companies, Sun is taking steps in that direction. But at least for this year, that isn't a useful class of processes to measure against. > >The key point is that everyone with internet access will get a > >chance to review and comment on the emerging specification. > > They can review and comment. There's no promise that > anyone will even listen to their comments, much less act on them. No, there _is_ a promise they'll be listened to; and I understand the action will at least include a response. Have you ever participated in the comment process for an IEEE spec? One submits comments, and gets formal responses. (I seem to recall it being restricted to paid-up IEEE members though.) That's the model to keep in mind -- not the "black hole" model you've described. Again, this is auditable. > There are a number of aspects of this "open" process that aren't mentioned > here. Paraphrasing points I didn't mention above: - Copyright and other Intellectual Property Rights. Hmm, wouldn't you just hate to base a product on a specification, and then find that you've got to fork over $5K/copy to use it? Have a look at what any of the "Open Source" license agreements (e.g. MPL2) say about such issues. - Derivative works. Nobody wins if people are allowed to ship things as "compatible" that really aren't; that's what the compatibility test suite is there to help ensure: "Write Once, Run Anywhere" does not come without effort, and it's a Big Deal. - Pillow talk. It's supposed to be private. - Of course non-corporate experts exist; always have, always will. And they can participate too. > To me these alone make it pretty clear, that this process is open in name > only. If you're still not convinced, ask yourself these questions: > > 1. 
Can anyone tell Sun No? Can anyone keep Sun from putting something into > the spec they want to put it in? Or put something in that Sun wants to keep > out? If the Expert Group disagrees with Sun's representative, that could happen. I'd hope it wouldn't -- but it could happen. > 2. Can Sun's enemies (i.e. Microsoft, HP, etc.) particpate in this process > on an equal footing with Sun? Can they even participate at all? Can those companies participate? Absolutely. Though I don't think that they've wanted to do so -- going purely by what the press has been seen to report. > Bottom line: The openness of this process is PR, pure and simple. So is that glass half full, or half empty? :-) "Openness" fits on a spectrum. I think that this process compares favorably with most other standards processes I've seen. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Andy.Bradbury at syntegra.bt.co.uk Tue Mar 9 11:18:45 1999 From: Andy.Bradbury at syntegra.bt.co.uk (Andy.Bradbury@syntegra.bt.co.uk) Date: Mon Jun 7 17:09:47 2004 Subject: X for eXtensible DBMS? Message-ID: <65AF45D5E535D2118AFB0008C7FA23180C3D08@FL-EXCHANGE-03> The only IMS I ever came across was hardly what I'd call 'extensible' - not unless you actually *like* taking a whole database down in order to create or modify a single extra link ;,) Regards Andy B. -----Original Message----- From: Smith, Adrian [mailto:asmith@drumbeat.com] Sent: 05 March 1999 17:19 To: 'Jeffrey E. Sussna'; 'Chad Adams'; xml-dev@ic.ac.uk Subject: RE: Opinions requested There actually is an XDBMS. It predates XML. This dates back to around 1965/1966. 
The database created was titled "IMS" for Information Management System, it was created by IBM and used an hierarchical model for the data. It had all the same characterstics of XML with almost the exact same set of constructs and shortcomings. Thanks! Adrian xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Tue Mar 9 12:00:06 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:47 2004 Subject: ModSax Suggestion Message-ID: <005001be6a23$7e574240$c9a8a8c0@thing2> From: David Brownell >> Looking forward to progress on the Java XML API. BTW, Dave, >> are you going to do a "Birds of a Feather" session on XML at this years >> JavaOne? I think that could be valuable. > >I may be signed up for more than that this time... > >A BOF on XML -- an XML-DEV BOF! -- would be lots of fun. >Some of the folk here have never met in person. I think >there will be lots of interesting applications to talk >about ... and probably some interesting frameworks. Simon and I proposed a Coins BOF some time back and JavaOne accepted it. Might be a good place to meet and discuss ModSAX, filters, and such. Bill xml-dev: A list for W3C XML Developers. 
From b.laforge at jxml.com Tue Mar 9 12:16:09 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:47 2004
Subject: ModSax Suggestion
Message-ID: <006301be6a25$c42a6700$c9a8a8c0@thing2>

From: David Brownell
>I've always liked the idea of filters in the SAX event chain.
>As Bill la Forge (and you) noted, that's a fine way to address that
>general issue. One can overdo layers, of course, and pay for it
>in performance. But filters are a good architectural notion, and
>there's been lots of discussion about how to use them well with
>SAX and DOM.
>
>That does imply keeping DOM out of the basic parser API, which
>I still think is the best way to go. An event generator (say,
>a SAX parser, or something walking a DOM tree) can have its
>events filtered, and delivered to a component building a DOM tree.

A filter can itself hold a stack of other filters, or even a set of
filters to which events are routed based on some pattern. Being able
to place just one filter in front of the DOM built by the parser is
all you really need.

Using the ModParser interface, one can do the following:

  1. Use setFeature to turn on DOM construction.
  2. Use set to insert a filter in front of the DOM.
  3. Parse a document.
  4. Use get to retrieve the constructed DOM.

Bill
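
The four steps above can be sketched in Java as follows. This is a toy
under stated assumptions, not the ModParser interface itself, whose
exact signatures are not given in this thread: ToyModParser, ElementSink,
SinkFilter, TreeBuilder and the three example.org identifiers are all
invented here, "parsing" just replays a list of element names, and the
"DOM" is a plain list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical event sink; real SAX handlers have many more callbacks.
interface ElementSink {
    void startElement(String name);
}

// Stand-in for "the DOM": it only records element names in order.
class TreeBuilder implements ElementSink {
    final List<String> elements = new ArrayList<>();
    public void startElement(String name) { elements.add(name); }
}

// A filter forwards (possibly altered) events to whatever sits behind it.
abstract class SinkFilter implements ElementSink {
    ElementSink next;
}

class UpperCaseFilter extends SinkFilter {
    public void startElement(String name) {
        next.startElement(name.toUpperCase(Locale.ROOT));
    }
}

// Toy parser with the setFeature / set / parse / get shape from the list above.
class ToyModParser {
    static final String DOM_FEATURE = "http://example.org/features/build-dom";  // invented ID
    static final String DOM_FILTER  = "http://example.org/handlers/dom-filter"; // invented ID
    static final String DOM_RESULT  = "http://example.org/properties/dom";      // invented ID

    private final Map<String, Object> slots = new HashMap<>();
    private boolean buildDom;

    void setFeature(String id, boolean state) {
        if (DOM_FEATURE.equals(id)) buildDom = state;
    }

    void set(String id, Object value) { slots.put(id, value); }

    Object get(String id) { return slots.get(id); }

    // Replays the given element names as events through the optional filter.
    void parse(String... elementNames) {
        if (!buildDom) return;
        TreeBuilder builder = new TreeBuilder();
        ElementSink head = builder;
        Object filter = slots.get(DOM_FILTER);
        if (filter instanceof SinkFilter) {
            ((SinkFilter) filter).next = builder;  // the filter sits in front of the DOM
            head = (SinkFilter) filter;
        }
        for (String name : elementNames) head.startElement(name);
        slots.put(DOM_RESULT, builder);
    }
}

class ModParserDemo {
    static List<String> run() {
        ToyModParser parser = new ToyModParser();
        parser.setFeature(ToyModParser.DOM_FEATURE, true);           // step 1
        parser.set(ToyModParser.DOM_FILTER, new UpperCaseFilter());  // step 2
        parser.parse("doc", "title");                                // step 3
        TreeBuilder dom = (TreeBuilder) parser.get(ToyModParser.DOM_RESULT); // step 4
        return dom.elements;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because the single filter slot accepts any SinkFilter, a filter that
itself holds a stack of further filters fits the same slot, which is the
point made in the message.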
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Tue Mar 9 12:45:44 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:47 2004 Subject: SAX: ModSAX addition, general property query Message-ID: <007801be6a29$d6efdd80$c9a8a8c0@thing2> From: John Wilson >I don't think that it's unreasonable to insist that objects representing a >Feature, Handler or Property should either implement a distinct interface or >subclass a distinct class. If this is so the Parser can tell what Feature, >Handler or Property is being set by enquiring of the type of the object. (I >favour insisting that they subclass distinct classes because (in Java) that >naturally imposes the restriction that a single object can only represent a >single Property.) Filters often implement more than one (generally all) handler interface and then register themselves with the underlying parser/filter for the same events requested by the overlaying application/filter. Your proposal would require the filter to instantiate seperate objects for each set of events it needs to process, though it could simply pass-through the handlers for those it does not. The role and class of an object are often distinct. This was one of the things I did not like about the aggregation scheme that was proposed by Sun a while back. I think David got it right. Bill xml-dev: A list for W3C XML Developers. 
From tug at wilson.co.uk Tue Mar 9 13:04:46 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <08b501be6a2d$3b34c820$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: Bill la Forge
To: John Wilson; XML Developers' List
Sent: 09 March 1999 12:39
Subject: Re: SAX: ModSAX addition, general property query

>From: John Wilson
>>I don't think that it's unreasonable to insist that objects representing a
>>Feature, Handler or Property should either implement a distinct interface or
>>subclass a distinct class. If this is so the Parser can tell what Feature,
>>Handler or Property is being set by enquiring of the type of the object. (I
>>favour insisting that they subclass distinct classes because (in Java) that
>>naturally imposes the restriction that a single object can only represent a
>>single Property.)
>
>
>Filters often implement more than one (generally all) handler interface and
>then register themselves with the underlying parser/filter for the same events
>requested by the overlaying application/filter.
>
>Your proposal would require the filter to instantiate separate objects for each
>set of events it needs to process, though it could simply pass-through the handlers
>for those it does not.

Certainly you need to instantiate an object per handler; however, it
need not be too ugly:

    public class MyFilter {
        public final DTDHandler dtdHandler = new DTDHandler() {
            ...
        };
        public final DocumentHandler documentHandler = new DocumentHandler() {
            ...
        };
        ....
    }

would seem to me to be a reasonable way of dealing with this.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From b.laforge at jxml.com Tue Mar 9 13:15:17 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:47 2004
Subject: Java Specification Request for XML
Message-ID: <008b01be6a2e$0592cfe0$c9a8a8c0@thing2>

From: David Brownell
>Re other processes ... I don't think anyone's quite figured
>out how to make the "open source" processes drive established
>software companies. Like many leading companies, Sun is
>taking steps in that direction. But at least for this year,
>that isn't a useful class of processes to measure against.

I suspect that a change to Open Source Software will depend on more
than just vendors. Vendors need to be responsive to their customers,
many of whom are still not with the new program. I don't think this
process can be driven entirely from the top. It would be risky for a
vendor to get too far ahead of its "community".

So while open forums like XML-DEV are closer to the ideal, given the
opportunity, I will be glad to participate in Sun's own process.

Bill
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at goon.stg.brown.edu Tue Mar 9 14:39:06 1999 From: richard at goon.stg.brown.edu (Richard L. Goerwitz) Date: Mon Jun 7 17:09:47 2004 Subject: Namespaces and DTDs References: <01BE6A19.882B5360@grappa.ito.tu-darmstadt.de> Message-ID: <36E5322A.7DDADDD8@goon.stg.brown.edu> Ronald Bourret wrote: > > I have several DTDs with conflicting definitions of certain elements. > > ...Am I going to have to break down and just rewrite the DTDs to use > > the qualified names? > > If you want to use a namespace-unaware parser, I don't see how you can > avoid rewriting the DTDs. Maybe I misunderstand, but as far as I can see, namespaces won't help you, either. Why? Because even if you can refer to, say, your two TITLE elements by different prefixes, you'll still have to declare the prefixed elements in the DTD as if they were atomic element names. Namespaces, in other words, don't solve your problem. They may make it worse, in fact, because you have to know what prefixes you are going to declare in a given document to be able to rewrite your DTD to work with that document. There was a furor two or three months ago on this list about namespaces breaking validation. That furor died down when the namespace spec became an official recommendation (a done deal, in other words). Just so you know, though: The issue you raise is just the sort of thing that caused the furor. People were expecting namespaces to help in just your situation. When they found out that namespaces didn't help, many were disappointed, and said so. 
The most effective responses I saw were from people who said, in effect, "Namespaces do far less than you want or expect them to." The question is my mind is whether they actually get in the way. (You won't hear any gripes from me if my take on namespaces turns out to be dead wrong.) -- Richard Goerwitz PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0 For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Tue Mar 9 14:55:56 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:47 2004 Subject: Java Specification Request for XML In-Reply-To: <36E4FF9B.8CA35A47@eng.sun.com> References: Message-ID: <4.0.1.19990309092029.00f0f4b0@pop.hesketh.net> David Brownell wrote: >> >The Java Community Process is an open, inclusive process and we >> >look forward to the active particpation of all interested parties. If I just had to take _your_ word for it, David, I'd definitely believe it. Your continued participation on these lists and your contributions to projects like SAX and ModSAX clearly indicate that you, at least, have an open mind when it comes to open source/open process models. Unfortunately, when I visit Sun's site, and read the documentation surrounding the JCP, I'm decidedly unconvinced. Elliotte may have put Sun too deeply in the process in his description, but there's no getting around the pay to play principle that is deeply enshrined in this so-called open process. 
I'm glad to hear you say that it can be waived for the expert group, though it certainly wasn't clear from the Web site. (It looks like it can be waived for the first year only.) If Sun's approach involved only royalties-after-a-product-ships, I'd be a lot quieter. (I don't, after all, charge for the software I produce.) It's not, though. There are upfront fees ($5000 for non-educational entities, $2000 for non-profit or educational. (See http://developer.java.sun.com/developer/jcp/java_community_process.html for details. Most of the kickers are in the agreement, http://developer.java.sun.com/developer/jcp/JSPA.pdf) The JCP may feel like an 'open' process if you're a mammoth, or even if you're a reasonably well-off sabre-toothed tiger, but to us small mammals, it's the same old s***, different day, that we get from standards organizations. We get to run around among the mammoths and sabre-toothed tigers wearing funny lenses that blur our vision and working with tools that may not have been created with our needs in mind. The price of _joining_ the process (as a partner, where it appears you do have more influence) is even more irritating because Sun is, after all, a vendor. If I really wanted to give Sun Microsystems a sizable check, I'd expect at least a Sparc 5 with a huge monitor to show up in return. Giving Sun $5000 so this poor company can manage a not-so-open process ('Process Cost Sharing') is ridiculous. Given that $5000 pays all my expenses for a few months, the cost to small business and self-employed folks is outrageous. I'd love to participate in the process as a 'full' member, contributing time (which costs me something too), the standard currency for open source and open process participation, rather than a large sum of money that goes nowhere. I'll participate - as much as I'm allowed - but remember that the JCP is _far_ less open than the current ModSAX discussion, and I think the results of the JSR for XML are going to suffer as a result. 
Enough of the populist ranting. We now return to the extremely open
ModSAX discussion.

(p.s. It looks like David will be giving a presentation on this JSR at
XTech. I'll be there, I assume he'll be there, and anyone else who's
around and would like to take a close look at this thing should come
by at 2:45 on Wednesday. Oh, and did I mention the price of
conferences? Never mind, forget I said that.)

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From glv at vanderburg.org Tue Mar 9 15:32:39 1999
From: glv at vanderburg.org (Glenn Vanderburg)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org>
Message-ID: <36E53DA7.BF547D80@vanderburg.org>

John Cowan wrote:
>
> > public abstract void setFeature (String featureID, boolean state)
> >   throws SAXNotSupportedException;
>
> 2) This method is allowed to throw a SAXNewParserException, which
> encapsulates a replacement parser.

There are two problems with this.

First: let's not use exceptions to report non-error conditions. There
are theoretical and practical reasons to restrict the use of Java
exceptions to reporting errors. (On a related note, I would like to
propose an explicit "boolean featureSupported(String featureID)" query
method to make it possible to test for a feature without risking an
exception. If anyone would like details of why it's bad to have
exceptions as a part of normal control flow, let me know.)
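
To illustrate the proposed query method, here is a small Java sketch.
ToyFeatureParser and NotSupportedException are stand-ins invented for
this example (the latter plays the role of the SAXNotSupportedException
in the quoted signature); only the featureSupported(String) shape and
the two feature URIs come from the discussion itself.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for the SAXNotSupportedException named in the quoted signature.
class NotSupportedException extends Exception {
    NotSupportedException(String message) { super(message); }
}

// Hypothetical parser that knows a fixed set of feature IDs.
class ToyFeatureParser {
    private final Set<String> supported = new HashSet<>();
    private final Set<String> enabled = new HashSet<>();

    ToyFeatureParser(String... supportedFeatureIDs) {
        for (String id : supportedFeatureIDs) supported.add(id);
    }

    // The proposed query: answers yes or no and never throws.
    boolean featureSupported(String featureID) {
        return supported.contains(featureID);
    }

    // Setting an unknown feature remains an error and still throws.
    void setFeature(String featureID, boolean state) throws NotSupportedException {
        if (!featureSupported(featureID)) throw new NotSupportedException(featureID);
        if (state) enabled.add(featureID); else enabled.remove(featureID);
    }

    boolean isEnabled(String featureID) { return enabled.contains(featureID); }
}

class FeatureQueryDemo {
    // Probing first keeps the exception out of normal control flow.
    static boolean enableIfAvailable(ToyFeatureParser p, String id) {
        if (!p.featureSupported(id)) return false;
        try {
            p.setFeature(id, true);
            return true;
        } catch (NotSupportedException e) {
            // Not expected after a successful probe; treat as a genuine error.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ToyFeatureParser p = new ToyFeatureParser("http://xml.org/sax/features/namespaces");
        System.out.println(enableIfAvailable(p, "http://xml.org/sax/features/namespaces"));
        System.out.println(enableIfAvailable(p, "http://xml.org/sax/features/normalize-text"));
    }
}
```

With the query available, a caller that merely wants to know whether a
feature exists never has to catch anything; the exception is left for
the case where an unsupported feature is set anyway.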
Second: if an application needs to implement certain features by
pushing filters from the bottom, it can encapsulate the entire process
on its own, using a composite, and the process never needs to be
exposed through the ModSAX API.

(I'm new to this discussion, so forgive me --- but let me know --- if
I'm rehashing old debates.)

---glv

From glv at vanderburg.org Tue Mar 9 15:53:09 1999
From: glv at vanderburg.org (Glenn Vanderburg)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
References: <007801be6a29$d6efdd80$c9a8a8c0@thing2>
Message-ID: <36E540FA.61C3F574@vanderburg.org>

Bill la Forge wrote:
>
> From: John Wilson
> >I don't think that it's unreasonable to insist that objects
> >representing a Feature, Handler or Property should either implement
> >a distinct interface or subclass a distinct class. If this is so
> >the Parser can tell what Feature, Handler or Property is being set
> >by enquiring of the type of the object.
>
> Filters often implement more than one (generally all) handler
> interface and then register themselves with the underlying
> parser/filter for the same events requested by the overlaying
> application/filter.

Yes, and as written, John's proposal would require distinct handler
objects for each feature, which would be bad. However, with a slight
modification, it would work beautifully. Instead of using a string
as a feature ID, use a type descriptor (in Java, an instance of
java.lang.Class).
Feature handlers would be registered by supplying the Class object that represents the feature being implemented, along with a handler object that is assignable to that type.

It seems probable to me that, whatever naming scheme is chosen for features, each feature will have a special interface that handlers must implement; if that's true, and Strings are used to identify features, we will effectively have two names for each feature. And using classes shares one of the good aspects of the URI solution: it piggybacks on the DNS to provide a ready-made collision-free global namespace.

The only problem I see with this proposal is that it may not translate well to other languages. One possibility is for other languages to use the name of the corresponding Java interface as a feature name; for example, "org.xml.sax.NamespaceHandler". This may not be ideal, but does not seem too onerous.

---glv

From tug at wilson.co.uk Tue Mar 9 16:06:34 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <08e801be6a46$bb711d40$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: Glenn Vanderburg
To: Bill la Forge
Cc: John Wilson; XML Developers' List
Sent: 09 March 1999 15:40
Subject: Re: SAX: ModSAX addition, general property query

>Bill la Forge wrote:
>>
>> From: John Wilson
>> >I don't think that it's unreasonable to insist that objects
>> >representing a Feature, Handler or Property should either implement
>> >a distinct interface or subclass a distinct class.
>> >If this is so the Parser can tell what Feature, Handler or Property
>> >is being set by enquiring of the type of the object.
>>
>> Filters often implement more than one (generally all) handler
>> interface and then register themselves with the underlying
>> parser/filter for the same events requested by the overlaying
>> application/filter.
>
>Yes, and as written, John's proposal would require distinct handler
>objects for each feature, which would be bad. However, with a slight
>modification, it would work beautifully. Instead of using a string
>as a feature ID, use a type descriptor (in Java, an instance of
>java.lang.Class). Feature handlers would be registered by supplying
>the Class object that represents the feature being implemented, along
>with a handler object that is assignable to that type.

This seems to me to be an excellent suggestion;)

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From db at eng.sun.com Tue Mar 9 16:16:49 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:09:47 2004
Subject: ModSax Suggestion
References: <005001be6a23$7e574240$c9a8a8c0@thing2>
Message-ID: <36E547A3.41E513A@eng.sun.com>

> >> Looking forward to progress on the Java XML API. BTW, Dave,
> >> are you going to do a "Birds of a Feather" session on XML at this year's
> >> JavaOne? I think that could be valuable.
> >
> >I may be signed up for more than that this time...
> >
> >A BOF on XML -- an XML-DEV BOF!
> >-- would be lots of fun.
> >Some of the folk here have never met in person. I think
> >there will be lots of interesting applications to talk
> >about ... and probably some interesting frameworks.
>
> Simon and I proposed a Coins BOF some time back and
> JavaOne accepted it. Might be a good place to meet and
> discuss ModSAX, filters, and such.

I thought they were doing BOF scheduling on a more typical schedule -- e.g. hold off until a month or two before the conference. Evidently not!

I'll encourage someone else to do the legwork on setting up an XML, or XML-DEV, BOF ... I'll gladly show up! It's not looking like something I'll have time to arrange. I'm sure contact information is available via the java.sun.com website.

- Dave

From rbourret at ito.tu-darmstadt.de Tue Mar 9 17:03:20 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:09:47 2004
Subject: Namespaces and DTDs
Message-ID: <01BE6A56.FD03D0D0@grappa.ito.tu-darmstadt.de>

Richard L. Goerwitz wrote:
> Maybe I misunderstand, but as far as I can see, namespaces won't help
> you, either. Why? Because even if you can refer to, say, your two TITLE
> elements by different prefixes, you'll still have to declare the prefixed
> elements in the DTD as if they were atomic element names.
>
> Namespaces, in other words, don't solve your problem. They may make it
> worse, in fact, because you have to know what prefixes you are going to
> declare in a given document to be able to rewrite your DTD to work with
> that document.
>
> There was a furor two or three months ago on this list about namespaces
> breaking validation. That furor died down when the namespace spec became
> an official recommendation (a done deal, in other words).

You are correct. In today's environment (namespace-unaware parsers and no way to associate prefixes and URIs in the DTD), you must use the same prefixes in the DTD and the document for validation to work. I didn't state this because it was stated repeatedly during the aforementioned furor, which I sincerely hope this thread won't reignite.

-- Ron Bourret

From cowan at locke.ccil.org Tue Mar 9 17:11:41 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org> <36E53DA7.BF547D80@vanderburg.org>
Message-ID: <36E55607.332557F1@locke.ccil.org>

Glenn Vanderburg wrote:
> First: let's not use exceptions to report non-error conditions. There
> are theoretical and practical reasons to restrict the use of Java
> exceptions to reporting errors.

We should take this off-line. I'll simply say: exceptions are suitable for reporting exceptional conditions. Having an object request its own replacement is certainly exceptional.

> Second: if an application needs to implement certain features by
> pushing filters from the bottom,

The idea here is that an application may request a feature which a parser does not itself support, but can be adapted to support by pushing a filter between itself and the application.
That of course requires that the application now talk to the filter instead. (In principle, the parser could act as an adapter for the filter, but that would complicate the bejesus out of it.) In Smalltalk, the parser could swap object ids with the filter using the become: method, but AFAIK no other OO language supports that.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From James.Anderson at mecomnet.de Tue Mar 9 17:21:20 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:09:48 2004
Subject: Namespaces and DTDs
References: <01BE6A19.882B5360@grappa.ito.tu-darmstadt.de>
Message-ID: <36E55BFE.C5DB6816@mecomnet.de>

?

which of the "namespace aware" parsers will permit you to parse and validate a document for which portions of the dtd contain element declarations with ambiguous names - without first modifying the dtd? i've yet to hear a solution to the "ambiguous name" problem for xml-1.0/+ns conforming parsers.

Ronald Bourret wrote:
>
> Elliotte Rusty Harold wrote:
>
> > I have several DTDs with conflicting definitions of certain elements.
> > (e.g one defines a HEAD as a TITLE followed by a META and another
> > defines a HEAD as #PCDATA). I need to use all the DTDs and associated
> > markup languages for a single document.
> >
> > To an extent I can disambiguate them with namespaces. However, is there
> > any way I can do this while still validating against the original DTDs?
> > ...
>
> If you want to use a namespace-unaware parser, I don't see how you can
> avoid rewriting the DTDs.

...

From david at megginson.com Tue Mar 9 17:22:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <083401be6a19$94195f50$010a0a0a@home.wilson.co.uk>
References: <083401be6a19$94195f50$010a0a0a@home.wilson.co.uk>
Message-ID: <14053.22490.504236.874846@localhost.localdomain>

John Wilson writes:
> I don't think that it's unreasonable to insist that objects representing a
> Feature, Handler or Property should either implement a distinct interface or
> subclass a distinct class. If this is so the Parser can tell what Feature,
> Handler or Property is being set by enquiring of the type of the object. (I
> favour insisting that they subclass distinct classes because (in Java) that
> naturally imposes the restriction that a single object can only represent a
> single Property.)

We wouldn't want to have to rely on discovering the class at runtime, so we'd have to have a method in the interface that reports a string ID anyway -- a lot more work for the same result.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From tug at wilson.co.uk Tue Mar 9 18:41:03 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <098d01be6a5c$57d56190$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: David Megginson
To: XML Developers' List
Sent: 08 March 1999 22:30
Subject: Re: SAX: ModSAX addition, general property query

>Tom Harding writes:
> > David Megginson wrote:
> >
> > > As I wrote before, it doesn't much matter whether we use Java property
> > > names incorporating domain names (like
> > > 'org.xml.sax.features.validation') or URIs (like
> > > 'http://xml.org/sax/features/validation'), as long as we have the
> > > ability for people to create new names without fear of collision.
> >
> > I would also urge against using an http: URI since it is not meant
> > that a resource actually be retrieved using the http protocol.
>
>I've been thinking about this issue, and I'm fairly convinced that the
>URI is the right choice.

I really have a problem with using URIs for this. RFC2396 (http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt) section 6 talks about URI normalisation and equivalence. It says that URI equivalence is defined on a scheme basis. You have chosen the http scheme, so we are presumably required to apply the http definition of URI equivalence. This does not seem to me to be a desirable criterion for equivalence.
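
To make the equivalence concern concrete, here is a minimal Java sketch; it is not taken from the thread or from any ModSAX draft. The class and method names are hypothetical, and java.net.URI is a JDK class that postdates this discussion. It only illustrates that a feature ID differing in the case of its scheme and host fails a plain string comparison while still comparing equal as a URI.

```java
import java.net.URI;

public class FeatureIdEquivalence {
    /** Compare two feature IDs the way a string-keyed lookup would. */
    public static boolean sameAsString(String a, String b) {
        return a.equals(b);
    }

    /**
     * Compare two feature IDs as URIs. java.net.URI treats the scheme
     * and host as case-insensitive, which is one part of http equivalence.
     */
    public static boolean sameAsUri(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        String id = "http://xml.org/sax/features/validation";
        // Assumed variant: same ID with scheme and host in upper case.
        String variant = "HTTP://XML.ORG/sax/features/validation";

        System.out.println("as strings: " + sameAsString(id, variant));
        System.out.println("as URIs:    " + sameAsUri(id, variant));
    }
}
```

If feature IDs are matched as opaque strings, as the setFeature signature suggests, the two spellings name different features; if they are matched by URI equivalence, they name the same one. Which rule applies is exactly the question raised above.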
John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From tug at wilson.co.uk Tue Mar 9 18:41:04 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <098101be6a58$bc3b45e0$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: David Megginson
To: John Wilson
Cc: XML Developers' List
Sent: 09 March 1999 17:19
Subject: Re: SAX: ModSAX addition, general property query

>John Wilson writes:
>
> > I don't think that it's unreasonable to insist that objects representing a
> > Feature, Handler or Property should either implement a distinct interface or
> > subclass a distinct class. If this is so the Parser can tell what Feature,
> > Handler or Property is being set by enquiring of the type of the object. (I
> > favour insisting that they subclass distinct classes because (in Java) that
> > naturally imposes the restriction that a single object can only represent a
> > single Property.)
>
>We wouldn't want to have to rely on discovering the class at runtime,
>so we'd have to have a method in the interface that reports a string
>ID anyway -- a lot more work for the same result.

Testing the type at run time is a trivial operation in Java, so I'm not sure why you say that we wouldn't want to rely on discovering the class at run time.
If there was some worry about the performance hit on iterating through all the supported interfaces (which I strongly doubt) the interface would have a method that reported a Class rather than a String. However, Glenn Vanderburg has suggested an amendment to my idea which seems to me to address your concerns.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From jes at kuantech.com Tue Mar 9 19:36:41 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:09:48 2004
Subject: RDF, ID's, XPtrs, and object orientation
Message-ID: <000001be6a63$d7c87de0$5118a8c0@kuantech1.quokka.com>

I am struggling with the following limitation caused by RDF's use of ID attributes:

I want to use RDF in a truly object-oriented fashion. It lets me get really close but not quite there. I would like to use the "subPropertyOf" element to indicate overriding. However, since property names are ID's, I can't override by name. I could use XPointer to refer to overridden names (in effect referring to "the property whose name is foo and whose class is bar"), but I can't actually define the bar version of foo and the baz version of foo in the same document. Of course, if I could specify a key composed of multiple attributes, my problems would be solved.

I realize I can also avoid the problem by putting each "class" in a separate document, but this causes problems of its own in my particular application.
If anyone has a hint as to how to get around this issue, that would be great, otherwise it's just food for thought.

Jeff

P.S. I am finding the problem of ID conflicts between "fragments" that need to be created separately and then combined into a single document to be a general one. My approach has been not to use ID attributes, but I don't have a choice if I'm using RDF. I suppose it will work as long as I don't validate, but I really want to validate.

-----------------------------------------------------------------
Kuantech, Inc. http://www.kuantech.com
Jeffrey E. Sussna, Principal jes@kuantech.com
Distributed Content Architectures for Dynamic Online Applications
-----------------------------------------------------------------

From glv at vanderburg.org Tue Mar 9 19:49:19 1999
From: glv at vanderburg.org (Glenn Vanderburg)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org> <36E53DA7.BF547D80@vanderburg.org> <36E55607.332557F1@locke.ccil.org>
Message-ID: <36E57A78.87779A2C@vanderburg.org>

> We should take this off-line. I'll simply say: exceptions are
> suitable for reporting exceptional conditions. Having an object
> request its own replacement is certainly exceptional.

Well, yes and no. But I'd prefer to go the cleaner route of not allowing the object to request its own replacement.
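
As an illustration of that cleaner route, here is a minimal Java sketch of a composite that hides a filter behind a stable reference. It is not the SAX or ModSAX API: TextHandler, MiniParser, the "merge-text" feature ID and all class names are hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CompositeParserSketch {
    /** Hypothetical stand-in for a character-data callback. */
    public interface TextHandler { void characters(String text); }

    /** Hypothetical stand-in for a parser that reports text in chunks. */
    public interface MiniParser {
        void setHandler(TextHandler handler);
        void parse(List<String> chunks);
    }

    /** Bare parser with no features: one event per chunk. */
    public static class RawParser implements MiniParser {
        private TextHandler handler;
        public void setHandler(TextHandler handler) { this.handler = handler; }
        public void parse(List<String> chunks) {
            for (String chunk : chunks) handler.characters(chunk);
        }
    }

    /** Filter that buffers chunks and reports them as a single event. */
    public static class MergingFilter implements TextHandler {
        private final StringBuilder buffer = new StringBuilder();
        private final TextHandler next;
        public MergingFilter(TextHandler next) { this.next = next; }
        public void characters(String text) { buffer.append(text); }
        public void flush() {
            if (buffer.length() > 0) {
                next.characters(buffer.toString());
                buffer.setLength(0);
            }
        }
    }

    /** Composite: the application holds only this object; the filter stays internal. */
    public static class CompositeParser implements MiniParser {
        private final MiniParser inner = new RawParser();
        private TextHandler appHandler;
        private boolean mergeText;

        public void setFeature(String featureID, boolean state) {
            if (!"merge-text".equals(featureID)) {
                throw new IllegalArgumentException("unsupported: " + featureID);
            }
            mergeText = state;
        }
        public void setHandler(TextHandler handler) { appHandler = handler; }
        public void parse(List<String> chunks) {
            if (mergeText) {
                // Push the filter between the inner parser and the application.
                MergingFilter filter = new MergingFilter(appHandler);
                inner.setHandler(filter);
                inner.parse(chunks);
                filter.flush();
            } else {
                inner.setHandler(appHandler);
                inner.parse(chunks);
            }
        }
    }

    public static void main(String[] args) {
        CompositeParser parser = new CompositeParser();
        List<String> seen = new ArrayList<>();
        parser.setHandler(seen::add);
        parser.setFeature("merge-text", true);
        parser.parse(Arrays.asList("Hello, ", "world"));
        System.out.println(seen.size() + " event(s) delivered");
    }
}
```

The application never learns that a filter was pushed: it keeps the same parser reference before and after setFeature, so no exception carrying a replacement parser is needed.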
> The idea here is that an application may request a feature which
> a parser does not itself support, but can be adapted to support
> by pushing a filter between itself and the application.

Yes, I understand.

> That of course requires that the application now talk to the filter
> instead. (In principle, the parser could act as an adapter for
> the filter, but that would complicate the bejesus out of it.)

It's not complicated at all --- merely a little tedious. It would be easy to provide a class in the helpers package that would make it almost trivial.

My primary objection to the idea is precisely what you mentioned above: that it is an extremely unusual thing to happen. Programmers will be surprised by this behavior. Coupled with the fact that it's very easy to make it all transparent, I think exposing the parser's internal tricks is a bad idea.

---glv

From david at megginson.com Tue Mar 9 19:52:14 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <36E540FA.61C3F574@vanderburg.org>
References: <007801be6a29$d6efdd80$c9a8a8c0@thing2> <36E540FA.61C3F574@vanderburg.org>
Message-ID: <14053.31455.289569.926503@localhost.localdomain>

Glenn Vanderburg writes:
> It seems probable to me that, whatever naming scheme is chosen for
> features, each feature will have a special interface that handlers
> must implement

This is not the case.
Some features will require special handlers, some will allow special handlers, and some will simply change the way existing handlers are used. For example, if you enable validation, you request that the parser report additional error states to the existing ErrorHandler; if you enable text-normalisation, you simply ask the parser to guarantee that there will never be two DocumentHandler.characters events in a row.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Tue Mar 9 19:53:20 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E53DA7.BF547D80@vanderburg.org>
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org> <36E53DA7.BF547D80@vanderburg.org>
Message-ID: <14053.31616.491923.652158@localhost.localdomain>

Glenn Vanderburg writes:
> Second: if an application needs to implement certain features by
> pushing filters from the bottom, it can encapsulate the entire process
> on its own, using a composite, and the process never needs to be
> exposed through the ModSAX API.

This is actually a good point. Since the SAX driver is usually a separate class rather than the parser itself, it would not be difficult for it to encapsulate any needed filters.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Tue Mar 9 20:18:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <098101be6a58$bc3b45e0$010a0a0a@home.wilson.co.uk>
References: <098101be6a58$bc3b45e0$010a0a0a@home.wilson.co.uk>
Message-ID: <14053.33072.300457.335320@localhost.localdomain>

John Wilson writes:
> Testing the type at run time is a trivial operation in Java

... but not in other programming languages.

> so I'm not sure why you say that we wouldn't want to rely on
> discovering the class at run time.

In the end, you're doing the equivalent of testing for a string anyway -- you're just letting the Java class name serve as the unique ID. I don't see the advantage of forcing the users to get the unique ID through a circuitous route.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From tug at wilson.co.uk Tue Mar 9 21:24:42 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <09b201be6a73$243c7dc0$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: David Megginson
To: XML Developers' List
Sent: 09 March 1999 20:16
Subject: Re: SAX: ModSAX addition, general property query

>John Wilson writes:
>
> > Testing the type at run time is a trivial operation in Java
>
>... but not in other programming languages.
>
> > so I'm not sure why you say that we wouldn't want to rely on
> > discovering the class at run time.
>
>In the end, you're doing the equivalent of testing for a string anyway
>-- you're just letting the Java class name serve as the unique ID. I
>don't see the advantage of forcing the users to get the unique ID
>through a circuitous route.

You are testing for a value. Testing for a String, a Class or an int are, at that level, equivalent. The issue is: how do you choose the value? It so happens that Java provides a natural way for us to create a unique value. Other languages provide other ways of creating the unique value.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From nikita.ogievetsky at csfb.com Tue Mar 9 21:56:33 1999
From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita)
Date: Mon Jun 7 17:09:48 2004
Subject: Namespaces and DTDs
Message-ID: <9C998CDFE027D211B61300A0C9CF9AB4424719@SNYC11309>

Richard L. Goerwitz wrote:
>Ronald Bourret wrote:
>> > I have several DTDs with conflicting definitions of certain elements.
>> > ...Am I going to have to break down and just rewrite the DTDs to use
>> > the qualified names?
>>
>> If you want to use a namespace-unaware parser, I don't see how you can
>> avoid rewriting the DTDs.

>Maybe I misunderstand, but as far as I can see, namespaces won't help
>you, either. Why? Because even if you can refer to, say, your two TITLE
>elements by different prefixes, you'll still have to declare the prefixed
>elements in the DTD as if they were atomic element names.

>Namespaces, in other words, don't solve your problem. They may make it
>worse, in fact, because you have to know what prefixes you are going to
>declare in a given document to be able to rewrite your DTD to work with
>that document.

I have a similar problem: On my web site http://www.cogx.com, I am working on an XML driven menu bar (can be a tree, etc). The underlying XML uses reusable structures such as months, quarters of the year, Tax schedules with zillions of tax lines repeated, etc. Instead of having just one XML document for the menu bar, I moved reusable fragments into a separate file and access them from my main XML by or it is also obvious that I should not keep all fragments in one reusable collection, but rather separate them by theme.
- Why should I send file with tax schedules to a guy interested in Opera performances? So I can have as many reusable collections as I wish: tax related, publications related, theater related, etc... It means I should allow freedom in specifying namespace prefixes and still know what each prefix means! I am achieving this by declaring my namespaces as follows: xmlns:ref="groups:www.cogx.com/xmlbar/ref-menu.xml" the prefix "groups:" tells me that a namespace of reusable fragments was defined Now I can give my prefix any name. When parsing I know that it is a namespace of reusable fragments! Problem here is that element has to be defined with an open model to allow for different namespace prefixes. I also made a proposal that it would be great to reserve "any" prefix for this type of situation. This will save me from using open model, which I do not like, really! > The most effective responses I saw were >from people who said, in effect, "Namespaces do far less than you want >or expect them to." Exactly! And this is why Namespaces let you do much more then you thought you can! Best regards, Nikita O. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Marc.McDonald at Design-Intelligence.com Tue Mar 9 22:49:16 1999 From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com) Date: Mon Jun 7 17:09:48 2004 Subject: Namespaces and DTDs Message-ID: A simple extension to namespaces could have fixed this problem: 1. Allow a DTD to be optionally specified along with the namespace prefix and URI 2. 
When an element is prefixed, parse it using the DTD associated with the namespace and the given prefix as the default. 3. If no DTD is associated with the prefix or not validating, do what is done now (ensure element is well-formed). Your DTDs would not need to be changed, you would just have to indicate which HEAD (for example) is desired in the content and add associated DTD urls to the namespace declarations. Marc B McDonald Principal Software Scientist Design Intelligence, Inc www.design-intelligence.com ---------- From: Ronald Bourret [SMTP:rbourret@ito.tu-darmstadt.de] Sent: Tuesday, March 09, 1999 9:02 AM To: xml-dev@ic.ac.uk Subject: RE: Namespaces and DTDs Richard L. Goerwitz wrote: > Maybe I misunderstand, but as far as I can see, namespaces won't help > you, either. Why? Because even if you can refer to, say, your two TITLE > elements by different prefixes, you'll still have to declare the prefixed > elements in the DTD as if they were atomic element names. > > Namespaces, in other words, don't solve your problem. They may make it > worse, in fact, because you have to know what prefixes you are going to > declare in a given document to be able to rewrite your DTD to work with > that document. > > There was a furor two or three months ago on this list about namespaces > breaking validation. That furor died down when the namespace spec became > an official recommendation (a done deal, in other words). You are correct. In today's environment (namespace-unaware parsers and no way to associate prefixes and URIs in the DTD), you must use the same prefixes in the DTD and the document for validation to work. I didn't state this because it was stated repeatedly during the aforementioned furor, which I sincerely hope this thread won't reignite. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Mar 9 23:06:12 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:48 2004 Subject: RDF, ID's, XPtrs, and object orientation References: <000001be6a63$d7c87de0$5118a8c0@kuantech1.quokka.com> Message-ID: <36E5A929.394204F0@locke.ccil.org> Jeffrey E. Sussna wrote: > My approach has been not to use ID attributes, but I don't have a > choice if I'm using RDF. I suppose it will work as long as I don't > validate, but I really want to validate. Actually, the values of ID and bagID attributes have to be unique within the document, but nothing says that either has to be an XML "ID attribute". (That was so in earlier drafts, but not now.) -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Mar 10 01:12:14 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:48 2004 Subject: ModSAX: Proposed Core Handlers Message-ID: <14053.50619.147376.869177@localhost.localdomain> My current proposal for the ModParser interface includes the following method (ModHandler is an empty interface): public abstract void setHandler (String handlerID, ModHandler handler) throws SAXNotSupportedException; I propose the following core handlers, with the understanding that SAX parsers are not required to support any of them (they are free to throw a SAXNotSupportedException): ModSAX Core Handlers -------------------- (All handler IDs correspond to a specific interface.) http://xml.org/sax/handlers/lexical Receive callbacks for comments, CDATA sections, and (possibly) entity references. http://xml.org/sax/handlers/dtd-decl Receive callbacks for element, attribute, and (possibly) parsed entity declarations. http://xml.org/sax/handlers/namespace Receive callbacks for the start and end of the scope of each namespace declaration. I'm not certain, but it might make sense to replace the third one with a read-only parse-time property. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Mar 10 01:17:07 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:49 2004 Subject: ModSAX: Proposed Core Properties Message-ID: <14053.50863.546824.628181@localhost.localdomain> My current proposal for the ModParser interface includes the following methods:
  public abstract void set (String propID, Object value) throws SAXNotSupportedException;
  public abstract Object get (String propID) throws SAXNotSupportedException;
Properties may be read-write, read-only, or write-only; they may also be parse-time (may be changed during parsing) or non-parse-time (may be changed only before a parse or between parses). ModSAX Core Properties ---------------------- (All properties are associated with a single value type.) http://xml.org/sax/properties/namespace-sep (write-only) Set the separator to be used between the URI part of a name and the local part of a name when namespace processing is being performed (see the http://xml.org/sax/features/namespaces feature). By default, the separator is a single space. This property may not be set while a parse is in progress (throws a SAXNotSupportedException). http://xml.org/sax/properties/dom-node (read-only) Get the DOM node currently being visited, if the SAX parser is iterating over a DOM tree. If the parser recognises and supports this property but is not currently visiting a DOM node, it should return null (this is a good way to check for availability before the parse begins). http://xml.org/sax/properties/xml-string (read-only) Get the literal string of characters associated with the current event.
If the parser recognises and supports this property but is not currently parsing text, it should return null (this is a good way to check for availability before the parse begins). I stole this idea from Expat. Remember that no SAX parser will be required to support any of these -- it simply has to throw a SAXNotSupportedException if it doesn't know about the property. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Mar 10 01:17:51 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:49 2004 Subject: ModSAX: Proposed Core Features Message-ID: <14053.51113.676945.877507@localhost.localdomain> Here's my revised version of the core feature list, based on recent discussions: ModSAX Core Features -------------------- http://xml.org/sax/features/validation Validate (true) or don't validate (false). http://xml.org/sax/features/external-general-entities Expand external general entities (true) or don't expand (false). http://xml.org/sax/features/external-parameter-entities Expand external parameter entities (true) or don't expand (false). http://xml.org/sax/features/namespaces Preprocess namespaces (true) or don't preprocess (false). See also the http://xml.org/sax/properties/namespace-sep property. http://xml.org/sax/features/normalize-text Ensure that all consecutive text is returned in a single callback to DocumentHandler.characters or DocumentHandler.ignorableWhitespace (true) or explicitly do not require it (false). 
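A minimal sketch of how client code might probe the features listed above, given that a parser is free to refuse any of them. FeatureParser, NotSupportedException and ToyParser are hypothetical stand-ins for the proposed ModParser and SAXNotSupportedException, cut down so the example compiles on its own; only the feature URIs come from the proposal.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the SAXNotSupportedException named in the proposal. */
class NotSupportedException extends Exception {
    NotSupportedException(String message) { super(message); }
}

/** Hypothetical cut-down view of the proposed ModParser: feature switching only. */
interface FeatureParser {
    void setFeature(String featureID, boolean state) throws NotSupportedException;
}

/** Toy parser that recognises a fixed set of feature IDs and refuses all others. */
class ToyParser implements FeatureParser {
    private final Set<String> known = new HashSet<>();
    private final Set<String> enabled = new HashSet<>();

    ToyParser(String... knownFeatures) {
        for (String f : knownFeatures) {
            known.add(f);
        }
    }

    @Override
    public void setFeature(String featureID, boolean state) throws NotSupportedException {
        if (!known.contains(featureID)) {
            throw new NotSupportedException(featureID);
        }
        if (state) {
            enabled.add(featureID);
        } else {
            enabled.remove(featureID);
        }
    }

    boolean isEnabled(String featureID) {
        return enabled.contains(featureID);
    }
}

class FeatureProbe {
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    /** Returns true if the parser accepted the request, false if it refused it. */
    static boolean trySet(FeatureParser parser, String featureID, boolean state) {
        try {
            parser.setFeature(featureID, state);
            return true;
        } catch (NotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ToyParser parser = new ToyParser(NAMESPACES);
        System.out.println("namespaces accepted: " + trySet(parser, NAMESPACES, true));
        System.out.println("validation accepted: " + trySet(parser, VALIDATION, true));
    }
}
```

Wrapping the call in a helper such as trySet keeps the "free to refuse" rule out of the application logic: the caller asks once and falls back if the answer is false.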
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Mar 10 01:21:08 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:49 2004 Subject: ModSAX: Proposed ModParser Interface Message-ID: <14053.51158.347156.718466@localhost.localdomain> Here's my current proposed interface for ModParser: public interface ModParser extends Parser { public abstract void setFeature (String featureID, boolean state) throws SAXNotSupportedException; public abstract void setHandler (String handlerID, ModHandler handler) throws SAXNotSupportedException; public abstract void set (String propID, Object value) throws SAXNotSupportedException; public abstract Object get (String propID) throws SAXNotSupportedException; } All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrew at squiz.co.nz Wed Mar 10 01:24:06 1999 From: andrew at squiz.co.nz (Andrew McNaughton) Date: Mon Jun 7 17:09:49 2004 Subject: Namespaces and DTDs In-Reply-To: Your message of "Tue, 09 Mar 1999 14:48:18 -0800." 
Message-ID: <199903100123.OAA10814@aniwa.sky> How about having the ability to say 'process the children of this element using that dtd'. Attach DTD declarations to elements, not just to documents. It feels like some way is needed to make interpretation of XML subtrees dependent on context, hence not requiring the rewriting of XML imported into a document as a subtree from the context of a different document. (Perhaps I'm being naive. I'm new to this.) Andrew McNaughton > A simple extension to namespaces could have fixed this problem: > 1. Allow a DTD to be optionally specified along with the namespace > prefix and URI > 2. When an element is prefixed, parse it using the DTD associated with > the namespace and the given prefix as the default. > 3. If no DTD is associated with the prefix or not validating, do what > is done now (ensure element is well-formed). > > Your DTDs would not need to be changed, you would just have to > indicate which HEAD (for example) is desired in the content and add > associated DTD urls to the namespace declarations. > > Marc B McDonald > Principal Software Scientist > Design Intelligence, Inc > www.design-intelligence.com > > > ---------- > From: Ronald Bourret [SMTP:rbourret@ito.tu-darmstadt.de] > Sent: Tuesday, March 09, 1999 9:02 AM > To: xml-dev@ic.ac.uk > Subject: RE: Namespaces and DTDs > > Richard L. Goerwitz wrote: > > > Maybe I misunderstand, but as far as I can see, namespaces won't > help > > you, either. Why? Because even if you can refer to, say, your two > TITLE > > elements by different prefixes, you'll still have to declare the > prefixed > > elements in the DTD as if they were atomic element names. > > > > Namespaces, in other words, don't solve your problem. They may make > it > > worse, in fact, because you have to know what prefixes you are going > to > > declare in a given document to be able to rewrite your DTD to work > with > > that document. 
> > > > There was a furor two or three months ago on this list about > namespaces > > breaking validation. That furor died down when the namespace spec > became > > an official recommendation (a done deal, in other words). > > You are correct. In today's environment (namespace-unaware parsers > and no > way to associate prefixes and URIs in the DTD), you must use the same > prefixes in the DTD and the document for validation to work. I didn't > state this because it was stated repeatedly during the aforementioned > furor, which I sincerely hope this thread won't reignite. > > -- Ron Bourret
-- ----------- Andrew McNaughton andrew@squiz.co.nz http://www.newsroom.co.nz/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Wed Mar 10 01:30:01 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:09:49 2004 Subject: Architectural Forms Questions References: <36E3F30C.F6D6DB51@mitre.org> Message-ID: <36E5BA4C.7916D8@prescod.net> "Roger L. Costello" wrote: > > - How powerful is the correspondence that you can express with > Architectural Forms? Is it essentially limited to renaming and > omission? You can also map elements to attributes and attributes to elements. > - In addition to using Architectural Forms to express correspondences > that are known a priori, could you use them to document mappings that > are discovered "on-the-fly" by modifying a document or DTD after a > mapping is discovered? Yes, you can do this by modifying DTDs. Caveat: In my experience it is seldom the case that a subtype relationship can be "discovered" after the fact. It works for really loose DTDs like HTML and ICADD, but not for more complex/strict DTDs. This is very similar to the situation in software development. It is very rarely the case that you can "adapt" an existing class to a newly discovered supertype without radically changing the class or breaking existing code. > - It appears to be the case that the correspondence between A and B must > be documented in a way that keeps the mapping tightly coupled to either > A or B. Are there any plans to represent the correspondence so that it > is not tightly coupled to either A or B? You could think of this as the distinction between subtyping and transformation. Subtyping is about an inherent relationship that is discovered in advance. 
Transformation is about imposing a mapping externally, "on the fly." > - Is it a correct interpretation to say that Architectural Forms > represent correspondence by overloading existing language constructs? "Overloading" is a somewhat overloaded term. Let's say "reusing" existing language constructs. > - Given that subtyping and inheritance have been part of the primary XML > "schema" proposals, is it likely that XML Architectural Forms will be > overtaken by advances in the XML schema area? Eventually. In what time frame, I don't know. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News
From Marc.McDonald at Design-Intelligence.com Wed Mar 10 01:31:13 1999 From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com) Date: Mon Jun 7 17:09:49 2004 Subject: Namespaces and DTDs Message-ID: Exactly. By using <a:HEAD> you would be saying: process the HEAD element according to the DTD associated with the namespace prefix 'a' and consider 'a' to be the default namespace for the DTD. If there is no associated DTD, the parser can only check that HEAD is well-formed.
Marc B McDonald Principal Software Scientist Design Intelligence, Inc www.design-intelligence.com ---------- From: Andrew McNaughton [SMTP:andrew@squiz.co.nz] Sent: Wednesday, March 10, 1999 6:23 AM To: Marc McDonald Cc: xml-dev@ic.ac.uk; rbourret@ito.tu-darmstadt.de Subject: Re: Namespaces and DTDs How about having the ability to say 'process the children of this element using that dtd'. Attach DTD declarations to elements, not just to documents. It feels like some way is needed to make interpretation of XML subtrees dependent on context, hence not requiring the rewriting of XML imported into a document as a subtree from the context of a different document. (Perhaps I'm being naive. I'm new to this.) Andrew McNaughton > A simple extension to namespaces could have fixed this problem: > 1. Allow a DTD to be optionally specified along with the namespace > prefix and URI > 2. When an element is prefixed, parse it using the DTD associated with > the namespace and the given prefix as the default. > 3. If no DTD is associated with the prefix or not validating, do what > is done now (ensure element is well-formed). > > Your DTDs would not need to be changed, you would just have to > indicate which HEAD (for example) is desired in the content and add > associated DTD urls to the namespace declarations. > > Marc B McDonald > Principal Software Scientist > Design Intelligence, Inc > www.design-intelligence.com > > > ---------- > From: Ronald Bourret [SMTP:rbourret@ito.tu-darmstadt.de] > Sent: Tuesday, March 09, 1999 9:02 AM > To: xml-dev@ic.ac.uk > Subject: RE: Namespaces and DTDs > > Richard L. Goerwitz wrote: > > > Maybe I misunderstand, but as far as I can see, namespaces won't > help > > you, either. Why? Because even if you can refer to, say, your two > TITLE > > elements by different prefixes, you'll still have to declare the > prefixed > > elements in the DTD as if they were atomic element names. > > > > Namespaces, in other words, don't solve your problem. 
They may make > it > > worse, in fact, because you have to know what prefixes you are going > to > > declare in a given document to be able to rewrite your DTD to work > with > > that document. > > > > There was a furor two or three months ago on this list about > namespaces > > breaking validation. That furor died down when the namespace spec > became > > an official recommendation (a done deal, in other words). > > You are correct. In today's environment (namespace-unaware parsers > and no > way to associate prefixes and URIs in the DTD), you must use the same > prefixes in the DTD and the document for validation to work. I didn't > state this because it was stated repeatedly during the aforementioned > furor, which I sincerely hope this thread won't reignite. > > -- Ron Bourret
-- ----------- Andrew McNaughton andrew@squiz.co.nz http://www.newsroom.co.nz/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Wed Mar 10 02:40:06 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:49 2004 Subject: ModSAX: Proposed Core Properties Message-ID: Hi David, In a message dated 3/9/99 8:30:31 PM Eastern Standard Time, david@megginson.com writes: > http://xml.org/sax/properties/dom-node (read-only) > Get the DOM node currently being visited, if the SAX parser is > iterating over a DOM tree. If the parser recognises and supports > this property but is not currently visiting a DOM node, it should > return null (this is a good way to check for availability before the > parse begins). > This has made me realize that I was under a misconception about what the generic get() and set() parser properties would provide in terms of functionality. What I was really hoping for was:

  org.w3c.dom.Document parse(InputSource is, boolean events) throws SAXException;
  org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws SAXException;
  /* the events boolean would be to turn on/off event calls. */

Which would allow me to code:

  try {
      ModParser mp = ParserFactory.makeModParser();
      boolean supported = true;
      try {
          mp.setFeature("http://xml.org/sax/features/dom-result", true);
      } catch (SAXNotSupportedException snse) {
          supported = false;
      }
      if (supported) {
          Document d = mp.parse("test.xml", false);
          // ... process Document
      }
  } catch (SAXException se) {
      // handle it
  }

So, what I'm saying is that I would like to be able to choose whether to interface to the Parser via events or via a DOM.
If you agree with this, I believe using the return type is more appropriate than getting a resultant property (as I suggest next). If for some reason the above is not palatable, the same could be accomplished under the current scheme if we added a property: http://xml.org/sax/properties/dom-document (read-only) Then I could code:

  try {
      ModParser mp = ParserFactory.makeModParser();
      boolean supported = true;
      try {
          mp.setFeature("http://xml.org/sax/features/dom-capable", true);
      } catch (SAXNotSupportedException snse) {
          supported = false;
      }
      if (supported) {
          mp.parse("test.xml");
          Document d = (Document) mp.get("http://xml.org/sax/properties/dom-document");
          // ... process Document
      }
  } catch (SAXException se) {
      // handle it
  }

Note: both code examples also require an added feature to check for the desired functionality. I believe the above is sorely missing from the current API. Does anyone else see a need for this? If not, why not? But before you say, "build a layer on top of SAX" -- to me that seems ridiculous when most of the Parser implementations can produce a DOM Document. Best wishes, - Mike (mdaconta@aol.com)
From b.laforge at jxml.com Wed Mar 10 04:23:41 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:49 2004 Subject: ModSAX: Proposed Core Properties Message-ID: <013601be6aac$f96ad580$c9a8a8c0@thing2> From: MikeDacon@aol.com >This has made me realize that I was under a misconception about >what the generic get() and set() parser properties would provide in >terms of functionality.
What I was really hoping for was: > >org.w3c.dom.Document parse(InputSource is, boolean events) throws >SAXException; >org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws >SAXException; >/* the events boolean would be to turn on/off event calls. */ I think you have this capability without the extra parameter, since you don't get events unless you register a handler to receive them. Bill
From GAjitK at dbss.com Wed Mar 10 08:38:28 1999 From: GAjitK at dbss.com (George, Ajit Kumar (CTS)) Date: Mon Jun 7 17:09:49 2004 Subject: No subject Message-ID: <0B9BF5AE8A3ED21196980060B0B54551870EFF@CTSINENTSXUA> Hi, I am new to XML and Java. I am trying to display an XML document in a tree structure using XML parser classes from IBM xml4j 2.0.0. I am able to get to the elements, but how do I get the text content out of the element? I do have a NodeList and I am able to iterate through it, but I am not able to figure out a way to get the content information out of it. I would appreciate any help with this. I will not be using Microsoft parser classes. regards Ajit GAjitK@dbss.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From leich at wiwi.uni-marburg.de Wed Mar 10 09:04:02 1999 From: leich at wiwi.uni-marburg.de (Steffen Leich) Date: Mon Jun 7 17:09:49 2004 Subject: your mail In-Reply-To: <0B9BF5AE8A3ED21196980060B0B54551870EFF@CTSINENTSXUA> Message-ID: On Wed, 10 Mar 1999, George, Ajit Kumar (CTS) wrote: > Hi, > > I am new to the XML and Java. I am trying to display a XML document in a > tree structure using > XML parser classes from IBM xml4j 2.0.0. I am able to get to the elements, > but how do I > get the text content out of the element > > So I do have a NodeList and I am able to iterate through it, but I am not > able to figure out a > way to get the content information out of it. > > I could appreciate any help in this. I will not be using Microsoft parser > classes. > Hi, check out the following URLs: http://www.software.ibm.com/xml/education/buildappl/xml_to_html.html http://www.alphaworks.ibm.com/forum/xmlforjava.nsf/discussion_vert (Discussion of and Links to Tutorials) http://developerlife.com/xmljavatutorial1 Steffen ___________________________________________________ Steffen Leich Phone: +49-6421-283144 leich@wiwi.uni-marburg.de Universitaet Marburg Informations- und Kommunikationsdienste FB 02 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Wed Mar 10 09:16:04 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:49 2004 Subject: Namespaces and DTDs Message-ID: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> james anderson wrote: > ? which of the "namespace aware" parsers will permit you to parse validate a > document for which partions of the dtd contain element declarations with > ambiguous names - without first modifying the dtd? i've yet to hear a solution > to the "ambiguous name" problem for xml-1.0/+ns conforming parsers. Good point -- it was unfair of me to blame the parsers here. It all seems rather obvious now: Q. Why were namespaces invented? A. To disambiguate duplicate names. Q. I have a DTD with duplicate names. How do I disambiguate them? A. Use namespaces. The only inobvious bit is that, because there is no way to declare namespaces in the DTD, you can't declare different default namespaces for different parts of the DTD, which would have solved Elliotte's problem rather neatly. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Wed Mar 10 09:19:07 1999 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 17:09:49 2004 Subject: Architectural Forms Questions In-Reply-To: <36E5BA4C.7916D8@prescod.net> References: <36E3F30C.F6D6DB51@mitre.org> Message-ID: <3.0.6.32.19990310090702.0097ce90@gpo.iol.ie> >"Roger L. Costello" wrote: > - Given that subtyping and inheritance have been part of the primary XML > "schema" proposals, is it likely that XML Architectural Forms will be > overtaken by advances in the XML schema area? > I believe and hope this is true. The mapping that AFs enable is too limiting in my experience. Case in point: at XML 98 in Chicago the GCA issued a DTD for paper submissions. I wrote a paper for that conference using XML. Along comes XML Europe 99 with a variation on the DTD for paper submissions. Even this mapping between two DTDs from the same broad organization in the same ballpark of document types cannot be done with AFs. At least not with my cerebral cortex.
From l-arcini at uniandes.edu.co Wed Mar 10 10:02:51 1999 From: l-arcini at uniandes.edu.co (Fabio Arciniegas A.) Date: Mon Jun 7 17:09:50 2004 Subject: Req:Music DTD(?)
Message-ID: <36E644E5.6E728D40@uniandes.edu.co> Hello to all, I'm currently working on an XML-based sequencer, and I would like to see some music notation DTDs before I start to write my own. I've searched the web high and low... no luck so far, so I was wondering if any of you have any pointers I could use. Thanks in advance Fabio -- Fabio Arciniegas A. Ingenieria de Sistemas Uniandes
From reschke at medicaldataservice.de Wed Mar 10 10:23:53 1999 From: reschke at medicaldataservice.de (Julian Reschke) Date: Mon Jun 7 17:09:50 2004 Subject: XML query engines Message-ID: <000d01be6ae0$8a0da080$2e00a8c0@julian> At Sun, 31 Jan 1999 16:53:32 -0800, Tim Bray (tbray@textuality.com) wrote: >At 08:24 PM 1/31/99 +73900, John Cowan wrote: >>Assign a sequentially increasing number to each *tag* (start-tag or end-tag) >>in the document, treating an empty tag as a start-tag followed by an >>end-tag. Then e1 is a descendant of e2 iff e1.start > e2.start >>and e1.end < e2.end. Also, e1 is a left sibling of e2 (and e2 is >>a right sibling of e1) iff e1.end + 1 = e2.start; e1 is the leftmost >>child of e2 iff e1.start = e2.start + 1. Modeling the child/parent >>relationship is not so easy, and requires iteration. > >This structure has all sorts of advantages; that's how the >Open Text SGML-savvy search engine of yore used to run. Fast as >ell, equal access to any & all elements without performance >penalty. > > >But hard to update. Is there an easy way to apply this model to an MSXML.DLL DOM object?
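The numbering John Cowan describes in the quoted text can be sketched against any W3C DOM tree. The version below is a minimal illustration that assumes the DOM parser bundled with a current JDK rather than MSXML. The class TagNumbering and its method names are made up for the example, and keying the table by tag name is a simplification that only works while element names are unique in the document.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: numbers every start-tag and end-tag in document order. */
class TagNumbering {
    // tag name -> {start number, end number}; assumes unique element names
    private final Map<String, int[]> spans = new HashMap<>();
    private int counter = 0;

    static TagNumbering of(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            TagNumbering numbering = new TagNumbering();
            numbering.visit(doc.getDocumentElement());
            return numbering;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Depth-first walk: the start-tag gets its number on the way in,
    // the end-tag on the way out. An empty element gets two consecutive numbers.
    private void visit(Element e) {
        int start = ++counter;
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                visit((Element) c);
            }
        }
        int end = ++counter;
        spans.put(e.getTagName(), new int[] {start, end});
    }

    int start(String name) { return spans.get(name)[0]; }
    int end(String name) { return spans.get(name)[1]; }

    /** e1 is a descendant of e2 iff e1.start > e2.start and e1.end < e2.end. */
    boolean isDescendant(String e1, String e2) {
        return start(e1) > start(e2) && end(e1) < end(e2);
    }

    /** e1 is the left sibling of e2 iff e1.end + 1 = e2.start. */
    boolean isLeftSibling(String e1, String e2) {
        return end(e1) + 1 == start(e2);
    }

    /** e1 is the leftmost child of e2 iff e1.start = e2.start + 1. */
    boolean isLeftmostChild(String e1, String e2) {
        return start(e1) == start(e2) + 1;
    }
}
```

For the document <a><b><c/></b><d/></a> the walk assigns a = (1, 8), b = (2, 5), c = (3, 4) and d = (6, 7), so every test in the quoted text reduces to integer comparisons. The numbers depend only on document order, so they come out the same on every parse of the same data, but any insertion forces renumbering, which is the "hard to update" caveat.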
Microsoft's documentation (uniqueID Method, elementIndexList Method) is not very clear about how these IDs are generated, or whether they remain the same across two separate parser invocations on the same XML data... -- Julian Reschke MedicalData Service GmbH (http://www.medicaldataservice.de)
From James.Anderson at mecomnet.de Wed Mar 10 10:34:43 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:09:50 2004 Subject: Namespaces and DTDs References: Message-ID: <36E64E4F.83649621@mecomnet.de> That "REC-xml-names-19990114" does not provide any means to establish prefix<->uri bindings for a DTD has long been a point of contention. A cursory search of the archives will bear this out. The decision to eliminate the combined prefix/uri/dtd binding (the original pi form) was, however, correct, as the pi form, at least as proposed in "WD-xml-names-19980327", would not have been sufficient to handle such things as a dtd which needs multiple prefix bindings or the situation where a given prefix<->uri binding is to apply to multiple schema sources. While it is true that some mechanism is necessary, a form - as discussed below - which effected a singular binding would also not have solved the problem. "Everyone" would seem to be waiting for "schemas".... Marc.McDonald@Design-Intelligence.com wrote: > > A simple extension to namespaces could have fixed this problem: > 1. Allow a DTD to be optionally specified along with the namespace > prefix and URI > 2.
When an element is prefixed, parse it using the DTD associated with > the namespace and the given prefix as the default. > 3. If no DTD is associated with the prefix or not validating, do what > is done now (ensure element is well-formed). > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Mar 10 11:30:32 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Properties In-Reply-To: References: Message-ID: <14054.21854.948934.185758@localhost.localdomain> MikeDacon@aol.com writes: > So, what I'm saying is that I would like to be able to choose > whether to interface to the Parser via events or via a DOM. If you > agree with this, I believe using the return type is more > appropriate than getting a resultant property (as I suggest next). This is easy enough to build on top of SAX, but I think that it's probably out of scope for SAX itself. SAX is meant to be a relatively simple, low-level layer that people can build on. > If for some reason the above is not palatable, the same could be > accomplished under the current scheme if we added a > property: > > http://xml.org/sax/properties/dom-document (read-only) The nice thing about ModSAX is that you're free to try this yourself -- just define a property like http://www.aol.com/mdaconta/props/dom-document (or whatever URL you can use based on your AOL account) and let the market decide whether to support it. Perhaps one of the people who has written a higher-level utility package that supports both SAX and DOM would like to use this or something like it. 
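For what it is worth, the "define your own property and see who supports it" approach can be tried with any parser that exposes URL-named properties. The sketch below assumes Python's standard xml.sax module, whose readers offer getProperty and setProperty; it is not the ModSAX interface under discussion, and supports_property is a hypothetical helper written for this example.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import property_lexical_handler

# The vendor-defined property URL suggested in the message above.
DOM_DOCUMENT_PROPERTY = "http://www.aol.com/mdaconta/props/dom-document"


def supports_property(parser, name):
    """Return True if the parser recognises the named property."""
    try:
        parser.getProperty(name)
    except (SAXNotRecognizedException, SAXNotSupportedException):
        # An unknown property is reported with an exception, so callers
        # can probe for vendor extensions without breaking anything.
        return False
    return True


parser = xml.sax.make_parser()
has_dom_document = supports_property(parser, DOM_DOCUMENT_PROPERTY)
has_lexical_handler = supports_property(parser, property_lexical_handler)
```

With the stock expat-based reader the vendor property is simply not recognised, while a property the reader does know about (here the lexical handler) is. A parser that chose to support the new URL would need no change to the calling code.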
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/

From larsga at ifi.uio.no Wed Mar 10 12:05:42 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:09:50 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E4C4E6.B51DDFF3@eng.sun.com>
References: <14051.3215.196642.22571@localhost.localdomain> <36E4C4E6.B51DDFF3@eng.sun.com>
Message-ID:

* David Megginson | | http://xml.org/sax/features/normalize-text * David Brownell | | This is a good filter feature, I think. I agree. | Lars suggested a "Catalog" feature. There are different sorts of | catalog, and they need configuration, so the value of this could be | a URI for the catalog, not just a boolean. There should be a catalog parameter as well, but the reason I proposed this as a feature rather than just as a parameter is that SP and xmlproc both allow you to use environment variables to point to a default catalog file, which is rather handy. So it would definitely be useful to be able to tell the parser, go read the default catalog, wherever it is. (Or don't.) Java parsers could use a Java property to achieve the same thing. BTW: I'm surprised that David Megginson hasn't replied to this. David, some kind of confirmation that you've at least seen this would be welcome. (I know majordomo isn't 100% trustworthy, so it might have disappeared on the way.) | Plus, this would seem to be up to the "EntityResolver" to handle | ... yes? Sort of. You could make a parser filter that used an entity resolver to do this in general.
xmlproc has an internal PubIdResolver interface which it uses for this (and which is also exposed as the EntityResolver when using SAX). | It'd perhaps suggest that one could ask the next filter in the | stream for the resolver it was using ... :-) Hmmm. This is actually potentially troubling, since one would need to specify how a catalog EntityResolver and a custom one specified to be used together should work. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Wed Mar 10 12:21:56 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Properties Message-ID: <4449a8bc.36e66305@aol.com> Hi Bill, In a message dated 3/9/99 11:37:51 PM Eastern Standard Time, b.laforge@jxml.com writes: > >org.w3c.dom.Document parse(InputSource is, boolean events) throws > >SAXException; > >org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws > >SAXException; > >/* the events boolean would be to turn on/off event calls. */ > > > I think you have this capability without the extra parameter, since you don't > get events unless you register a handler to receives them. > Since there is already a parse(InputSource) and parse(String) method in the interface, in order to overload it we need a second parameter. The events parameter was the first one that came to mind, there may be a better one. Best wishes, - Mike Mike Daconta (www.gosynergy.com) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From richard at goon.stg.brown.edu Wed Mar 10 13:16:35 1999
From: richard at goon.stg.brown.edu (Richard Goerwitz)
Date: Mon Jun 7 17:09:50 2004
Subject: Namespaces and DTDs
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de>
Message-ID: <36E6704E.A13B3890@goon.stg.brown.edu>

Ronald Bourret wrote: > The only inobvious bit is that, because there is no way to declare > namespaces in the DTD, you can't declare different default namespaces > for different parts of the DTD Because the DTD is not namespace aware, all it can deal with are the prefixes you declare (not the URLs associated with them). Since these prefixes are declared in the document content, you end up with a peculiar situation in which the DTD has to be written according to declarations in a given document instance, rather than the reverse. Worse yet, there is no way to be sure that the various documents being validated against a particular DTD use the prefixes correctly, with the correct URLs, unless you make extensive use of attribute defaults - which, ironically, means we now need the DTD (probably an external one, typically with a bunch of parameter entities; so get your validating parser ready). After another year or two of this, with alternate schemas floating around besides DTDs, with architectural forms, with namespaces, and what not - after all of this, I wonder if we'll all, in good conscience, be able to say that anything has been simplified. (Simplicity _was_ one of XML's primary goals back in the dark ages last February.)
In reality, XML is functioning less like a "simplification," and more like a political move intended to facilitate changes that could never have been made to a mature standard like SGML. This is actually a very old story that's been repeated many times over. (Just look at what's happened to LDAP. By the time we get all the PKI and ACL extensions in place, it's really not going to be very L.) In the end, LDAP and XML may end up serving their constituencies better than their predecessors did. Or they may not. Frankly, with regard to XML, the jury is still out. It's not catching on nearly as fast as predicted a year or two ago. And it's taking considerably more work to implement it than anybody ever envisioned. Those of us who have done the work of writing XML processing software, and of making it work, have a right to say this. The emperor may or may not have clothes. -- Richard Goerwitz PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0 For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From andrew at squiz.co.nz Wed Mar 10 13:36:47 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun 7 17:09:50 2004
Subject: Req:Music DTD(?)
In-Reply-To: Your message of "Wed, 10 Mar 1999 05:09:41 CDT." <36E644E5.6E728D40@uniandes.edu.co>
Message-ID: <199903101333.CAA06775@aniwa.sky>

> Hello to all, > I'm currently working on an XML-based sequencer, and I would like to see > some music notation DTDs, before I start to write my own. I've searched > the web high and low...
no luck so far, so I was wondering if any of > you guys have any pointers I could use. You need a new search engine. I've recently been using www.google.com with results an order of magnitude better than what I got from altavista (though altavista still has its place for more complex query definitions). Try this url: http://www.googlebot.com/search?q=music+dtd Andrew McNaughton Disclaimer: I have nothing to do with google.com, I'm just impressed by their service -- ----------- Andrew McNaughton andrew@squiz.co.nz http://www.newsroom.co.nz/

From James.Anderson at mecomnet.de Wed Mar 10 14:24:01 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:09:50 2004
Subject: Namespaces and DTDs
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de>
Message-ID: <36E683F7.429E4B25@mecomnet.de>

yes; agreement on all points. mr. harold is not the only one who would have benefitted. the only aspect of which i can comprehend, is the claim, that, being able to bind the prefixes over a dtd would have broken the rule that namespaces should not "change the validity of a given document". which claim is true, but which i believe to be fundamentally misdirected. it's an old argument. Ronald Bourret wrote: > > james anderson wrote: > > > ? which of the "namespace aware" parsers will permit you to parse > validate a > > document for which portions of the dtd contain element declarations with > > ambiguous names - without first modifying the dtd? i've yet to hear a > solution > > to the "ambiguous name" problem for xml-1.0/+ns conforming parsers.
> > Good point -- it was unfair of me to blame the parsers here. It all seems > rather obvious now: > > Q. Why were namespaces invented? > A. To disambiguate duplicate names. > > Q. I have a DTD with duplicate names. How do I disambiguate them? > A. Use namespaces. > > The only inobvious bit is that, because there is no way to declare > namespaces in the DTD, you can't declare different default namespaces for > different parts of the DTD, which would have solved Elliotte's problem > rather neatly. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Mar 10 14:52:36 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:50 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: References: <14051.3215.196642.22571@localhost.localdomain> <36E4C4E6.B51DDFF3@eng.sun.com> Message-ID: <14054.34184.693965.347827@localhost.localdomain> Lars Marius Garshol writes: > | Lars suggested a "Catalog" feature. There are different sorts of > | catalog, and they need configuration, so the value of this could be > | a URI for the catalog, not just a boolean. > > There should be a catalog parameter as well, but the reason I proposed > this as a feature rather than just as a parameter is that SP and > xmlproc both allow you to use environment variables to point to a > default catalog file, which is rather handy. > > So it would definitely be useful to be able to tell the parser, go > read the default catalog, wherever it is. (Or don't.) Java parsers > could use a Java property to achieve the same thing. > > BTW: I'm surprised that David Megginson hasn't replied to this. 
> David, Some kind of confirmation that you've at least seen this > would be welcome. (I know majordomo isn't 100% trustworthy, so it > might have disappeared on the way.) Please don't be surprised -- depending on how new a suggestion is, sometimes I like to sit back and hear different people's opinions for a few hours or a few days before blurting out my own. On this topic, I'm a little uncomfortable putting in a core feature for catalogues when XML catalogue formats haven't settled yet (likewise, I don't include a feature for data typing, though some kind of data typing will undoubtedly arrive before long). It would probably make more sense for the promoters of different catalogue formats to define their own properties and/or features, such as http://www.oasis.org/sax/features/entity-catalog That way, we won't have any unpleasant surprises when a user expects a parser to use one type of catalogue and the parser finds another instead. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Mar 10 15:07:12 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Features Message-ID: <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> At 08:16 PM 3/9/99 -0500, David Megginson wrote: >Here's my revised version of the core feature list, based on recent >discussions: This seems to be converging nicely. Any chance of losing the ugly "Mod" prefix? -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Mar 10 15:09:58 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Features In-Reply-To: <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> References: <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> Message-ID: <14054.35485.843066.25717@localhost.localdomain> Tim Bray writes: > At 08:16 PM 3/9/99 -0500, David Megginson wrote: > >Here's my revised version of the core feature list, based on recent > >discussions: > > This seems to be converging nicely. Any chance of losing the > ugly "Mod" prefix? -Tim Yeah, no one seems to like it but me. Any other suggestions? I don't like Parser2 or things like that, because I want to emphasise that this is an add-on to SAX 1.0 rather than an upgrade. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mgoulde at psgroup.com Wed Mar 10 16:24:26 1999 From: mgoulde at psgroup.com (Michael Goulde) Date: Mon Jun 7 17:09:50 2004 Subject: Music DTD(?) 
Message-ID: <71A71A050B7BD111838300805F579504926776@psgroup.com> Check out: http://www.tcf.nl/3.0/musicml/index.html Michael Goulde Executive Vice President Research and Services Patricia Seybold Group 85 Devonshire St., 5th Floor Boston, MA 02109 Tel: 617 742-5200 Order "Customers.com" by Patricia Seybold with Ronni Marshak today from Amazon.com -----Original Message----- From: Fabio Arciniegas A. [mailto:l-arcini@uniandes.edu.co] Sent: Wednesday, March 10, 1999 5:10 AM To: XML Mailing List Subject: Req:Music DTD(?) Hello to all, I'm currently working on a xml-based sequencer, and I would like to see some music notation DTDs, before I start to write my own. I've searched the web high and low... no luck so far, so and I was wondering if any of you guys have any pointer I could use. Thanks in advance Fabio -- Fabio Arciniegas A. Ingenieria de Sistemas Uniandes xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Wed Mar 10 16:26:54 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:09:50 2004 Subject: Namespaces and DTDs References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E6704E.A13B3890@goon.stg.brown.edu> Message-ID: <36E6A0D0.4C4DA307@mecomnet.de> all of which presumes that you've elevated prefixes to the status of uri's - attribute defaults or not. Richard Goerwitz wrote: > > Ronald Bourret wrote: > > > The only inobvious bit is that, because there is no way to declare > > namespaces in the DTD, you can't declare different default namespaces > > for different parts of the DTD > > Because the DTD is not namespace aware, all it can deal with are the pre- > fixes you declare (not the URLs associated with them). Since these pre- > fixes are declared in the document content, you end up with a peculiar > situation in which the DTD has to be written according to declarations > in a given document instance, rather than the reverse. Worse yet, there > is no way to be sure that the various documents being validated against > a particular DTD use the prefixes correctly, with the correct URLs, un- > less you make extensive use of attribute defaults - which, ironically, > means we now need the DTD (probably an external one, typically with a > bunch of parameter entities; so get your validating parser ready). xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Wed Mar 10 16:57:12 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Features Message-ID: <01BE6B1F.58595400@grappa.ito.tu-darmstadt.de> David Megginson writes: > Yeah, no one seems to like it but me. Any other suggestions? I don't > like Parser2 or things like that, because I want to emphasise that > this is an add-on to SAX 1.0 rather than an upgrade. It's a bit long, but how about ExtendedParser? (Actually, I'm rather fond of Parser2 because it gives us a clear path should this be extended in the future.) -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lloyd at digitaljam.com Wed Mar 10 16:58:30 1999 From: lloyd at digitaljam.com (Lloyd Harding) Date: Mon Jun 7 17:09:51 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <36E68B39.DF535797@digitaljam.com> Lars Marius Garshol wrote: > > * Bill la Forge > > | So that's why I'm butting in here. I think an open standards process > | is important for individuals and small companies. We need to do what > | we can to keep the ball rolling here. > > We are certainly in heartfelt agreement here. 
:) David Brownell wrote: Gee, as a wage-slave working for a big company, I hope that I'm not _too_ excluded from the discussions ... :-) Seriously: my personal model is a lot more akin to the original IETF style "running code and working consensus" model than most existing standards bodies. I'm a lot happier with standards that come from such a process than from ones that involve fat specs that can't be implemented. Writing code is generally more fun than specs -- though an elegant spec is also a work of art! - - Dave Standards processes require effort and in all cases the effort is primarily provided by individuals from large companies. Small companies do not have the resources to put into standards efforts. Voting members make the difference and they are typically not small company employees. That is not to say standards bodies do not have methods for non-voting input. They all do. There are as many de facto standards that have failed as there are planned standards that have failed. There are as many de facto standards that have succeeded as there are planned standards that have succeeded. To claim one is better than another without details is not sufficient. Personal perception might be based on the differences in methods for receiving input or differences in the scope or differences in personal preference regarding process. But claiming one is better than the other based on failure/success rate requires more detail regarding definitions of failure/success and analysis of history to be convincing. I believe the issue is not so much which method is best but rather WHEN method A is better than method B. Implementation first versus specification first is similar to deduction versus induction. Both have their places; the question is when. lloyd -- ---------------------------------------------------------------- Lloyd Harding lloyd@infoauto.com ---------------------------------------------------------------- Information Assembly Automation Inc.
http://www.infoauto.com SGML/XML Services for the Publishing and Medical Community Architectural Design, DTD Creation, Editorial System Development ---------------------------------------------------------------- xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Wed Mar 10 17:21:14 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features Message-ID: <002b01be6b1a$eb1fbcc0$c8a8a8c0@thing1> OK, Dave, you asked for it. As an add on, you have made the SAX parser much more eXtensible. As if we didn't have enough X's... XParser Bill -----Original Message----- From: David Megginson To: XML Developers' List Date: Wednesday, March 10, 1999 12:00 PM Subject: Re: ModSAX: Proposed Core Features >Tim Bray writes: > > > At 08:16 PM 3/9/99 -0500, David Megginson wrote: > > >Here's my revised version of the core feature list, based on recent > > >discussions: > > > > This seems to be converging nicely. Any chance of losing the > > ugly "Mod" prefix? -Tim > >Yeah, no one seems to like it but me. Any other suggestions? I don't >like Parser2 or things like that, because I want to emphasise that >this is an add-on to SAX 1.0 rather than an upgrade. > > >All the best, > > >David > >-- >David Megginson david@megginson.com > http://www.megginson.com/ > >xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Wed Mar 10 17:26:32 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features Message-ID: <003001be6b1b$c4915360$c8a8a8c0@thing1> On a more serious note, I think we need a new ParserFactory... ModParserFactory? XParserFactory? It should use ParserFactory to create a Parser and then check to see if the new extension is supported. If not, it proceeds to wrap the parser so that it looks like a ModParser. Note that this compatibility wrapper will effectively be a filter. Bill xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Patrice.Bonhomme at loria.fr Wed Mar 10 17:45:10 1999 From: Patrice.Bonhomme at loria.fr (Patrice Bonhomme) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features In-Reply-To: Your message of "Wed, 10 Mar 1999 07:09:59 PST." <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> Message-ID: <199903101744.SAA01077@chimay.loria.fr> tbray@textuality.com said: ] This seems to be converging nicely. Any chance of losing the ugly ] "Mod" prefix? -Tim Why not XSAX for eXtended SAX ? Pat. -- ============================================================== bonhomme@loria.fr | Office : B.228 http://www.loria.fr/~bonhomme | Phone : 03 83 59 30 52 -------------------------------------------------------------- * Serveur Silfide : http://www.loria.fr/projets/Silfide ============================================================== xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rja at arpsolutions.demon.co.uk Wed Mar 10 17:50:54 1999 From: rja at arpsolutions.demon.co.uk (Richard Anderson) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features Message-ID: <01b001be6b1e$938bdbc0$c5010180@p197> >Why not XSAX for eXtended SAX ? "E-SAX" would be less confusing. 
-----Original Message----- From: Patrice Bonhomme To: XML Developers' List Date: 10 March 1999 17:48 Subject: Re: ModSAX: Proposed Core Features > >tbray@textuality.com said: >] This seems to be converging nicely. Any chance of losing the ugly >] "Mod" prefix? -Tim > >Why not XSAX for eXtended SAX ? > >Pat. > >-- > ============================================================== > bonhomme@loria.fr | Office : B.228 > http://www.loria.fr/~bonhomme | Phone : 03 83 59 30 52 > -------------------------------------------------------------- > * Serveur Silfide : http://www.loria.fr/projets/Silfide > ============================================================== > > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From luke at javagroup.org Wed Mar 10 18:41:53 1999 From: luke at javagroup.org (Luke Gorrie) Date: Mon Jun 7 17:09:51 2004 Subject: Generating typed code from DTDs, why not? Message-ID: Hi all, I'm pretty new to XML, but as I've poked around I've observed what seem to be some strange things. 
XML parsers all seem to provide interfaces which ignore the static structure information provided by DTDs and rely on "one size fits all" interfaces to elements, in stark contrast to the conventions of statically typed languages. For instance, the first thing I played with in XML was SAX using Python. I was impressed by how easily it worked and how naturally it fit in with a dynamically typed language like Python. Then I had a look at the Java interface and found that it was just the same, which I thought very odd! The natural mapping for SAX onto Java, to get the (significant) benefits of static typing, would be to generate a Visitor interface. The Visitor interface would have a method for "visiting" each type of element in the document, and the argument to this method would be an object which presents the element contents through typed accessor methods. At least, that's how it looks to me. In the case of DOM, again generating typed accessor code would provide these great benefits. People could use a DTD (or similar) as the definition language for their abstract data types, and generate DOM-compliant classes which they can both use "natively" in their language and also manipulate as part of a genuine DOM tree at the same time. It seems like these methods which ignore the wealth of static structure information available will begin to show serious problems if they try to scale to the features proposed in some specifications like SOX, where more fine grained relationships and constraints can be expressed. So, my question is: are there any efforts around working towards creating mappings from DTD or other XML type definition languages to various programming languages (or to other IDLs like OMG's), or is there some reason why this is considered a bad idea?
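To make the Visitor idea above a little more concrete, here is a minimal hand-written sketch of what such generated code might look like, in Python with type hints (which only an external type checker enforces; in Java the compiler would do that job). Everything here is hypothetical: the element types Track and Note, their attributes, and the class names are invented for illustration and do not come from any real DTD or code generator.

```python
import xml.sax
from dataclasses import dataclass


@dataclass
class Track:
    """Typed view of a hypothetical <track name="..."> element."""
    name: str


@dataclass
class Note:
    """Typed view of a hypothetical <note pitch="..." duration="..."/> element."""
    pitch: str
    duration: int


class ScoreVisitor:
    """One method per element type; a generator would emit this from the DTD."""

    def visit_track(self, track: Track) -> None:
        pass

    def visit_note(self, note: Note) -> None:
        pass


class TypedDispatcher(xml.sax.ContentHandler):
    """Adapts untyped SAX start-element events to typed visitor calls."""

    def __init__(self, visitor: ScoreVisitor):
        super().__init__()
        self.visitor = visitor

    def startElement(self, name, attrs):
        if name == "track":
            self.visitor.visit_track(Track(name=attrs["name"]))
        elif name == "note":
            # Attribute values arrive as strings; the typed layer converts them.
            self.visitor.visit_note(
                Note(pitch=attrs["pitch"], duration=int(attrs["duration"]))
            )


class TotalDuration(ScoreVisitor):
    """Example application: sums note durations using typed accessors only."""

    def __init__(self):
        self.total = 0

    def visit_note(self, note: Note) -> None:
        self.total += note.duration


def total_duration(xml_text: str) -> int:
    visitor = TotalDuration()
    xml.sax.parseString(xml_text.encode("utf-8"), TypedDispatcher(visitor))
    return visitor.total
```

The application class never touches element names or attribute strings; all of that is confined to the dispatcher, which is the part a DTD-driven generator would be expected to produce.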
I'm excited by the possibility of using a visual modelling tool (perhaps using an extension of the UML) to model document structure, and from the model be able to generate a DTD, from which to generate classes which give me access to the XML data in a natural way for the programming language. I'm amazed that more people don't seem to share this enthusiasm. What we're doing with vanilla DOM and SAX interfaces seems analogous to using CORBA IDL as documentation, and making all object calls using the dynamic invocation interface!

P.S. I was told today that Oracle have recently done something similar to this, which sounds great. I look forward to taking a look, but I can't help but wonder if there's a reason that it took this long - and how much the Oracle product does. If someone could point me to some other products which do similar things, I'd be much obliged.

Cheers,
Luke

From lucio.piccoli at one2one.co.uk Wed Mar 10 19:16:58 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun 7 17:09:51 2004
Subject: DocumentHandler with xml4j DOMParser
Message-ID: <3601b6f9.100299@smtpgate1.ONE2ONE.CO.UK>

Hi all,

I am using IBM's xmlj2.0.3 XML parsers. I am having the following problem: when I set my own document handler on a DOMParser, the handler is never invoked. However, when I use the SAXParser, it is. Why does the DOMParser not invoke the DocumentHandler when the SAXParser does? The docs do not shed any light on the problem. Is there a fundamental problem with using a DocumentHandler with a DOMParser?
-lucio
---------------------------------------------------------------------
One2One LUCIO.PICCOLI@one2one.co.uk
Elstree Tower tel : +44 181 214 3847
Elstree Way Borehamwood fax : +44 181 214 2325
LONDON WD6 1DT
__________ http://www.one2one.co.uk _____________

From cadams at cascadecc.com Wed Mar 10 19:24:11 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun 7 17:09:51 2004
Subject: WIDL
Message-ID: <001001be6b2b$7d49d8f0$01010101@development.cascade>

Is anybody doing a B2B/WIDL type of application? Will I be able to use regular HTML pages (and maybe CGI/Perl) to push and pull XML from a remote server, and easily be able to parse the XML on both sides, looking for custom request/reply types of data and then act on it (via JavaScript or applets on the client, and maybe servlets on the server)? Am I dreaming to think that this can give me a lightweight remoting technology without the likes of RMI, CORBA, Weblogic, ObjectSpace etc.?

Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From macherius at darmstadt.gmd.de Wed Mar 10 20:21:41 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun 7 17:09:51 2004
Subject: WIDL
In-Reply-To: <001001be6b2b$7d49d8f0$01010101@development.cascade>
Message-ID: <199903102020.VAA15937@sonne.darmstadt.gmd.de>

Chad Adams wrote at 10 Mar 99, 12:23:

> Is anybody doing a B2B/WIDL type of application?

There is a lot of research going on in the area of wrapper generation. Some approaches prefer creating Java objects, others directly map to XML. Implementations include:

http://db.cis.upenn.edu/W4F/
http://www.cse.ogi.edu/DISC/XWRAP/
http://www.darmstadt.gmd.de/oasys/projects/jedi/jedie.html

Just look at the bibliographies to find others.

> Am I dreaming to think that this can give me a light weight remoting
> technology with out the likes of RMI, CORBA, Weblogic, ObjectSpace etc.

Have a look at XML query languages; they are about that (among other things).

http://www.w3.org/TandS/QL/QL98/

A good paper to start with is from David Maier; look at sections 2.9 and 2.10 to see his vision of data communication via XML on the web.

http://www.w3.org/TandS/QL/QL98/pp/maier.html

And of course there is Microsoft's vision, see

http://www.oasis-open.org/cover/bosworthXML98.html

Hope that helps.

++im

--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tms at ansa.co.uk Wed Mar 10 20:38:05 1999 From: tms at ansa.co.uk (Toby Speight) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX feature naming (was: SAX: ModSAX addition, general ...) References: <18b603b2.36e3e337@aol.com> <14051.59370.316671.640337@localhost.localdomain> <36E44898.CB8E18C4@thinlink.com> <14052.19853.887104.987727@localhost.localdomain> Message-ID: David> David Megginson [I accidentally mailed this to David; it was meant for the list. Sorry, David.] 0> In article <14052.19853.887104.987727@localhost.localdomain>, David 0> wrote: David> I've been thinking about this issue, and I'm fairly convinced David> that the URI is the right choice. I agree with this much. David> Think of the URI a statement of ownership. Assume that my ISP David> is host.net, and that I've been allocated 5MB of web space at David> http://host.net/foo/. Okay, you own that name subspace *at this moment in time*. Who will have the right to create names below that next March? Five years from now? A hundred years from now? Persistent uniqueness of names is the core work of the URN group, and the consensus there is that DNS names are a poor basis for any kind of URN (and what we want is exactly what URNs are for: naming things). If you are saying that the use of URLs as names is just a stopgap until the URN registration stuff is sorted, then I'll accept that, but be aware of the precedent you're setting with the initial "well-known" feature names. 
David> I am the only one who has the right to make a resource available at
David> http://host.net/foo/, so I am the one who has the (moral) right to
David> construct feature IDs based on http://host.net/foo/.

At this instant...

David> It is not sufficient simply to use the domain name "host.net",
David> because I don't own the domain (someone else could construct
David> the same feature ID), and it is not sufficient to use something
David> starting with net.host.foo, because I *don't* have the right to
David> make something available at, say, ftp://host.net/foo/ --

Nor do you own the host "foo.host.net"

In summary, I think URNs are a good fit, but not necessarily other kinds of URI.

From jamesr at steptwo.com.au Wed Mar 10 22:08:35 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun 7 17:09:51 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <199903101744.SAA01077@chimay.loria.fr>
References:
Message-ID: <4.1.19990311090304.00c96360@steptwo.com.au>

At 03:44 11/03/1999 , Patrice Bonhomme wrote:
|
| tbray@textuality.com said:
| ] This seems to be converging nicely. Any chance of losing the ugly
| ] "Mod" prefix? -Tim
|
| Why not XSAX for eXtended SAX ?

Damn, you beat me to it. Although I was thinking SAX eXtended, ie:

SAXX

This could later become SAXXX or perhaps: 3 SAX

J

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au
"Beyond the Idea"
ACN 081 019 623

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From Marc.McDonald at Design-Intelligence.com Wed Mar 10 22:33:02 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun 7 17:09:51 2004
Subject: Namespaces and DTDs
Message-ID:

For a more complete solution than the option (emphasize option) of a DTD associated with a namespace prefix and URI, I would add the means to declare a namespace, prefix and DTD in a DTD.

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com

----------
From: james anderson [SMTP:James.Anderson@mecomnet.de]
Sent: Wednesday, March 10, 1999 8:42 AM
To: xml-dev@ic.ac.uk
Subject: Re: Namespaces and DTDs

all of which presumes that you've elevated prefixes to the status of uri's - attribute defaults or not.

Richard Goerwitz wrote:
>
> Ronald Bourret wrote:
>
> > The only inobvious bit is that, because there is no way to declare
> > namespaces in the DTD, you can't declare different default namespaces
> > for different parts of the DTD
>
> Because the DTD is not namespace aware, all it can deal with are the
> prefixes you declare (not the URLs associated with them). Since these
> prefixes are declared in the document content, you end up with a peculiar
> situation in which the DTD has to be written according to declarations
> in a given document instance, rather than the reverse.
> Worse yet, there is no way to be sure that the various documents being
> validated against a particular DTD use the prefixes correctly, with the
> correct URLs, unless you make extensive use of attribute defaults - which,
> ironically, means we now need the DTD (probably an external one, typically
> with a bunch of parameter entities; so get your validating parser ready).

From Ed at dega.com Wed Mar 10 22:42:35 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun 7 17:09:51 2004
Subject: DocumentHandler with xml4j DOMParser
Message-ID: <30649320C177D111ADEC00A024E9F297169F8A@exchange-server.dega.com>

Hi all,

I am using IBM's xmlj2.0.3 XML parsers. I am having the following problem: when I set my own document handler on a DOMParser, the handler is never invoked. However, when I use the SAXParser, it is. Why does the DOMParser not invoke the DocumentHandler when the SAXParser does? The docs do not shed any light on the problem. Is there a fundamental problem with using a DocumentHandler with a DOMParser?
One2One LUCIO.PICCOLI@one2one.co.uk

I don't know about the version of your XML4J, but in mine (1.1.9), the documentation states that DocumentHandler is to be used with the SAX parser to be informed of parsing events. This is logical, since the main difference is that DOM parsers parse the whole document into a resulting DOM tree, and SAX parsers are used for event-based processing. There doesn't appear to be any way to create a DocumentHandler on class com.ibm.xml.parser.Parser, but you can from org.xml.sax.DocumentHandler. Did they change this in your newer version?

Ed

Ed Howland
ed@dega.com
http://www.dega.com
"As your attorney, I advise you to take some adrenalchrome"

-----Original Message-----
From: LUCIO PICOLLI [mailto:lucio.piccoli@one2one.co.uk]
Sent: Wednesday, March 10, 1999 11:12 AM
To: xml-dev@ic.ac.uk
Subject: DocumentHandler with xml4j DOMParser

From donpark at quake.net Wed Mar 10 22:44:39 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:51 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main>

> > This seems to be converging nicely. Any chance of losing the
> > ugly "Mod" prefix? -Tim
>
>Yeah, no one seems to like it but me. Any other suggestions? I don't
>like Parser2 or things like that, because I want to emphasise that
>this is an add-on to SAX 1.0 rather than an upgrade.

I have been tracking the progress of 'ModSAX' closely as well and it seems the extension is maturing nicely.

BTW, it would help great in renaming if you could tell us what 'Mod' in ModParser stands for.
Best,

Don Park
Docuverse

From clark.evans at manhattanproject.com Wed Mar 10 23:41:33 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:09:51 2004
Subject: ModSAX: Proposed Core Features
References: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main>
Message-ID: <36E70238.18FC6360@manhattanproject.com>

Don Park wrote:
> BTW, it would help great in renaming if you could tell us what 'Mod' in
> ModParser stands for.

I thought it stood for "Modular" :)

Clark

From clark.evans at manhattanproject.com Thu Mar 11 00:06:18 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:09:51 2004
Subject: DOM Impl: Array or Linked List?
References: <3601a91c.090299@smtpgate1.ONE2ONE.CO.UK>
Message-ID: <36E7080F.7EEF4CEB@manhattanproject.com>

I've been struggling with this slightly, and would like your feedback. I'm building a DOM tree.
For the internal representation, I see two options:

A) A linked list for children

* Easy inserts in middle of list
* Slower non-sequential reads

B) An array for children

* Harder inserts in middle of list
* Faster non-sequential reads

Anyway, I was thinking of implementing a compromise, a sparse array with configurable spacing, depending upon the document.

Thoughts?

Thank you.

Clark

From david at megginson.com Thu Mar 11 00:48:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:52 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main>
References: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main>
Message-ID: <14055.4677.914570.392597@localhost.localdomain>

Don Park writes:

> BTW, it would help great in renaming if you could tell us what
> 'Mod' in ModParser stands for.

It means that it's not a Rocker.

Or else it means 'modular' -- I'm not sure.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From jarle.stabell at dokpro.uio.no Thu Mar 11 00:54:36 1999
From: jarle.stabell at dokpro.uio.no (Jarle Stabell)
Date: Mon Jun 7 17:09:52 2004
Subject: Namespaces and DTDs
Message-ID: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>

Richard Goerwitz wrote:

> (Simplicity _was_ one of XML's primary goals back in the dark ages last
> February.)

It seems to me that the SGML compatibility requirement killed simplicity. (And gave a very confusing and hard-to-learn vocabulary)

I'm hoping that ideas like the Layered Model for XML (by Simon St.Laurent) will be able to influence XML in a positive direction, making it simpler to understand, use and implement. Today it's way too hard to "fully" understand XML.

Cheers,
Jarle Stabell

From david at megginson.com Thu Mar 11 01:04:11 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:52 2004
Subject: ModSAX feature naming (was: SAX: ModSAX addition, general ...)
In-Reply-To:
References: <18b603b2.36e3e337@aol.com> <14051.59370.316671.640337@localhost.localdomain> <36E44898.CB8E18C4@thinlink.com> <14052.19853.887104.987727@localhost.localdomain>
Message-ID: <14054.56958.252482.1690@localhost.localdomain>

[originally sent privately to Toby]

Toby Speight writes:

> If you are saying that the use of URLs as names is just a stopgap
> until the URN registration stuff is sorted, then I'll accept that,
> but be aware of the precedent you're setting with the initial
> "well-known" feature names.

The quality of URNs will depend entirely on the quality of the registration schemes -- URNs really have no inherent advantage over URLs.

There are an awful lot of ways that I could construct a unique ID: using my phone number, my latitude and longitude, my Ethernet card's MAC address, the IP address served by Rogers Wave's DHCP server, my driver's license number, my Canadian Social Insurance Number, the ISBN for my book (though I think the publisher would have a moral claim to that), a domain name, or a specific URL.

The problem is that you have to balance four factors:

1. ease of access (not everyone can get an ISBN easily);
2. usability (who wants to memorise MAC addresses?);
3. universality (my Canadian SIN is meaningless outside the country); and
4. persistence (the DHCP server might change my IP address in a few hours when my current lease expires).

HTTP URLs win pretty close to a 10/10 on (1) and (3), about an 8/10 on (2), and probably a 6/10 or so on (4). A UUID might win on all but (2), depending on how hard it is to obtain one, but that is an inherent property of UUIDs, not of URNs -- and as I understand it, people are actually proposing constructing URNs from domain names among other schemes anyway.

Even if UUIDs do turn out to be the best choice, what's the advantage of URNs?
Why not just

uuid:123344567773634

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From donpark at quake.net Thu Mar 11 01:21:24 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:52 2004
Subject: DOM Impl: Array or Linked List?
Message-ID: <002301be6b5d$5e5434e0$2ee044c6@arcot-main>

Docuverse DOM SDK is implemented using the array approach with the last accessed index cached to improve next/prevSibling performance. The resulting implementation is fast for index-based access to child nodes and slightly slower for sibling-based access (only 10% slower than the linked-list version).

Modification to the tree is fast when appending nodes (i.e. building a new tree) but is somewhat slow when inserting new nodes, since array contents have to be shifted around. If your XML document has a gazillion child nodes per element, performance will suffer quite a bit.

You can get around the update problem by applying the Strategy pattern to the child array implementation. On insert, check to see if the array is big enough to justify using a different type of array implementation (i.e. sparse array). One caveat is that this tends to increase the number of child list array (smart NodeLists and NodeList implementation strategies). There are ways to minimize this problem though.

So the bottom line is, you are on the right track.

Don Park
Docuverse

-----Original Message-----
From: Clark Evans
To: xml-dev@ic.ac.uk
Date: Wednesday, March 10, 1999 4:15 PM
Subject: DOM Impl: Array or Linked List?
>I've been struggling with this slightly, and would
>like your feedback. I'm building a DOM tree. For
>the internal representation, I see two options:
>
>A) A linked list for children
>
>* Easy inserts in middle of list
>* Slower non-sequential reads
>
>B) An array for children
>
>* Harder inserts in middle of list
>* Faster non-sequential reads
>
>Anyway, I was thinking of implementing
>a compromise, a sparse array with
>configurable spacing, depending upon
>the document.
>
>Thoughts?
>
>Thank you.
>
>Clark

From donpark at quake.net Thu Mar 11 01:21:28 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:52 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <002401be6b5d$5f34d0e0$2ee044c6@arcot-main>

If it is 'modular' then ModularParser makes sense IMO. I believe XParser is being used by FBI for the technology that auto-detects images with adult content.
Don Park
Docuverse

-----Original Message-----
From: David Megginson
To: XML Developers' List
Date: Wednesday, March 10, 1999 4:52 PM
Subject: Re: ModSAX: Proposed Core Features

>Don Park writes:
>
> > BTW, it would help great in renaming if you could tell us what
> > 'Mod' in ModParser stands for.
>
>It means that it's not a Rocker.
>
>Or else it means 'modular' -- I'm not sure.
>
>
>All the best,
>
>
>David
>
>--
>David Megginson david@megginson.com
> http://www.megginson.com/

From mrc at allette.com.au Thu Mar 11 02:35:58 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 17:09:52 2004
Subject: Simplicity (was Re: Namespaces and DTDs)
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>
Message-ID: <36E72BDE.6798F08E@allette.com.au>

Jarle Stabell wrote:

> It seems to me that the SGML compatibility requirement killed simplicity.
> (And gave a very confusing and hard-to-learn vocabulary)

Really? I think the requirement for web compatibility made XML more complex than it looked from the outset. This is the advent of the third catchcry for XML.
First it was "XML is SGML", second was "Use XML because SGML is too hard" and now "XML is very powerful, but can be difficult". Remarkably, we just now seem to be coming to the realisation that it's difficult to solve complex problems. XML seeks to do more than SGML, but it's supposed to be simpler - how can this be so? The only immediate areas of gain would have come from trimming the fat from SGML, but the more of X*L I see, the skinnier SGML looks. Yes, it is less powerful, yes it can be more proprietary, yes it is harder to write tools for, no it doesn't solve ten percent of what X*L can do before it even gets out of bed. Yes, I still use it a lot. Ponder that - SGML for simplicity.

> I'm hoping that ideas like the Layered Model for XML (by Simon St.Laurent)
> will be able to influence XML in a positive direction, making it simpler to
> understand, use and implement. Today it's way too hard to "fully"
> understand XML.

It is unquestionably hard to fully understand - anyone who says that it isn't deserves a gold star - they're smarter than I am.

--
Regards,

Marcus Carr email: mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia) www: http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler." - Einstein

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From msharp at sybex.com Thu Mar 11 03:04:42 1999
From: msharp at sybex.com (Molly Sharp)
Date: Mon Jun 7 17:09:52 2004
Subject: Delivery of XML
Message-ID: <88256731.000FD31A.00@sybex.com>

Hello,

I'm new to the list. I'm in the computer book publishing business, and I'm looking for information about delivering XML content to customers in a secure, copy-protected (encrypted) manner.

Does anyone know if there are any companies out there offering secure encryption for XML? I imagine you'd have to create a browser based on IE or Netscape that disabled functions such as view source, copy, and save as --- and that would be the only browser your encrypted XML content could be opened from.

Thanks for any information about this,

Molly Sharp

From MikeDacon at aol.com Thu Mar 11 03:10:00 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:52 2004
Subject: A new name for ModSax
Message-ID: <146803a9.36e732d1@aol.com>

Hi Everyone,

Instead of XSAX or XParser (which rely on the overplayed X of extensible), how about

ExSAX
ExParser

Which stands for the same thing.

Extensible SAX
Extensible Parser

Is shorter to type than ModSAX.
Avoids the double capital of XSAX and XParser.
And is pronounced the same way.

- Mike (www.gosynergy.com)

From jborden at mediaone.net Thu Mar 11 03:34:23 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:09:52 2004
Subject: Delivery of XML
In-Reply-To: <88256731.000FD31A.00@sybex.com>
Message-ID: <000c01be6b6f$309a19e0$d3228018@jabr.ne.mediaone.net>

>
> Does anyone know if there are any companies out there offering secure
> encryption for XML? I imagine you'd have to create a browser
> based on IE or
> Netscape that disabled functions such as view source, copy, and
> save as ---
> and that would be the only browser your encrypted XML content could be
> opened from.
>

Would that be SSL with certificates to distinguish clients?

Jonathan Borden
http://jabr.ne.mediaone.net

From MikeDacon at aol.com Thu Mar 11 03:42:30 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:52 2004
Subject: One more ModSax naming try...
Message-ID: <8179a506.36e73846@aol.com>

Hi All,

Ok, while I like ExSax for the previously mentioned reasons -- I don't like its connotation for all things "Ex" like Ex-girlfriend, Ex-wife, Ex-husband...
So, one other way to go is the "Add-on" theme that David expressed.

XtraSax
XtraParser

This is a combination of "add-on", "extra" and Xml.

- Mike (www.gosynergy.com)

From landerse at du.edu Thu Mar 11 05:29:02 1999
From: landerse at du.edu (Buzz Andersen)
Date: Mon Jun 7 17:09:52 2004
Subject: Help w/Docuverse DOM SDK (Please)
Message-ID: <0F8F007FZ0J3BS@du.edu>

I would be eternally grateful if anyone out there who happens to be familiar with the Docuverse DOM SDK could tell me what is wrong with the following code. It generates a "com.docuverse.dom.DOMExceptionImpl" exception when the "appendChild" method of the document is attempted. Here it is:

DOM dom = new com.docuverse.dom.DOM();
dom.setProperty("sax.driver", "com.ibm.xml.parser.SAXDriver");
Document x = dom.createDocument("e1");
Element y = x.createElement("e2");
y.appendChild(root);

I would think this would generate:

Am I mistaken?

Thanks in advance,

Buzz Andersen
www.du.edu/~landerse
landerse@du.edu

From donpark at quake.net Thu Mar 11 05:59:23 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:52 2004
Subject: Help w/Docuverse DOM SDK (Please)
Message-ID: <004a01be6b84$379613b0$2ee044c6@arcot-main>

Buzz,

>DOM dom = new com.docuverse.dom.DOM();
>dom.setProperty("sax.driver", "com.ibm.xml.parser.SAXDriver");
>Document x = dom.createDocument("e1");
>Element y = x.createElement("e2");
>y.appendChild(root);
>
>I would think this would generate:
>
>
>

I don't know what y.appendChild(root) is supposed to be but you have to insert your "e2" element into your document.

// creates a document with "e1" as document element
Document doc = dom.createDocument("e1");

// make sure document root exists
Node e1 = doc.getDocumentElement();
if (e1 == null)
    e1 = doc.appendChild(doc.createElement("e1"));

// create and insert e2 into e1
e1.appendChild(doc.createElement("e2"));

at this point, you will have:

Best,

Don Park
Docuverse

From landerse at du.edu Thu Mar 11 06:41:52 1999
From: landerse at du.edu (Buzz Andersen)
Date: Mon Jun 7 17:09:52 2004
Subject: Help w/Docuverse DOM SDK (Please)
Message-ID: <0F8F0004G3W74A@du.edu>

Whoa...that was a mistranslation from my original code!
It was supposed to read:

x.appendChild(y);

Sorry about the confusion, and thanks much for the advice/code. I've been generating XML for a while using proprietary parser APIs, but I'm still trying to grok the whole SAX/DOM thing.

Buzz Andersen
www.du.edu/~landerse
landerse@du.edu

----------
>From: Don Park
>To: xml-dev@ic.ac.uk
>Subject: Re: Help w/Docuverse DOM SDK (Please)
>Date: Wed, Mar 10, 1999, 10:58 PM
>
>Buzz,
>
>>DOM dom = new com.docuverse.dom.DOM();
>>dom.setProperty("sax.driver", "com.ibm.xml.parser.SAXDriver");
>>Document x = dom.createDocument("e1");
>>Element y = x.createElement("e2");
>>y.appendChild(root);
>>
>>I would think this would generate:
>>
>>
>>
>
>I don't know what y.appendChild(root) is supposed to be but you have to
>insert your "e2" element into your document.
>
>// creates a document with "e1" as document element
>Document doc = dom.createDocument("e1");
>
>// make sure document root exists
>Node e1 = doc.getDocumentElement();
>if (e1 == null)
>    e1 = doc.appendChild(doc.createElement("e1"));
>
>// create and insert e2 into e1
>e1.appendChild(doc.createElement("e2"));
>
>at this point, you will have:
>
>
>
>Best,
>
>Don Park
>Docuverse

From lucio.piccoli at one2one.co.uk Thu Mar 11 08:22:49 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun 7 17:09:52 2004
Subject: DocumentHandler with xml4j DOMParser
Message-ID: <3601b76a.110299@smtpgate1.ONE2ONE.CO.UK>

> > Hi all,
> > I am using IBM's xmlj2.0.3 XML parsers. I am having the following
> > problem:
> > When I set my own document handler with a DOMParser, the handler is never
> > invoked. However, when I use the SAXParser it is. Why does the
> > DOMParser not invoke the DocumentHandler yet the SAXParser does?
> > The docs do not throw any light on the problem.
> > Is there a fundamental problem with using a DocumentHandler with a
> > DOMParser?
> >
> > One2One LUCIO.PICCOLI@one2one.co.uk
>
> I don't know about the version of your XML4J, but in mine (1.1.9), the
> documentation states that DocumentHandler is to be used with the SAX Parser
> to be informed of parsing events. This is logical, since the main
> difference is that DOM parsers parse the whole document into a resulting
> DOM tree, and SAX parsers are used for event based processing.
>
> There doesn't appear to be any way to create a DocumentHandler on class
> com.ibm.xml.parser.Parser, but you can from org.xml.sax.DocumentHandler.
> Did they change this in your newer version?

I am not sure what you mean here. The DocumentHandler I used was an instance of org.xml.sax.DocumentHandler. The setDocumentHandler(DocumentHandler handler) is a method on the org.xml.sax.Parser. Since all the IBM parser classes implement this interface, then why doesn't it work?
I viewed the source code to the DOMParser and noticed that in the constructor it calls setDocumentHandler( this ). So it is using itself as the document handler. Is it OK to have more than one DocumentHandler? In fact the bigger question is: is using a DocumentHandler on the DOMParser the correct thing to do when attempting to extract the content?

-lucio

> Ed
>
> Ed Howland
> ed@dega.com
> http://www.dega.com
> "As your attorney, I advise you to take some adrenalchrome"
>
> -----Original Message-----
> From: LUCIO PICOLLI [mailto:lucio.piccoli@one2one.co.uk]
> Sent: Wednesday, March 10, 1999 11:12 AM
> To: xml-dev@ic.ac.uk
> Subject: DocumentHandler with xml4j DOMParser

From James.Anderson at mecomnet.de Thu Mar 11 09:49:32 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:09:53 2004
Subject: Namespaces and DTDs
References:
Message-ID: <36E7951F.BD56A8E4@mecomnet.de>

the third parameter (the DTD) is ill advised. one will, in any case, need to establish scoping rules for the bindings.
such rules, in combination with xml's existing reference and sequence mechanisms, would render the third parameter either redundant or too restrictive.

Marc.McDonald@Design-Intelligence.com wrote:
>
> For a more complete solution than the option (emphasize option) of a
> DTD associated with a namespace prefix and URI, I would add the means
> to declare a namespace, prefix and DTD in a DTD.

From oren at capella.co.il Thu Mar 11 10:12:24 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <024301be6ba6$dfdc1c00$5402a8c0@oren.capella.co.il>

Bill la Forge wrote:
>On a more serious note,
>
>I think we need a new ParserFactory... ModParserFactory? XParserFactory?
>It should use ParserFactory to create a Parser and then check to see if the new
>extension is supported. If not, it proceeds to wrap the parser so that it looks
>like a ModParser.
>
>Note that this compatibility wrapper will effectively be a filter.

I think you've hit on something important here. The Mod/X/Xtra/E-Sax thread has focused on "how to access extra functionality which is already available within a particular SAX parser implementation". This might be the wrong question to ask. Shouldn't it be "how do I obtain an instance of a SAX parser which provides the features I need", instead?

This is a subtle but important shift of focus. Today one can obtain an instance of a SAX parser by using the ParserFactory. Now suppose my application needs an order of a namespace aware parser, character normalization on the side, and don't spare the comments, please - how would I go about creating such a thing?

Note that this issue contains the original one; one needs to be able to access the extra features. But it goes beyond it. It might also help to constrain some design choices. Take for example the issue of naming features. Today ParserFactory uses the string "org.xml.sax.parser" as an identifier for the feature "take an input source and convert it to SAX events". The format of this particular string was chosen since it is usable as a key in a properties file. Wouldn't it be reasonable to say that whichever way Mod/X/Xtra/E-ParserFactory works, it will use the same approach - that is, use Java-like package names to identify features, so that it will be possible to provide default implementations using property files? I know this would be hard for the URI camp to swallow :-) but isn't it worth it?

As to the issue itself, the way I see it there is one major question to be decided first. Are the extra features independent of each other? If they aren't, we are in trouble. How do I know that pushing a filter implementing feature X on top of a parser implementing feature Y doesn't break that feature? What if one feature depends on another? Should there be a way to describe the relationship between features? How?

At any rate, the goal should be some registry of "parsers" and "filters" with an appropriate API so that it would be possible to ask for a certain feature set and obtain a "parser" instance. IMVHO as far as this registry is concerned, the basic SAX events interface and the input source interface should be on equal ground with the other features. This could be a flexible framework allowing one to create processing chains such as using DOM as input/output of the chain, making XSL processing a core "feature", and so on.
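
A minimal sketch of the property-keyed lookup described above, assuming features are named like Java packages and a properties file maps each feature name to an implementing class name. Only the key "org.xml.sax.parser" and the class name com.ibm.xml.parser.SAXDriver appear in this thread; the FeatureLookup class and its classFor method are hypothetical names used for illustration, not part of SAX or of any ModSAX proposal.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper: resolves a feature name to a class name via properties text.
final class FeatureLookup {
    private FeatureLookup() {}

    /**
     * Parses properties-file text and returns the class name registered
     * under the given feature name, or null if the feature is not listed.
     */
    static String classFor(String propertiesText, String featureName) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // Reading from an in-memory string is not expected to fail.
            throw new IllegalStateException(e);
        }
        return props.getProperty(featureName);
    }
}
```

An application could call FeatureLookup.classFor(text, "org.xml.sax.parser") on the contents of its properties file and then instantiate the returned class by name. The sketch says nothing about whether two features can be combined safely, which is the open question raised above.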
Has anything similar been done in a different field, so we could reuse the design lessons there? It seems like a pretty generic "stream processing" problem.

Share & Enjoy,

Oren Ben-Kiki

From cbullard at hiwaay.net Thu Mar 11 11:54:40 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:53 2004
Subject: Namespaces and DTDs
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>
Message-ID: <36E7AE27.7DE6@hiwaay.net>

Jarle Stabell wrote:
>
> Richard Goerwitz wrote:
> > (Simplicity _was_ one of XML's primary goals back in the dark ages last
> > February.)
>
> It seems to me that the SGML compatibility requirement killed simplicity.
> (And gave a very confusing and hard-to-learn vocabulary)

Or its inventors have discovered that assuming the mission of an existing mature standard without acknowledging the complexity of that mission leads to the same or worse complexity in the invention.

Darn. Maybe LISP was the right language after all and forty years of computer scientists just didn't "get it".

len

From cbullard at hiwaay.net Thu Mar 11 11:56:30 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
References: <4.1.19990311090304.00c96360@steptwo.com.au>
Message-ID: <36E7AD18.486C@hiwaay.net>

James Robertson wrote:
>
> Although I was thinking SAX eXtended, ie:
>
> SAXX
>
> This could later become
>
> SAXXX
>
> or perhaps:
>
>    3
> SAX

At which point the local firewall chokes again, tosses up the warning message about unacceptable sites and local policies, accounts get flagged, and the whole nine yards of censorial software and American puritanism kicks in.

Call it Sax++. Incrementally better. ;-)

len

From costello at mitre.org Thu Mar 11 12:20:33 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun 7 17:09:53 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de>
Message-ID: <36E7B4B3.999188F5@mitre.org>

Hi Folks,

There has been a lot of discussion on this list group about namespaces and how there is no necessary link between a namespace URI and a schema (DTD). Just as I was accepting that and getting comfortable with it I read the RDF spec...
For those of you unfamiliar with RDF, its mission in life is to enable you to express data about your data; i.e., metadata. You can express things like, "the creator of the BookCatalog is John Doe". "creator" is a piece of metadata about the "resource", BookCatalog. In RDF "creator" is called a "property". Thus, the "property" has the "value" John Doe ... John Doe.

Okay, here's where the rub comes. Let me give you a couple of quotes from the RDF spec (the *'s I have put in and are my way of emphasizing the words that I wish for you to really focus on):

"Property names *must* be associated with a schema. This can be done by qualifying the element names with a namespace prefix to unambiguously *connect* the property definition with the corresponding RDF schema ..."

Earlier in the spec it says:

"Due to RDF's incremental extensibility, agents processing metadata will be able to trace the origins of schemata they are unfamiliar with back to known schemata and perform meaningful actions on metadata they weren't originally designed to process."

Let me tell you how I interpret those two sentences. Suppose that I have written a Web agent and it comes across a Web site that serves up an XML document containing some metadata (expressed using the RDF syntax). Let's suppose that the metadata says, in XMLese, "the creator of the BookCatalog is John Doe". My agent has never seen the property "creator", so it follows the namespace URI to the property schema. From there it finds the superclass of the creator property. If it doesn't recognize that class then it goes to its superclass. It keeps doing this until it finds a class that it understands and then it starts unwinding (presumably by this process it will be able to gain insight into what "creator" is all about. I have no idea how this will happen, but it sounds pretty cool.) This mechanism of following references until the agent gains "enlightenment" makes sense to me. I like it!

***However*** that presupposes that there is a *guaranteed* association between a namespace URI and a schema. This is totally against what this list group has worked so hard to clarify as NOT being the case.

Somebody help me to understand this. Obviously I am misreading, misinterpreting the RDF spec. Thanks.

/Roger

From b.laforge at jxml.com Thu Mar 11 12:31:05 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:53 2004
Subject: One more ModSax naming try...
Message-ID: <002a01be6bbb$3da914a0$c8a8a8c0@thing1>

From: MikeDacon@aol.com
>So, one other way to go is the "Add-on" theme that David expressed.
>
>XtraSax
>XtraParser
>
>This is a combination of "add-on", "extra" and Xml.

What about open? OpenParser/OpenSAX. With the new extensions, we are not constrained by the interface--it's quite "open".

Bill

From b.laforge at jxml.com Thu Mar 11 12:39:59 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <002d01be6bbc$a014e460$c8a8a8c0@thing1>

From: Oren Ben-Kiki
>I think you've hit on something important here.
The Mod/X/Xtra/E-Sax thread
>has focused on "how to access extra functionality which is already available
>within a particular SAX parser implementation". This might be the wrong
>question to ask. Shouldn't it be "how do I obtain an instance of a SAX
>parser which provides the features I need", instead?

It is interesting how small shifts in perspective can have major design implications. I just wanted to make it easy for new ModSAX applications to use older SAX parsers without requiring any extra code in the application.

If ModSAX is to remain low-level, I suspect a registry is out of scope. As for building up a parser with filters to meet a set of requirements automagically, I'd rather give more control to the application to specify what it needs, than try to compose something based on a feature list.

Bill

From rbourret at ito.tu-darmstadt.de Thu Mar 11 13:11:07 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <01BE6BC8.E71E5AB0@grappa.ito.tu-darmstadt.de>

Oren Ben-Kiki wrote:

> Has anything similar [assembling processors based on feature requests]
> been done in a different field, so we could reuse the
> design lessons there? It seems like a pretty generic "stream processing"
> problem.

I think there is an inherent assumption in this question that we are defining individual features that can be implemented by different parties and then randomly assembled to get a useful processor. While this is potentially a useful thing to do -- UNIX pipes are a good example -- it is not necessarily an easy thing to do, nor is it clear that this is a goal of ExModE-XSAX.

We tried to do a similar thing in OLE DB, where database functionality would be broken down into individual services which could be assembled at will on top of a database driver. (Generally, this would be meaningful only for drivers for non-database sources, as drivers for existing databases already exposed most/all functionality.) The idea never really worked out, but here are some of the issues:

* Are there enough useful features/components to make this worthwhile? For OLE DB, the answer was "probably not". We implemented a scrollable cursor (basically just a result set cache), but other ideas (transactions, security) were not easily implementable as separate layers and were not really meaningful -- anybody could get around them by excluding the layer.

* What are the interfaces between components and how hard are they to implement? If you want to be able to assemble components from different vendors at will, these need to be defined. The success of SAX filters is a red herring here -- it leads one to believe that SAX can function as a useful interface for all XML-related processing features. In fact, this is not the case -- for example, whether or not to retrieve external entities has nothing to do with SAX. Thus, other interfaces would need to be defined to be able to assemble processors from third-party components. (I think this is one thing that led us astray in OLE DB. The usefulness of a scrollable cursor engine that spoke OLE DB at both ends led us to believe that the same could be done with other database features. In fact, OLE DB was less well suited or completely unsuited for other operations. In addition, it was expensive to implement.)

* How independent are the features?
Is it meaningful to ask for one thing but not another, such as wanting validation without namespaces (maybe) or parsing external entities (no)? Again, I think the orthogonality of some features is a red herring leading one to believe all features are orthogonal.

* Are performance penalties too high to separate features into separate components? For example, suppose several features need to process XML documents as trees. While it might make sense to write a single processor for these features and toggle them within the processor, the performance hit of implementing them as separate, chained processors would be too high: each would have to build a tree, process it, and then stream it back out as SAX.

* Are there order dependencies between components? For example, if you want validation and namespace processing as separate components, you had better do namespace processing first. An open question is who knows about order and how is it advertised.

* Who assembles the components -- the application, the processor, or a third party? The advantage of a processor or third party (such as a factory) assembling components is that you need the assembly logic in only a few places. The disadvantage is that applications that know about a new feature cannot use that feature until the assembly logic in the processor/factory is updated. It is probably best to have a mechanism that allows both processors and applications to assemble components.

My personal feeling is that assembling XML processors completely on the fly is a pipe (if you will excuse the pun) dream. The world is simply not orthogonal enough to make this possible. Furthermore, there are too many performance gains to be had by tight integration of functionality to ever convince people to build things entirely as components with public interfaces.

-- Ron Bourret

From oren at capella.co.il Thu Mar 11 13:30:48 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <02a501be6bc2$9da75810$5402a8c0@oren.capella.co.il>

Bill la Forge wrote:
>If ModSAX is to remain low-level, I suspect a registry is out of scope. As for building
>up a parser with filters to meet a set of requirements automagically, I'd rather give
>more control to the application to specify what it needs, than try to compose
>something based on a feature list.

A registry might be outside the scope of ModSAX (but see below). Even if it is, I feel that we should take care that ModSAX design choices won't make such a registry unnecessarily difficult. It might also be that a "registry" is the wrong way to go; John Cowan, for example, suggested a mechanism to allow a parser to automatically push a filter between itself and the application. I'm certain there are other reasonable approaches. All I'm saying is that before we decide on ModSAX, some thought should be given to this issue.

To get the ball rolling, how about the following low level solution, which would allow smarter high level solutions later on:

class ModSAXRegistry {
    static void setClassFeatures(String className, String[] featureNames);
    static String[] getClassFeatures(String className);
    static Enumeration getFeatureClasses(String featureName);
    static Object newInstance(String className);
}

The idea being that it would be easy to get a list of classes which provide any requested feature, and check which features are implemented by a particular class.
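
One way the four declarations above could be filled in is sketched below. This is only an illustration under stated assumptions: the method names and parameter lists come from the proposal, but the in-memory maps, the synchronization, the handling of re-registration, the use of a no-argument constructor, and the choice to wrap reflection failures in IllegalArgumentException are all guesses, since the proposal does not specify any of them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the proposed registry; storage and error handling are assumptions.
final class ModSAXRegistry {
    private static final Map<String, String[]> FEATURES_BY_CLASS = new LinkedHashMap<>();
    private static final Map<String, List<String>> CLASSES_BY_FEATURE = new LinkedHashMap<>();

    private ModSAXRegistry() {}

    /** Registers (or re-registers) the features a class claims to provide. */
    static synchronized void setClassFeatures(String className, String[] featureNames) {
        // Remove any earlier registration so the two maps stay consistent.
        String[] previous = FEATURES_BY_CLASS.remove(className);
        if (previous != null) {
            for (String feature : previous) {
                List<String> classes = CLASSES_BY_FEATURE.get(feature);
                if (classes != null) {
                    classes.remove(className);
                }
            }
        }
        FEATURES_BY_CLASS.put(className, featureNames.clone());
        for (String feature : featureNames) {
            CLASSES_BY_FEATURE.computeIfAbsent(feature, k -> new ArrayList<>()).add(className);
        }
    }

    /** Returns the features registered for a class, or an empty array. */
    static synchronized String[] getClassFeatures(String className) {
        String[] features = FEATURES_BY_CLASS.get(className);
        return features == null ? new String[0] : features.clone();
    }

    /** Returns the names of all classes registered for a feature, in registration order. */
    static synchronized Enumeration<String> getFeatureClasses(String featureName) {
        List<String> classes = CLASSES_BY_FEATURE.get(featureName);
        // Copy the list so callers never iterate over the live registry state.
        return Collections.enumeration(
                classes == null ? new ArrayList<String>() : new ArrayList<>(classes));
    }

    /** Instantiates a class by name through its public no-argument constructor. */
    static Object newInstance(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}
```

An application would register classes at startup, call getFeatureClasses for each feature it needs, pick among the candidates by whatever rule it likes, and pass the chosen names to newInstance. Nothing here checks that a registered class really implements the features it claims, or that two features can be combined.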
This should be trivial to implement; static code could do the registration automatically, or it could be loaded from property files, environment variables, or whatever. We already have one standard feature: "org.xml.sax.parser", to which we should probably add "org.xml.sax.filter".

The question of how to build a parser implementing a particular feature would be left open. In general the application would query the registry, use whatever algorithm it likes to decide on which classes to use, instantiate them and go on as per the current ModSAX interface. Once enough experience is gained using this, we could decide to add some methods which implement popular algorithms.

Compatibility with the current state: It should be trivial to implement ParserFactory above the registry. As for property files, the following scheme is safe and upward compatible with today's practice of providing the SAX parser name in "org.xml.sax.parser":

org.xml.sax.class.=,,... =

The whole thing is as lightweight and low-level as you can get.

Share & Enjoy,

Oren Ben-Kiki

From MikeDacon at aol.com Thu Mar 11 14:21:43 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:53 2004
Subject: One more ModSax naming try...
Message-ID: <5f01bac8.36e7bcfd@aol.com>

Hi Bill,

In a message dated 3/11/99 7:27:22 AM Eastern Standard Time, b.laforge@jxml.com writes:

> What about open? OpenParser/OpenSAX.
> With the new extensions, we are not constrained by the interface--its quite "open".

I like OpenParser/OpenSAX!! Besides the open/extensible link, it gives a nod to open source which is appealing.

- Mike (www.gosynergy.com)

From dkirsch at quintcom.com Thu Mar 11 14:44:15 1999
From: dkirsch at quintcom.com (dkirsch@quintcom.com)
Date: Mon Jun 7 17:09:53 2004
Subject: Delivery of XML
Message-ID: <88256731.00509FF2.00@mercury.quintcom.com>

Molly,

I understand that IBM will make a presentation at the IETF meeting next week for just this type of support. I'll see if I can get you a contact for that while I'm here at the XTECH conference today.

Cheers,
David K.

"Molly Sharp" on 03/10/99 07:00:57 PM
Please respond to "Molly Sharp"
To: SGML-L@RELAY.URZ.UNI-HEIDELBERG.DE, xml-dev@ic.ac.uk
cc: (bcc: David Kirsch/QCI)
Subject: Delivery of XML

Hello, I'm new to the list. I'm in the computer book publishing business, and I'm looking for information about delivering XML content to customers in a secure, copy-protected (encrypted) manner. Does anyone know if there are any companies out there offering secure encryption for XML? I imagine you'd have to create a browser based on IE or Netscape that disabled functions such as view source, copy, and save as --- and that would be the only browser your encrypted XML content could be opened from.

Thanks for any information about this,
Molly Sharp
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From keshlam at us.ibm.com Thu Mar 11 15:06:37 1999 From: keshlam at us.ibm.com (keshlam@us.ibm.com) Date: Mon Jun 7 17:09:53 2004 Subject: DOM Impl: Array or Linked List? Message-ID: <85256731.0052D629.00@D51MTA03.pok.ibm.com> As a contrasting point, my com.ibm.domimpl operates on the linked-list approach. I considered changing that, but decided that for the applications I anticipated folks to be writing in Java, integer indexing was going to be relatively rare compared to next and previous, and performing the additional work to maintain the indices didn't feel like it was going to be a net gain. I'm firmly convinced that there's no such thing as one best way to implement the DOM. There are too many issues to trade off which will make an implementation better at one thing than another. The fastest DOM may need more storage space for the model; the smallest model may require more code; the smallest code may be slower. Also, don't forget that the DOM is strictly an API, which can be wrapped around any model that can contain a document; there may be DOMs which are really just thin access layers for databases, for example. Pick, or write, the DOM that suits your intended application(s). 
Hammers make poor screwdrivers, and vice versa.

From cadams at cascadecc.com Thu Mar 11 15:22:26 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun 7 17:09:53 2004
Subject: Java DOM Parsers
Message-ID: <000001be6bd2$e0059900$01010101@development.cascade>

What companies supply java DOM API's and other xml api tools? Any suggestions on which to go with? Thanks Chad Adams Payback Training Systems Email: cadams@cascadecc.com

From richard at goon.stg.brown.edu Thu Mar 11 15:41:18 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:09:53 2004
Subject: Namespaces and DTDs
References: <36E7951F.BD56A8E4@mecomnet.de>
Message-ID: <36E7E379.D3BE5204@goon.stg.brown.edu>

James Anderson wrote (with regard to declaring namespaces in the DTD): > one will, in any case, need to establish scoping rules for the bindings That's a very insightful comment, and right on target about DTDs. But back to an earlier point a poster made about SGML-conformance (DTDs, etc.) being the thing that is killing XML: If it weren't for the promise of backwards compatibility with SGML/HTML, XML could not have gathered the initial following that it did.
(Don't get me wrong; our shop is still largely an SGML shop. I'll be very sad if XML loses these connections. But I think that's where we are headed. Many people who are entering the XML community have never heard of SGML, and resent being encumbered by it.)

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From oren at capella.co.il Thu Mar 11 15:56:53 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:09:53 2004
Subject: Fw: ModSAX: Proposed Core Features
Message-ID: <02ee01be6bd6$a66a24a0$5402a8c0@oren.capella.co.il>

I asked:

>> Has anything similar [assembling processors based on feature requests]
>> been done in a different field, so we could reuse the
>> design lessons there? It seems like a pretty generic "stream processing"
>> problem.

Ronald Bourret wrote:

>I think there is an inherent assumption in this question that we are
>defining individual features that can be implemented by different parties
>and then randomly assembled to get a useful processor. While this is
>potentially a useful thing to do -- UNIX pipes are a good example -- it is
>not necessarily an easy thing to do, nor is it clear that this is a goal of
>ExModE-XSAX.

Well, at least the idea warrants some serious thought.

>We tried to do a similar thing in OLE DB, where database functionality
>would be broken down into individual services which could be assembled at
>will on top of a database driver.
>(Generally, this would be meaningful
>only for drivers for non-database sources, as drivers for existing
>databases already exposed most/all functionality.) The idea never really
>worked out, but here are some of the issues:
>
>* Are there enough useful features/components to make this worthwhile?

Good question. For SAX I'd say "probably yes". Here's a list of features (courtesy of David Megginson):

> http://xml.org/sax/features/validation
>   Validate (true) or don't validate (false).
> http://xml.org/sax/features/external-general-entities
>   Expand external general entities (true) or don't expand (false).
> http://xml.org/sax/features/external-parameter-entities
>   Expand external parameter entities (true) or don't expand (false).
> http://xml.org/sax/features/namespaces
>   Preprocess namespaces (true) or don't preprocess (false). See also
>   the http://xml.org/sax/properties/namespace-sep property.
> http://xml.org/sax/features/normalize-text
>   Ensure that all consecutive text is returned in a single callback to
>   DocumentHandler.characters or DocumentHandler.ignorableWhitespace
>   (true) or explicitly do not require it (false).

I'd like to see "http://xml.org/sax/features/xsl-transformation" as well. Anyway, all of the above seem to fall nicely into the pipeline framework.

>* What are the interfaces between components and how hard are they to
>implement?

Basically the SAX callbacks, probably extended so that the full document data is available (comments and so on). This seems pretty much a done deal.

>* How independent are the features?
>* Are there order dependencies between components?

This is a problem, as I've already pointed out. Take "normalize-text", for example. The effects of such a filter might be lost if it is followed by any of the entity expansion filters (say), not to mention an XSL one. However, most of the other features seem relatively independent. I'd say this isn't a fatal problem. It definitely doesn't affect the API I suggested.
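The order dependency described above can be sketched with two small filters. This is only an illustration, written against Python's xml.sax module (a later SAX2 binding) rather than the ModSAX proposal under discussion. NormalizeText, SplitChars, Collector, base_parser and run are hypothetical names, and SplitChars is merely a stand-in for any stage that fragments text, such as an entity-expansion filter.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces
from xml.sax.saxutils import XMLFilterBase


class SplitChars(XMLFilterBase):
    """Hypothetical stage that fragments text into one event per character."""

    def characters(self, content):
        for ch in content:
            super().characters(ch)


class NormalizeText(XMLFilterBase):
    """Hypothetical 'normalize-text' stage: one characters() call per text run."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._pending = []

    def _flush(self):
        # Emit any buffered text as a single event.
        if self._pending:
            super().characters("".join(self._pending))
            self._pending = []

    def characters(self, content):
        self._pending.append(content)

    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        super().endElement(name)


class Collector(ContentHandler):
    """Records every characters() event that reaches the application."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def base_parser():
    parser = xml.sax.make_parser()
    # Features are requested by URI, as in the list quoted above.
    parser.setFeature(feature_namespaces, False)
    return parser


def run(pipeline, data):
    collector = Collector()
    pipeline.setContentHandler(collector)
    pipeline.parse(io.BytesIO(data))
    return collector.chunks


DOC = b"<doc>one two three</doc>"
# Normalization applied last: the application sees one text event.
normalized_last = run(NormalizeText(SplitChars(base_parser())), DOC)
# Normalization applied first: the later stage fragments the text again.
normalized_first = run(SplitChars(NormalizeText(base_parser())), DOC)
print(len(normalized_last), len(normalized_first))
```

The two pipelines contain the same stages and differ only in nesting order, which is the case where the effect of normalization gets lost.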
>* Are performance penalties too high to separate features into separate
>components?

Unknown; I guess this depends on the feature and the implementation. But then, allowing one to build a system by combining filters doesn't mean one has to do so. Even inefficient pipelines are still very useful for ad-hoc processing, for prototyping systems, and so on. From the list of features above, I'd say that most won't suffer a serious penalty.

>* Who assembles the components -- the application, the processor, or a
>third party?

What I'm suggesting is we currently answer "for now, the application", and provide a simple, lightweight, low-level API which allows it to do so. More complex solutions could evolve later on. This seems to be in the SAX spirit.

>My personal feeling is that assembling XML processors completely on the fly
>is a pipe (if you will excuse the pun) dream. The world is simply not
>orthogonal enough to make this possible. Furthermore, there are too many
>performance gains to be had by tight integration of functionality to ever
>convince people to build things entirely as components with public
>interfaces.

Simon St.Laurent has made a good case for layering XML functionality - see http://www.simonstl.com/articles/layering/layered.htm. The list of features above seems to validate his claims.

My feeling is that pipelining is a valid approach. This is because there are quite a few features which fit this model, and each application needs its own special subset of them. If this weren't the case, we'd be designing SAX2.0 with a fixed set of features instead of ModSAX.

Have fun,

Oren Ben-Kiki

From david at megginson.com Thu Mar 11 16:06:16 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:53 2004
Subject: Oedipus XML (was Re: Namespaces and DTDs)
In-Reply-To: <36E7AE27.7DE6@hiwaay.net>
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no> <36E7AE27.7DE6@hiwaay.net>
Message-ID: <14055.59156.593634.998329@localhost.localdomain>

len bullard writes:

 > Jarle Stabell wrote:
 > >
 > > Richard Goerwitz wrote:
 > > > (Simplicity _was_ one of XML's primary goals back in the dark ages last
 > > > February.)
 > >
 > > It seems to me that the SGML compatibility requirement killed simplicity.
 > > (And gave a very confusing and hard-to-learn vocabulary)
 >
 > Or its inventors have discovered that assuming the mission of an
 > existing mature standard without acknowledging the complexity of
 > that mission leads to the same or worse complexity in the
 > invention.

XML has introduced some nasty new complexities, but many of those relate to providing proper Unicode support, and SGML would have had to deal with them anyway. (There were, of course, a couple of mistakes that added to the complexity, especially relating to entities and external subsets.)

Speaking as both a parser writer and an application writer, I am comfortable writing that XML is significantly simpler to support in enterprise-level implementations than full SGML, and that I have not actually yet really missed any of the SGML features excluded from XML.

To be fair, I am talking only about the core specs -- I am comparing ISO 8879 to the XML 1.0 REC, and am leaving out the peripheral standards on both sides.
A comparison of HyTime to XLink, XPointer, and Namespaces, of DSSSL to XSL, or of Topic Maps to RDF would be an interesting but separate exercise. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/

From b.laforge at jxml.com Thu Mar 11 16:37:20 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:53 2004
Subject: Namespaces and DTDs
Message-ID: <002001be6bdd$cf44f560$46026982@thing1.camb.opengroup.org>

From: len bullard >Darn. Maybe LISP was the right language after all and forty years >of computer scientists just didn't "get it". Lisp and XML have a few things in common, like being easy to determine if they are well formed. Frankly, I think XML will be better in the long run because it can be validated against various schema. Bill

From david at megginson.com Thu Mar 11 16:40:52 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:54 2004
Subject: One more ModSax naming try...
In-Reply-To: <002a01be6bbb$3da914a0$c8a8a8c0@thing1>
References: <002a01be6bbb$3da914a0$c8a8a8c0@thing1>
Message-ID: <14055.61833.969345.509241@localhost.localdomain>

Bill la Forge writes: > What about open?
> OpenParser/OpenSAX.
> With the new extensions, we are not constrained by the interface--it's quite "open".

Not bad, but we weren't really closed to begin with.

All the best,

David

--
David Megginson david@megginson.com http://www.megginson.com/

From richard at goon.stg.brown.edu Thu Mar 11 16:41:45 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:09:54 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E7B4B3.999188F5@mitre.org>
Message-ID: <36E7F1DB.608598CC@goon.stg.brown.edu>

"Roger L. Costello" wrote, re elements like "creator" (which may not be defined by a given DTD, but which must occur in a document instance that is using RDF):

> My agent has never seen the property "creator", so it follows the
> namespace URI to the property schema...

Okay, so your agent is reading the document. It runs into an element in another RDF namespace. You want to use that namespace's URI component to read in additional schema information.

Two problems: 1) namespace URIs don't necessarily point to schemas, and 2) if they did, you'd be extending the schema mechanism in a way that's incompatible with DTDs, as they're normally defined and understood.

I don't know if it's possible, from an implementation standpoint, to add the DTD after you've already started parsing the document. And if you could, I don't know whether doing so would be reasonable.

Surely this sort of problem has been discussed in the SGML community.
Can someone who has hashed all the details out already perhaps post with some commentary?

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From Bruce.Duffy at westgroup.com Thu Mar 11 16:47:20 1999
From: Bruce.Duffy at westgroup.com (Duffy, Bruce)
Date: Mon Jun 7 17:09:54 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <7BA102761CAED111B27E00805FBB72333FAE4C@arrowhead.int.westgroup.com>

Hi folks,

One feature I'd really like to see is a Locator.getByteOffset() method. Obviously this feature would have to be optional, since not all XML inputs are indexable files. James Clark's non-SAX API for XP implements this method for startElement(), but not for the characters() callback, which unfortunately is exactly what I need it for. I could hack XP or another parser, but I'd much rather work within the context of SAX.

One name for such a feature is:

http://xml.org/sax/features/locator.byteOffsets
  (true) means getByteOffset() is supported for startElement, endElement, and character callbacks.
  (false) means it is not supported for those callbacks.

Alternatively, if there's some reason why this feature is a Bad Idea, I'd like to know why!

Thanks,

Bruce Duffy
West Group

From creitzel at mediaone.net Thu Mar 11 16:58:56 1999
From: creitzel at mediaone.net (Charles Reitzel)
Date: Mon Jun 7 17:09:54 2004
Subject: Namespaces and DTDs
Message-ID: <199903111654.LAA04302@chmls06.mediaone.net>

Ah, my favorite thing to hate about XML.

Seriously, though. I have yet to hear of a single real application that needs element level prefix declarations. Not one! The PI was just fine for 99.99% of applications. The 0.01% should simply not use XML (or may need an additional layer, such as AF or a schema processor). Element-declared namespaces are a solution in search of a problem. Unfortunately, namespaces have effectively killed DTD validation.

My wish list for namespaces is as follows:

1) The prefix should be set by document author, *not* the DTD author.
2) The FPI should be set by the DTD author.
3) Prefixes should have document scope.
4) Namespaces should be part of XML proper and *not* an add on.
5) Element names should be resolved in the namespace of the nearest ancestor.

Until most of these conditions are met, I predict the demise of DTD's. It may be too late already...

Best regards,

Charles Reitzel

>From: james anderson
>Date: Wed, 10 Mar 1999 11:49:51 +0100
>Subject: Re: Namespaces and DTDs
>
>That "REC-xml-names-19990114" does not provide any means to establish
>prefix<->uri bindings for a DTD has long been a point of contention. A cursory
>search of the archives will bear this out.
The decision to eliminate the >combined prefix/uri/dtd binding (the original pi form) was, however, correct, >as the pi form, at least as proposed in "WD-xml-names-19980327", would not >have been sufficient to handle such things as a dtd which needs multiple >prefix bindings or the situation where a given prefix<->uri binding is to >apply to multiple schema sources. > >While it is true that some mechanism is necessary, a form - as discussed below >- - which effected a singular binding would also not have solved the problem. >"Everyone" would seem to be waiting for "schemas".... > >Marc.McDonald@Design-Intelligence.com wrote: >> >> A simple extension to namespaces could have fixed this problem: >> 1. Allow a DTD to be optionally specified along with the namespace >> prefix and URI >> 2. When an element is prefixed, parse it using the DTD associated with >> the namespace and the given prefix as the default. >> 3. If no DTD is associated with the prefix or not validating, do what >> is done now (ensure element is well-formed).

From jborden at mediaone.net Thu Mar 11 17:16:18 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:09:54 2004
Subject: Namespaces and DTDs
Message-ID: <00dc01be6be1$c4587e20$0b2e249b@fileroom.Synapse>

Bill la Forge wrote: >From: len bullard >>Darn. Maybe LISP was the right language after all and forty years >>of computer scientists just didn't "get it". > > >Lisp and XML have a few things in common, like being easy to >determine if they are well formed.
Frankly, I think XML will be >better in the long run because it can be validated against various >schema. > LISP defines a serialization format for lists and atoms (s-expressions) which employs '(' and ')' in an analogous fashion to XML being a serialization format for trees. LISP also defines a set of rules by which lists are eval'd as functions with arguments. Aside from syntactic issues, '<' and '>' could be used as s-expression delimiters without significant change to the LISP interpreter (aside from the parsing routine). In order to properly compare LISP with XML, then, we would need to propose a set of rules whereby *x-expressions* were evaluated. The closest we have today is XSL which is not currently a fair comparison to LISP (e.g. try writing a compiler or word processor in XSL :-)) Jonathan Borden http://jabr.ne.mediaone.net

From rbourret at ito.tu-darmstadt.de Thu Mar 11 18:03:04 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:09:54 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <01BE6BF1.B0427BB0@grappa.ito.tu-darmstadt.de>

Oren Ben-Kiki wrote: > >* What are the interfaces between components and how hard are they to > >implement? > > Basically the SAX callbacks, probably extended so that the full document > data is available (comments and so on). This seems pretty much a done deal. and also wrote: > >* Who assembles the components -- the application, the processor, or a > >third party?
>
> What I'm suggesting is we currently answer "for now, the application", and
> provide a simple, lightweight, low-level API which allows it to do so. More
> complex solutions could evolve later on. This seems to be in the SAX spirit.

If the application assembles the components and the interface between them is SAX, what do we need that SAX filters don't already give us? In other words, does anything need to be done to OpenSAX (best name so far) to support this besides adding the ParserFilter interface?

The other question that occurs to me is how useful/common it is to dynamically assemble a processor at run time. That is, are there really applications (outside of test environments) that allow the user to designate their parser at run time (or even installation time) and therefore need to cover any possible deficiencies in the chosen parser? What is gained by allowing the user to choose the parser?

Note that this is a very different situation from, say, using different ODBC drivers. In the case of ODBC drivers, you are choosing a different source of data (type of database) and application writers have a strong incentive to support multiple databases through ODBC. In the case of XML, the source of data is always the same XML document and the choice of parser becomes a trade-off between speed, reliability, feature-set, etc. Since the application writer knows the feature set ahead of time, why not just hard-code the required parser and SAX filters and be done with it?

(Yes, I know that "hard-code" is a bad word and I shudder as I write it, but I really am curious if anybody out there has a real-world application that allows users to change parsers and what the benefits of this are besides the ability to say, "Oh, look. I'm using a different parser.")

In this view, the utility of SAX is not the ability to change parsers at run time, but to change them over time as reliability, speed, size, etc. of the parsers change.
It also means that application writers can learn a single interface (SAX) and then choose parsers as they are appropriate to the application without having to learn different interfaces for different parsers.

The ability to request features in OpenSAX allows the application to request processor behavior, which is slightly different from assembling a suitable parser. For example, if I have an application that doesn't need validation, but the parser I want to use does validation by default, I would like to be able to turn that off.

Just to be clear, I'm not necessarily against assembling processors based on a feature set. I just believe that it is far more complex than it appears at first glance and am not convinced that it's worth the trouble.

-- Ron Bourret

From b.laforge at jxml.com Thu Mar 11 18:03:35 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:54 2004
Subject: Opening SAX for better filter support
Message-ID: <007701be6be9$f9bfc840$46026982@thing1.camb.opengroup.org>

A fixed API has lots of advantages in terms of service/user. Each can be implemented to the API without being bound to the other. And if you do need a non-standard feature, you isolate the code that has such a dependency. Overall, a very manageable situation unless you move too far out of scope of the API.

Introduce middleware and everything changes. Now you want an open API that permits unanticipated interactions between the service/user without needing to completely bypass the middleware.
With the advent of SAX filters, we have now moved to having a need for a more open API, and David's proposal seems to fit that need precisely.

Consider a complex of stacked and nested filters wrapping a parser. This composition is something which might be best done separately from the application itself, but the application may still need to access various parts. Indeed, a good design would keep as much of the application as possible independent of any particular structure, as the structure may need to change if we change parsers or introduce more appropriate filters.

Think of this complex of parser and filters as some kind of aggregate that is best treated as a gray box by the application--the application may need to identify and interact with various parts of the aggregate, but doesn't know the overall structure.

The new get and set methods are exactly what we need. We can present a named object to the aggregate and, by routing the request through the aggregate, the component which knows what to do with that object can process it. Conversely, we can request a reference to a component or result by name and the appropriate component is able to respond.

Now while none of this may be terribly efficient, it doesn't need to be--these are calls that are made for configuration or to access results. So it should work and work beautifully.

Bill

From b.laforge at jxml.com Thu Mar 11 18:29:48 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:54 2004
Subject: Namespaces and DTDs
Message-ID: <00c501be6bed$96b33260$46026982@thing1.camb.opengroup.org>

From: Jonathan Borden
> The closest we have today is XSL which is not currently a fair
>comparison to LISP (e.g. try writing a compiler or word processor in XSL
>:-))

I like to use XML to do compositions of components, which encompasses the declarative rather than the procedural aspects of programming. What I like is that a schema can then validate a composition, allowing clients to send a composition to a server to construct an agent, but without the security problems that you would otherwise have.

Bill

From cowan at locke.ccil.org Thu Mar 11 18:36:24 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:54 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E7B4B3.999188F5@mitre.org>
Message-ID: <36E80E9C.FC04B18@locke.ccil.org>

Roger L. Costello wrote:

> Okay, here's where the rub comes.
> Let me give you a couple of quotes
> from the RDF spec (the *'s I have put in and are my way of emphasizing
> the words that I wish for you to really focus on): "Property names
> *must* be associated with a schema. This can be done by qualifying the
> element names with a namespace prefix to unambiguously *connect* the
> property definition with the corresponding RDF schema ..."

Watch the modal verbs! Property names *must* be associated with a schema, but this can (i.e. *may*) be done by making the URI to which the namespace prefix is bound the actual URI of the schema document. There may be other ways to do it.

Besides, RDF is free to set tighter limits on the URIs used to identify namespaces than XML in general. XML-based standards can always set extra requirements, like the SMIL requirement (clause 5.1) that there be no internal DTD subset in SMIL documents.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Thu Mar 11 18:53:59 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:54 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E7B4B3.999188F5@mitre.org> <36E7F1DB.608598CC@goon.stg.brown.edu>
Message-ID: <36E812D1.CE73AEE7@locke.ccil.org>

Richard L. Goerwitz wrote:

> Okay, so your agent is reading the document. It runs into an element
> in another RDF namespace.
You want to use that namespace's URI component > to read in additional schema information. > > Two problems: 1) namespace URIs don't necessarily point to schemas, and > 2) if they did, you'd be extending the schema mechanism in a way that's > incompatible with DTDs, as they're normally defined and understood. RDF namespace declarations *may* (and even perhaps should) point to RDF schemas, which are not XML schemas at all. They declare RDF classes and properties, not XML elements and attributes. Both RDF statements and RDF schemas are normally represented in XML, but other representations (graphical) also exist. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From david at megginson.com Thu Mar 11 19:27:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:54 2004
Subject: Fw: ModSAX: Proposed Core Features
In-Reply-To: <02ee01be6bd6$a66a24a0$5402a8c0@oren.capella.co.il>
References: <02ee01be6bd6$a66a24a0$5402a8c0@oren.capella.co.il>
Message-ID: <14056.6147.757421.124783@localhost.localdomain>

Oren Ben-Kiki writes: > I'd like to see "http://xml.org/sax/features/xsl-transformation" as > well. Anyway, all of the above seem to fall nicely into the > pipeline framework. How about "http://capella.co.il/~oren/sax/features/xsl-transformation" (or whatever is suitable for your web rights)?
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From martind at netfolder.com Thu Mar 11 19:28:45 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:09:54 2004 Subject: FW: Namespaces and DTDs Message-ID: Hi I am using these simple rule of thumb: a) a XML DTD is useful for XML editors not for XML renderers b) Most XML renderers (XSL, CSS or DSSSL won't do document validation) c) a XML interpreter do not need a DTD (something else than rendition) If I need a DTD at the receiving end, then I am now no longer in the XML world but in the SGML world because the receiving end needs a validating parser. Several SGML parser like for instance SP can parse XML simplifyed DTD. The only simplification I gained is the -- or -0 think called omitags. Therefore, because I have to include a DTD for validation, better use then a SGML format. However, on the Web, to reduce complexity, I should not assume that the receiving end has a validating parser. Thus, because my XML document has been validated with my XML editor or by any other validation program. The receiving end makes the reasonnable assumption that if the docuement is a XML docuement it is "well formed" and valid. Its a lot simplier that way. Regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 11 19:31:36 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:54 2004 Subject: Namespaces and DTDs In-Reply-To: <199903111654.LAA04302@chmls06.mediaone.net> References: <199903111654.LAA04302@chmls06.mediaone.net> Message-ID: <14056.6319.57210.877490@localhost.localdomain> Charles Reitzel writes: > Seriously, though. I have yet to hear of a single real application > that needs element level prefix declarations. Not one! I'll paraphrase the use case as follows (I'll leave the source anonymous): A server wants to construct a large XML document as the response to a client request, and it does so by handing off the work to several parallel processes and then concatenating the results into a single document. If each of the processes can declare its own namespaces, then it is not necessary to establish complicated negotiation channels between the top-level process and the child processes to obtain the correct namespace declarations. Before everyone rushes out to shoot holes in this use case, I'd like to note that I still have callouses on my trigger finger from doing so myself. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From martind at netfolder.com Thu Mar 11 19:57:40 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:09:54 2004
Subject: Namespaces and DTDs
In-Reply-To: <002001be6bdd$cf44f560$46026982@thing1.camb.opengroup.org>
Message-ID:

Hi Bill,

> Lisp and XML have a few things in common, like being easy to
> determine if they are well formed. Frankly, I think XML will be
> better in the long run because it can be validated against various
> schema.

I am not sure of that.

a) A Lisp document could be made SGML compliant, because SGML lets you define the delimiters of begin and end tags (e.g. DSSSL).
b) If the previous proposition is true, then you can also change the delimiters and keep the structural coherency.
c) You could also enforce that begin and end tags conform to the well-formedness constraint.
d) An XML document is a hierarchy, and a hierarchy can be mapped onto list constructs.

In fact, as soon as you map Lisp to SGML and then to XML, you immediately notice the similarities. A formal transformation is possible from one structure to the other. In mathematical terms we could talk of a "topological" transformation from one to the other. Their structures are similar enough to transform one into the other.

Conclusion: we should not take what Jonathan said so lightly, and should do some homework first.

This said, I agree that XML could potentially be more successful than Lisp or SGML or (fill in here other less-than-popular good ideas), but for reasons other than technical ones. For instance, it could become very popular because the Web is popular and XML benefits from the aura effect. Also because important software manufacturers are behind it and put compliant products on the market. Also because people don't want to miss the next big Web success, etc. This has nothing to do with technical virtues and more to do with marketing virtues. But surely not because XML is better than Lisp on the grounds that it can be validated against different schemas.

a) XML has the advantage, because of its strict syntax (compared to SGML's OMITTAG), that a receiver does not need to validate the structure to interpret the XML document. In fact, there is a high probability that interpreters will "hard code" in some way what to do for each element, without the need for a DTD (except for style languages, which will "hard code" tree manipulation and the formatting object model).
b) If a DTD is necessary, why not use SGML, except for the marketing advantage?
c) Another usage of XML is to separate the content from the rendition. In this case, most browsers' style engines won't contain a validating parser, and therefore the validation mechanism is irrelevant.

Conclusion: XML will be better simply because it has marketing momentum, not because of its technical merits, period. The whole difference between SGML and XML is that the receiver does not necessarily need validation to interpret the document (because of the "well formed" constraint). But from the marketing point of view it has a huge advantage. New domain languages can be created, and big software manufacturers can regain some control by creating a domain language and letting the numbers create a de facto standard. In fact, HTML, by being a standard domain language, is more of a threat to big manufacturers than XML is. So, if XML is to be more popular, it is surely for marketing reasons :-)

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From martind at netfolder.com Thu Mar 11 20:27:45 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:09:55 2004
Subject: RDF not conforming to the Namespace spec?
In-Reply-To: <36E7F1DB.608598CC@goon.stg.brown.edu>
Message-ID:

Hi

> Okay, so your agent is reading the document. It runs into an element
> in another RDF namespace. You want to use that namespace's URI
> component to read in additional schema information.
>
> Two problems: 1) namespace URIs don't necessarily point to schemas,
> and 2) if they did, you'd be extending the schema mechanism in a way
> that's incompatible with DTDs, as they're normally defined and
> understood.
>
> I don't know if it's possible, from an implementation standpoint, to
> add the DTD after you've already started parsing the document. And
> if you could, whether doing so would be reasonable. Surely this
> sort of problem has been discussed in the SGML community. Can
> someone who has hashed all the details out already perhaps post with
> some commentary?

You're right on this. For RDF, the validation mechanism is not tied to the namespace. From the receiver's point of view, the namespace mechanism is to be seen as a way to prevent name collisions within the same document space (including documents linked to the document). A simple parser could then process the complete markup name as a whole word (i.e. MySameSpace:MymarkupName). It could happen, however, that two namespaces collide (i.e. two namespaces have the same namespace ID and the same markup name). In that case the parser need not take any chances: it can replace the namespace ID with the URI (if the URI is unique) and be sure that the element name is now unique (i.e. :MyMarkupName). The whole point is to be sure that we do not have name collisions in the document namespace (I mean here the document's complete set of names).

For RDF, the property list is defined by a schema. RDF is like directory service schemas: you have to define a record or property set with a schema, and you also define entity relationships with the schema. The parser does not have to use a DTD as a validation mechanism; it only needs the trick of replacing the namespace ID with the URI if we want to reduce name collisions to near-zero probability. However, this is not a validation mechanism; it is a name collision resolution mechanism like the one used in languages such as C++. (Practically, you replace the namespace ID with the URI to create a unique element name, no more, no less: MyNSID:MyElementName becomes http://www.netfolder.com/:MyElementName. There is now a very low probability that a linked document would contain an element with the same name.)

This is for the parsing side. Now for the interpretation side: an RDF interpreter (which uses an XML parser) has to know the object's property set to do something with it. This something could be to build a "frame" for the object. A frame, to recall, is like a record. The frame can be strongly typed by a schema that says what the frame is allowed to contain and what relationships it has with other frames. A schema is not a DTD, because the validation is not at the syntax level but at the interpretation level.

Let's take an example: we want to import data into a directory service, and to do so we use RDF. To be sure that the XML parser won't have any name collisions we could use namespaces; otherwise, if the document's set of names is controlled, the use of namespaces is superfluous.
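
The prefix-to-URI substitution described above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper `expand_name`, the hand-written prefix tables, and the second URI are made up here, and a real parser would build the prefix table from the declarations in the document.

```python
def expand_name(qname, prefix_to_uri):
    """Replace the namespace ID (prefix) of a name with its URI.

    Hypothetical helper: `prefix_to_uri` stands in for the prefix
    declarations a parser would collect from one document.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No prefix at all: the name is left unchanged.
        return qname
    return prefix_to_uri[prefix] + ":" + local


# Two documents that happen to pick the same namespace ID.
doc_a = {"MyNSID": "http://www.netfolder.com/"}
doc_b = {"MyNSID": "http://example.org/other/"}  # assumed URI, for illustration

name_a = expand_name("MyNSID:MyElementName", doc_a)
name_b = expand_name("MyNSID:MyElementName", doc_b)

print(name_a)
print(name_b)
print(name_a == name_b)
```

The two prefixed names are identical as written, but they no longer compare equal once the namespace ID has been replaced, which is all the collision-resolution step is meant to achieve.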
Thus, let us have a directory record for a user on a network.

Albert Einstein etc....

The XML parser has enough to do its job, but the RDF interpreter now needs to know the "frame" schema, or object category constraints. Thus, the RDF interpreter can ask the XML parser to parse the XML-based schema document to learn the "frame" constraints. Once the parsing is done, it can compare each frame property with the schema to know whether the "frame" is valid or not. It could also add a new schema to the directory service if the object category is new to the directory.

Conclusion: the schema stuff is useful for the interpreter, not for the syntax parser, which in this case is the XML parser. We have to keep in mind that XML is for the syntax, and other mechanisms may have to be provided to the syntax parser's client: the interpreter. An RDF interpreter then uses the XML parser to convert a) the RDF document and b) the schemas into a structure that can be manipulated, and then "interprets" it, for instance to import data into a directory service. An XML document is like a sleeping beauty without an interpreter :-)

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

From richard at goon.stg.brown.edu Thu Mar 11 20:33:41 1999
From: richard at goon.stg.brown.edu (Richard L.
Goerwitz)
Date: Mon Jun 7 17:09:55 2004
Subject: Namespaces and DTDs
References: <199903111654.LAA04302@chmls06.mediaone.net> <14056.6319.57210.877490@localhost.localdomain>
Message-ID: <36E8285F.DEB5CE9@goon.stg.brown.edu>

David Megginson wrote, re why namespaces are needed:

> I'll paraphrase the use case as follows (I'll leave the source
> anonymous):
>
> A server wants to construct a large XML document as the response to
> a client request, and it does so by handing off the work to several
> parallel processes and then concatenating the results into a single
> document. If each of the processes can declare its own namespaces,
> then it is not necessary to establish complicated negotiation
> channels between the top-level process and the child processes to
> obtain the correct namespace declarations.

Maybe it's just me, but this sort of statement would have more credibility if there were more evidence of widespread practical application of this technique. Most successful standards are based, in large part, on experience and wisdom people gain from actually doing a thing. A lot.

Am I missing something here?

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at oreilly.com Thu Mar 11 20:34:30 1999 From: crism at oreilly.com (Chris Maden) Date: Mon Jun 7 17:09:55 2004 Subject: Namespaces and DTDs In-Reply-To: Message-ID: <199903112033.PAA24971@ruby.ora.com> [Didier PH Martin] > a) a Lisp document could be made SGML compliant because SGML can let > you define begin and end tag's delimiters (Ex: dsssl). I think there's a little confusion here about DSSSL. DSSSL stylesheets are SGML documents, but they usually use angle-brackets: (default (make sequence)) The parentheses are only character data. I don't think that Lisp could be made SGML compliant; the delimiters could be redefined, but as Steve DeRose notes in _The SGML FAQ Book_, there are some limits to the flexibility of the redefinitions, since some delimiter roles are overloaded. Also, Lisp doesn't have the equivalent of start-tag close, and you can only omit tagc if the next character is stago or etago (ISO 8879:1986, clause 7.4.1.2) which it wouldn't be when you get to the leaves of a structure. -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Thu Mar 11 22:03:18 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:09:55 2004 Subject: Java DOM Parsers Message-ID: <008901be6c09$465da400$0300000a@othniel.cygnus.uwa.edu.au> >What companies supply java DOM API's and other xml api tools? Any >suggestions on which to go with? I'll leave others to suggest which to go with. But for a list, see: http://www.xmlsoftware.com/utilities/ http://www.xmlsoftware.com/parsers/ James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Full-day XML Tutorial @ WWW8 : http://www8.org/ Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Marc.McDonald at Design-Intelligence.com Thu Mar 11 22:20:05 1999 From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com) Date: Mon Jun 7 17:09:55 2004 Subject: Namespaces and DTDs Message-ID: So make a namespace declaration a PI and add an "not using this namespace anymore" PI. Then use simple occurrence scoping: Process result: .... 
Process gets to define the prefixes that override any previous definition, old definition (if any) restored by XMLENDNS. No problem with concatenation. Marc B McDonald Principal Software Scientist Design Intelligence, Inc www.design-intelligence.com ---------- From: David Megginson [SMTP:david@megginson.com] Sent: Thursday, March 11, 1999 11:30 AM To: XML Developers' List Subject: Re: Namespaces and DTDs Charles Reitzel writes: > Seriously, though. I have yet to hear of a single real application > that needs element level prefix declarations. Not one! I'll paraphrase the use case as follows (I'll leave the source anonymous): A server wants to construct a large XML document as the response to a client request, and it does so by handing off the work to several parallel processes and then concatenating the results into a single document. If each of the processes can declare its own namespaces, then it is not necessary to establish complicated negotiation channels between the top-level process and the child processes to obtain the correct namespace declarations. Before everyone rushes out to shoot holes in this use case, I'd like to note that I still have callouses on my trigger finger from doing so myself. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Fri Mar 12 00:44:34 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:09:55 2004 Subject: DocumentHandler with xml4j DOMParser In-Reply-To: <3601b76a.110299@smtpgate1.ONE2ONE.CO.UK> Message-ID: <000301be6c20$a6e177e0$d3228018@jabr.ne.mediaone.net> Perhaps the confusion is this: A DocumentHandler is a SAX concept, not a DOM concept. The DOMParser contains a DocumentHandler that builds a DOM tree from the source document. If you are working with the DOM, then you will parse the document and then access its members through the DOM interfaces. If you would rather process using an event based interface, then use SAX directly i.e. the SAXParser. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrew at squiz.co.nz Fri Mar 12 02:42:04 1999 From: andrew at squiz.co.nz (Andrew McNaughton) Date: Mon Jun 7 17:09:55 2004 Subject: Namespaces and DTDs In-Reply-To: Your message of "Thu, 11 Mar 1999 14:19:06 -0800." Message-ID: <199903120241.PAA10692@aniwa.sky> Documents resulting from queries run on the concatenated document would tend to cause problems, as query results don't generally return the context of the XML elements returned. 
This problem also applies to queries run across multiple documents unless their DTD's are identical, which perhaps suggests that an answer to this problem has to come from the query languages. Andrew McNaughton Marc.McDonald@Design-Intelligence.com wrote: > So make a namespace declaration a PI and add an "not using this > namespace anymore" PI. Then use simple occurrence scoping: > > Process result: > > .... > > > Process gets to define the prefixes that override any previous > definition, old definition (if any) restored by XMLENDNS. No problem > with concatenation. > > > Marc B McDonald > Principal Software Scientist > Design Intelligence, Inc > www.design-intelligence.com > > > ---------- > From: David Megginson [SMTP:david@megginson.com] > Sent: Thursday, March 11, 1999 11:30 AM > To: XML Developers' List > Subject: Re: Namespaces and DTDs > > Charles Reitzel writes: > > > Seriously, though. I have yet to hear of a single real > application > > that needs element level prefix declarations. Not one! > > I'll paraphrase the use case as follows (I'll leave the source > anonymous): > > A server wants to construct a large XML document as the response to > a client request, and it does so by handing off the work to several > parallel processes and then concatenating the results into a single > document. If each of the processes can declare its own namespaces, > then it is not necessary to establish complicated negotiation > channels between the top-level process and the child processes to > obtain the correct namespace declarations. > > Before everyone rushes out to shoot holes in this use case, I'd like > to note that I still have callouses on my trigger finger from doing > so > myself. > > > All the best, > > > David > > -- > David Megginson david@megginson.com > http://www.megginson.com/ > > xml-dev: A list for W3C XML Developers. 
To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > -- ----------- Andrew McNaughton andrew@squiz.co.nz http://www.newsroom.co.nz/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Fri Mar 12 05:13:49 1999 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:09:55 2004 Subject: Oedipus XML (TIe Your Mother Down) References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no> <36E7AE27.7DE6@hiwaay.net> <14055.59156.593634.998329@localhost.localdomain> Message-ID: <36E8A1E8.1B@hiwaay.net> David Megginson wrote: > > Speaking as both a parser writer and an application writer, I am > confortable writing that XML is significantly simpler to support in > enterprise-level implementations than full SGML, and that I have not > actually yet really missed any of the SGML features excluded from > XML. I agree with this. 
As an application writer who only has to parse parts of it and then in the context of using a relational system with XML editing, it looks very much the same to me as the simple features of SGML that I've always used. In effect, much of the nastier bits of SGML I did not use before. So, it looks much the same. It is a lot of fun to tie the treeviews, browser objects, tables, dialogs, combo boxes, etc. together into a generalized knowledge management system. Cheap too. ;-) > To be fair, I am talking only about the core specs -- I am comparing > ISO 8879 to the XML 1.0 REC, and am leaving out the peripheral > standards on both sides. A comparison of HyTime to XLink, XPointer, > and Namespaces, of DSSSL to XSL, or of Topic Maps to RDF would be an > interesting but separate exercise. Here I don't disagree, but in my work, the concepts of HyTime, DSSSL, the RDF Dublin Core, and namespaces influence my work. Learning to think beyond the DTD to the information properties of the metalanguage proves to be very useful and that is not something I did before. As activities like X3D ramp up, I find I am applying more and more of the wall-to-wall markup concepts from the middle years of SGML and they work in the XML infrastructure of tools. This is actually quite delightful. len xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From cbullard at hiwaay.net Fri Mar 12 05:19:29 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:55 2004
Subject: Namespaces and DTDs
References: <002001be6bdd$cf44f560$46026982@thing1.camb.opengroup.org>
Message-ID: <36E8A33A.43BB@hiwaay.net>

Bill la Forge wrote:
>
> From: len bullard
> >Darn. Maybe LISP was the right language after all and forty years
> >of computer scientists just didn't "get it".
>
> Lisp and XML have a few things in common, like being easy to
> determine if they are well formed. Frankly, I think XML will be
> better in the long run because it can be validated against various
> schema.

As much as I resisted it in the early working groups for various reasons, I find myself agreeing with the position that it is good to have formal definitions for both well-formed and validated information. I had worked in that mode in the IDE/AS, IADS and GE systems, but the notion wasn't formally expressed.

I like ISO 8879 DTDs mainly because they are, for me, much easier to read and use to parse in my head. As I implement more with relational systems and use the tables to store the property sets of both schemata, properties of schemata as well as instances, I think I have more insight now into why people want multiple schema types even without the obvious extensions such as inheritance.

nodes is nodes is nodes.

len

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Fri Mar 12 05:26:40 1999 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:09:55 2004 Subject: FW: Namespaces and DTDs References: Message-ID: <36E8A4EB.375F@hiwaay.net> Didier PH Martin wrote: > > Hi > > I am using these simple rule of thumb: > > a) a XML DTD is useful for XML editors not for XML renderers > b) Most XML renderers (XSL, CSS or DSSSL won't do document validation) > c) a XML interpreter do not need a DTD (something else than rendition) > > If I need a DTD at the receiving end, then I am now no longer in the XML > world but in the SGML world because the receiving end needs a validating > parser. Several SGML parser like for instance SP can parse XML simplifyed > DTD. The only simplification I gained is the -- or -0 think called omitags. > Therefore, because I have to include a DTD for validation, better use then a > SGML format. > > However, on the Web, to reduce complexity, I should not assume that the > receiving end has a validating parser. Thus, because my XML document has > been validated with my XML editor or by any other validation program. The > receiving end makes the reasonnable assumption that if the docuement is a > XML docuement it is "well formed" and valid. That's mostly true because web documents don't stick around. In cases where information is moving across multiple processes or sits in some long term archival, it is very handy to be able to validate it on the receiving end. 
This will become more apparent to the XML community when they get to do the sort of work the SGML community did a decade after the first SGML applications fielded instances. Things change. Finding those changes quickly is the key to cheap rehosting. In my experience, if DTDs die, someone gets to reinvent them and it will be painful. Otherwise, yes, the DTD is much more useful in the editor in the initial part of the information lifecycle. len > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Fri Mar 12 08:29:31 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:55 2004 Subject: Namespaces and DTDs Message-ID: <01BE6C6A.B2052F50@grappa.ito.tu-darmstadt.de> Didier PH Martin wrote: > a) a XML DTD is useful for XML editors not for XML renderers > b) Most XML renderers (XSL, CSS or DSSSL won't do document validation) > c) a XML interpreter do not need a DTD (something else than rendition) (c) is not always true because DTDs are used for more things than just validation. For example, DTDs are used to define internal general entities, attribute defaults, and attribute types. (The latter is important, for example, if a processor expects to build links based on ID/IDREF attributes or process according to notations.) -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lucio.piccoli at one2one.co.uk Fri Mar 12 08:44:10 1999 From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI) Date: Mon Jun 7 17:09:55 2004 Subject: XML DATA Message-ID: <3601bf97.120299@smtpgate1.ONE2ONE.CO.UK> Hi all, I would like to know the status of XML Data proposal and it's take up in the XML community. Currently the only XML data parser that i found is from MS. Does anyone else plan on supporting XML Data in the future? If not, what is the alternative to XML DATA ? adios -lucio --------------------------------------------------------------------- One2One LUCIO.PICCOLI@one2one.co.uk Elstree Tower tel : +44 181 214 3847 Elstree Way Borehamwood fax :+44 181 214 2325 LONDON WD6 1DT __________ http://www.one2one.co.uk _____________ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Fri Mar 12 09:16:47 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:55 2004 Subject: XML DATA Message-ID: <01BE6C71.4DE32A70@grappa.ito.tu-darmstadt.de> LUCIO PICOLLI wrote: > I would like to know the status of XML Data proposal and it's take up in > the XML community. Currently the only XML data parser that i found is > from MS. Does anyone else plan on supporting XML Data in the future? 
> If not, what is the alternative to XML DATA ? Outside of the Microsoft parser, XML Data is probably dead. There are three other schema proposals (SOX, DCD, and DDML) and the W3C is currently working on their own. XML Data is significant in that it seems to be the only schema language that is publicly supported by a parser. You can find the various schema language specs at: SOX: http://www.w3.org/TR/NOTE-SOX/ DCD: http://www.w3.org/TR/NOTE-dcd DDML: http://www.w3.org/TR/NOTE-ddml XML-Data: http://www.w3.org/TR/1998/NOTE-XML-data/ W3C XML Schema requirements: http://www.w3.org/TR/NOTE-xml-schema-req and an overview of the various schema languages at: http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/ index.htm OR http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/ XMLSchemas.ppt -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Fri Mar 12 09:19:58 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:09:55 2004 Subject: Fw: ModSAX: Proposed Core Features Message-ID: <00f501be6c68$bd213020$5402a8c0@oren.capella.co.il> Ronald Bourret wrote: >If the application assembles the components and the interface between them >is SAX, what do we need that SAX filters don't already give us? In other >words, does anything need to be done to OpenSAX (best name so far) to >support this besides adding the ParserFilter interface? Yes. One needs to _locate_ the necessary filters. Hence the registry, the query-for-a-feature, etc. 
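A minimal sketch of the "registry plus query-for-a-feature" idea, under stated assumptions: `ComponentRegistry` and its methods are hypothetical names invented for illustration and are not part of SAX or of any ModSAX draft, and the feature strings are placeholders. It only shows the shape of the lookup: components register under a feature name, and an application asks whether a feature is available before trying to obtain a component for it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical registry: maps a feature name to a factory for a component that provides it. */
public class ComponentRegistry {
    private final Map<String, Supplier<Object>> providers = new LinkedHashMap<>();

    /** Registers a factory under a feature name; a later registration replaces an earlier one. */
    public void register(String feature, Supplier<Object> provider) {
        providers.put(feature, provider);
    }

    /** Query-for-a-feature: true if some component was registered under this name. */
    public boolean supports(String feature) {
        return providers.containsKey(feature);
    }

    /** Locates and instantiates a component, or returns empty if none is registered. */
    public Optional<Object> locate(String feature) {
        Supplier<Object> provider = providers.get(feature);
        return provider == null ? Optional.empty() : Optional.of(provider.get());
    }

    public static void main(String[] args) {
        ComponentRegistry registry = new ComponentRegistry();
        // A string stands in for a real parser or filter object in this sketch.
        registry.register("org.xml.sax.parser", () -> "parser-stub");
        System.out.println(registry.supports("org.xml.sax.parser"));
        System.out.println(registry.locate("org.xml.sax.visitor").isPresent());
    }
}
```

Keeping the registry separate from the components themselves means an application can fall back to another provider when a lookup comes back empty, which is the portability case argued for in this thread.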
>The other question that occurs to me is how useful/common it is to >dynamically assemble a processor at run time. That is, are there really >applications (outside of test environments) that allow the user to >designate their parser at run time (or even installation time) and >therefore need to cover any possible deficiencies in the chosen parser? > What is gained by allowing the user to choose the parser? If there aren't, why bother with ModSAX at all? If I know exactly which class is used, I also know exactly which features it provides, right? The whole point of ModSAX is that this isn't the case. Think of it like this: XML support is not the same on all platforms. Sometimes there's a built-in SAX parser. It may or may not support some features. Sometimes there's an XSL processor. And so on. I'm talking about platforms existing today, or "real soon now" - IE5, server packages, etc. I want to write code which is _reasonably_ portable to such platforms. I accept the remark that a full-scale solution is beyond the scope of ModSAX. What I suggested is an interface in the spirit of SAX (I hope) - lightweight, simple, low-level, which allows future layering of higher-level solutions. >Note that this is a very different situation from, say, using different >ODBC drivers. In the case of ODBC drivers, you are choosing a different >source of data (type of database) and application writers have a strong >incentive to support multiple databases through ODBC. In the case of XML, >the source of data is always the same XML document and the choice of parser >becomes a trade-off between speed, reliability, feature-set, etc. On the contrary, I see it as being very similar to using ODBC drivers. ODBC drivers vary in their capabilities, and therefore have a mechanism for querying for particular features. So do XML components. There might be any number of ODBC drivers available in a particular system. Same for XML components.
And you typically have a pretty good idea of which ODBC driver you are going to use. Same for XML components. The last point doesn't invalidate the first two. BTW, have you ever tried to write a non trivial program which would work with any ODBC driver? I have. You have to at least negotiate its capabilities, find a match for your needs, and then the problems start - it doesn't like this join syntax, it can't do this particular form of query... You end up writing an adapter class which knows the particular nastiness of the particular driver. Of course this is due to SQL being such a weak standard; XML should be better in this regard - if we insist on well-defining features, that is. >Since the application writer knows the feature set ahead of time, why not >just hard-code the required parser and SAX filters and be done with it? > (Yes, I know that "hard-code" is a bad word and I shudder as a write it, >but I really am curious if anybody out there has a real-world application >that allows users to change parsers and what the benefits of this are >besides the ability to say, "Oh, look. I'm using a different parser.") Mine. I run on both IE5 ("hey, look, there's a built in XSL processor") and IE4 ("oh well, let's use XT"), not to mention some server platforms I'm considering. I'm also tentatively considering other XML features - namespaces and embedding. I doubt I'm unique in this regard. And as XML support starts crawling into popular platforms (examples abound), this would become more and more common. At least we hope so :-) >In this view, the utility of SAX is not the ability to change parsers at >run time, but to change them over time as reliability, speed, size, etc. of >the parsers change. It also means that application writers can learn a >single interface (SAX) and then choose parsers as they are appropriate to >the application without having to learn different interfaces for different >parsers. That's one view and a valid one. 
It shouldn't prevent the other one. >The ability to request features in OpenSAX allows the application to >request processor behavior, which is slightly different from assembling a >suitable parser. For example, if I have an application that doesn't need >validation, but I the parser I want to use does validation by default, I >would like to be able to turn that off. Right. I didn't suggest that the original question ("which features are supported") isn't important. What I suggested is that the second question ("how do I find a filter/parser which does X") is also important. If it wasn't, why do we have a ParserFactory class in SAX? BTW, I'm not happy with this "parser" fixation. SAX is an interface which allows processing an XML tree. I don't see why the special case ("input: text; output: SAX events") is any different then "input: DOM; output: SAX events", for example. That's why "org.xml.sax.parser" is just another "feature" in the API I suggested. "org.xml.sax.visitor" and "org.xml.sax.builder" would be on equal grounds. IMVHO, converting DOM to SAX and back is something which we will have to deal with. >Just to be clear, I'm not necessarily against assembling processors based >on a feature set. I just believe that it is far more complex than it >appears at first glance and am not convinced that it's worth the trouble. I think I've answered the complexity issue - the API I've suggested is anything but. It merely provides the basic building blocks. The application may be as complex or as simple as you want. Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Fri Mar 12 09:28:26 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:09:55 2004 Subject: Fw: ModSAX: Proposed Core Features Message-ID: <00fa01be6c69$e7a44160$5402a8c0@oren.capella.co.il> David Megginson wrote: >I wrote: > > I'd like to see "http://xml.org/sax/features/xsl-transformation" as > > well. Anyway, all of the above seem to fall nicely into the > > pipeline framework. > >How about "http://capella.co.il/~oren/sax/features/xsl-transformation" >(or whatever is suitable for your web rights)? I kind of doubt that any XSL processors would register themselves under this name :-) I don't think that implementations of standard W3C features should be under private names. Come to think of it, if we go the URI way (which I'm not happy with since it can't be used as a property name), the "right" URI is a pointer to the relevant W3C standard. Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Fri Mar 12 09:51:35 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:09:56 2004 Subject: FW: Namespaces and DTDs References: Message-ID: <36E8E733.7F84EA5E@mecomnet.de> Didier PH Martin wrote: > > I am using these simple rule of thumb: > > a) a XML DTD is useful for XML editors not for XML renderers if one presumes this, then one loses the ability to use attribute defaults and, thereby, for example, the chance to use "architectural" techniques. > b) Most XML renderers (XSL, CSS or DSSSL won't do document validation) > c) a XML interpreter do not need a DTD (something else than rendition) > > If I need a DTD at the receiving end, then I am now no longer in the XML > world but in the SGML world because the receiving end needs a validating > parser. these techniques do not presume validation, just the availability of attribute declarations. > Several SGML parser like for instance SP can parse XML simplifyed > DTD. The only simplification I gained is the -- or -0 think called omitags. > Therefore, because I have to include a DTD for validation, better use then a > SGML format. > xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Fri Mar 12 10:01:05 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:09:56 2004 Subject: Namespaces and DTDs References: <199903120241.PAA10692@aniwa.sky> Message-ID: <36E8E947.5B6EF6B7@mecomnet.de> of all things, the "context problems" would be minimal, so long as the decoding process maps the prefixed identifiers to universal identifiers. (which as best i can surmise all the "standard" parsers do.) the application wouldn't care where they came from and any reserialization would be responsible to get its own declarations in order. [i'm not arguing for this declaration form, just noting that it doesn't make the problem any more complex.] Andrew McNaughton wrote: > > Documents resulting from queries run on the concatenated document would tend > to cause problems, as query results don't generally return the context of the > XML elements returned. This problem also applies to queries run across > multiple documents unless their DTD's are identical, which perhaps suggests > that an answer to this problem has to come from the query languages. > > Andrew McNaughton > > Marc.McDonald@Design-Intelligence.com wrote: > > So make a namespace declaration a PI and add an "not using this > > namespace anymore" PI. Then use simple occurrence scoping: > > > > Process result: > > > > .... > > > > xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lucio.piccoli at one2one.co.uk Fri Mar 12 10:53:41 1999 From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI) Date: Mon Jun 7 17:09:56 2004 Subject: XML DATA Message-ID: <3601c191.120299@smtpgate1.ONE2ONE.CO.UK> Thanks for the info below, Ronald. I am very interested in the SOX schema. I guess I might as well ask a similar question to the one before: what is the take-up of SOX in the XML community? What do other people use for mapping native data types in XML? -lucio > > I would like to know the status of XML Data proposal and > it's take up in > > the XML community. Currently the only XML data parser that > i found is > > from MS. Does anyone else plan on supporting XML Data in the future? > > If not, what is the alternative to XML DATA ? > > Outside of the Microsoft parser, XML Data is probably dead. > There are > three other schema proposals (SOX, DCD, and DDML) and the W3C > is currently > working on their own. XML Data is significant in that it > seems to be the > only schema language that is publicly supported by a parser. > > You can find the various schema language specs at: > > SOX: http://www.w3.org/TR/NOTE-SOX/ > DCD: http://www.w3.org/TR/NOTE-dcd > DDML: http://www.w3.org/TR/NOTE-ddml > XML-Data: http://www.w3.org/TR/1998/NOTE-XML-data/ > W3C XML Schema requirements: http://www.w3.org/TR/NOTE-xml-schema-req > > and an overview of the various schema languages at: > http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/ index.htm OR http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/ XMLSchemas.ppt -- Ron Bourret xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 12 11:38:00 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:56 2004 Subject: Oedipus XML (TIe Your Mother Down) In-Reply-To: <36E8A1E8.1B@hiwaay.net> References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no> <36E7AE27.7DE6@hiwaay.net> <14055.59156.593634.998329@localhost.localdomain> <36E8A1E8.1B@hiwaay.net> Message-ID: <14056.64392.235652.11594@localhost.localdomain> len bullard writes: > > To be fair, I am talking only about the core specs -- I am comparing > > ISO 8879 to the XML 1.0 REC, and am leaving out the peripheral > > standards on both sides. A comparison of HyTime to XLink, XPointer, > > and Namespaces, of DSSSL to XSL, or of Topic Maps to RDF would be an > > interesting but separate exercise. > > Here I don't disagree, but in my work, the concepts of HyTime, DSSSL, > the RDF Dublin Core, and namespaces influence my work. Learning to > think beyond the DTD to the information properties of the metalanguage > proves to be very useful and that is not something I did before.
> As activities like X3D ramp up, I find I am applying more and more > of the wall-to-wall markup concepts from the middle years of SGML > and they work in the XML infrastructure of tools. This is actually > quite delightful. Precisely. The important point, though, is that none of these peripheral standards is hard-coded to SGML or XML. Although there are some minor lexical differences, in general you could apply Namespaces to SGML or HyTime to XML, you could use XSL to format an SGML document or DSSSL to format an XML document, etc. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rudman at idetix.com Fri Mar 12 14:02:05 1999 From: rudman at idetix.com (Dan Rudman) Date: Mon Jun 7 17:09:56 2004 Subject: Basic Question Message-ID: <000701be6c90$8e31ee80$49e9fdce@diablo.idetix.com> I apologize for the basic question in advance :) With the wealth of XML libraries available, I am more and more inclined to make use of these libraries to help me create, parse, and utilize my own tag markup language to be embedded within an HTML document. My understanding of XML at this point is that it must be well-formed or a fatal error occurs. If this is the case, how can I deal with the fact that most HTML documents are NOT well-formed and that most HTML design tools do not enforce, require, or even sometimes support, well-formedness in a document? Things would be rosy if I didn't have to rely on HTML, but my application requires it. Thanks. -- Dan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990312/525211ab/attachment.htm From david at megginson.com Fri Mar 12 14:06:09 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:56 2004 Subject: Mod??SAX: Revised Proposed Core Features 1999-03-12 Message-ID: <14057.7534.612787.424789@localhost.localdomain> Here's a revised list of the proposed core features for Mod??SAX. I've added one new feature -- http://xml.org/sax/features/use-locator -- which will explicitly request the parse to supply or not to supply a Locator through the DocumentHandler.setDocumentLocator callback. There are two possible advantages to including this feature: 1. If the application wants a locator, it can tell before beginning the parse whether the parser can supply one. 2. If the application does not want a locator, the SAX parser/driver might be able to operate more efficiently if it doesn't have to maintain the Locator information. What does everyone else think? In any case, here's the revised core feature list (I've also added extra wording to make it clear that the external DTD subset counts as an external parameter entity): ModSAX Core Features -------------------- $Id: features.txt,v 1.1 1999/03/12 13:57:54 david Exp $ http://xml.org/sax/features/validation Validate (true) or don't validate (false). http://xml.org/sax/features/external-general-entities Expand external general entities (true) or don't expand (false). http://xml.org/sax/features/external-parameter-entities Expand external parameter entities including the external DTD subset (true) or don't expand (false). http://xml.org/sax/features/namespaces Preprocess namespaces (true) or don't preprocess (false). See also the http://xml.org/sax/properties/namespace-sep property. 
http://xml.org/sax/features/normalize-text Ensure that all consecutive text is returned in a single callback to DocumentHandler.characters or DocumentHandler.ignorableWhitespace (true) or explicitly do not require it (false). http://xml.org/sax/features/use-locator Provide a Locator using the DocumentHandler.setDocumentLocator callback (true), or explicitly do not provide one (false). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From keshlam at us.ibm.com Fri Mar 12 14:27:48 1999 From: keshlam at us.ibm.com (keshlam@us.ibm.com) Date: Mon Jun 7 17:09:56 2004 Subject: One more ModSax naming try... Message-ID: <85256732.004F61BA.00@D51MTA03.pok.ibm.com> For what it's worth, the approach the DOM has been looking at is that Level 2 classes which are subclasses/refinements of things that were present in Level 1 will be named by adding 2 as a suffix (Document2 et al). Simple, effective, extensible and indicates which version of the spec it refers to. Hence: SAX2? ______________________________________ Joe Kesselman / IBM Research Unless stated otherwise, all opinions are solely those of the author. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 12 14:45:24 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:56 2004 Subject: Mod??SAX: Feature Matrix Message-ID: <14057.8681.213869.498655@localhost.localdomain> It would be interesting to put together a feature matrix representing current practice among SAX parsers/drivers, at least in the Java world. Assuming that I simply wrapped the existing drivers with a Mod??Parser adapter, what features would and would not be supported? From my fuzzy recollection, here's what AElfred supported just about a year ago before I gave it up:

                                  true    false
------------------------------------------------------------------------
validation                        no      yes
external-general-entities         yes     no
external-parameter-entities       yes     no
namespaces                        no      yes
normalize-text                    yes     no
use-locator                       yes     no

(Wherever there's a 'no' answer, the driver should throw a SAXNotSupportedException). Actually, it would probably be safe to accept false for 'normalize-text' as well.
If I were to wrap the AElfred driver, then, I'd do something like this (there's likely some kind of a static initialisation trap here, but it should be good enough as an unreliable example):

  import java.util.Hashtable;
  // Assumed to live in org.xml.sax alongside ModParser.
  import org.xml.sax.SAXNotRecognizedException;
  import org.xml.sax.SAXNotSupportedException;

  public class AElfredModParser
    extends com.microstar.xml.SAXDriver
    implements org.xml.sax.ModParser
  {
    private static Hashtable featureTable = new Hashtable();

    // Markers for the states a feature may be set to.
    private static final Object TRUE = new Object();       // only true accepted
    private static final Object FALSE = new Object();      // only false accepted
    private static final Object TRUEFALSE = new Object();  // either accepted

    private static final String FEATURE_NS = "http://xml.org/sax/features/";

    static {
      featureTable.put(FEATURE_NS + "validation", FALSE);
      featureTable.put(FEATURE_NS + "external-general-entities", TRUE);
      featureTable.put(FEATURE_NS + "external-parameter-entities", TRUE);
      // AElfred does not preprocess namespaces (see the matrix above).
      featureTable.put(FEATURE_NS + "namespaces", FALSE);
      featureTable.put(FEATURE_NS + "normalize-text", TRUEFALSE);
      featureTable.put(FEATURE_NS + "use-locator", TRUE);
    }

    public void setFeature (String featureID, boolean state)
      throws SAXNotRecognizedException, SAXNotSupportedException
    {
      Object allowedState = featureTable.get(featureID);
      if (allowedState == null) {
        // Unknown feature name.
        throw new SAXNotRecognizedException();
      } else if ((state && allowedState == FALSE) ||
                 (!state && allowedState == TRUE)) {
        // Known feature, but not in the requested state.
        throw new SAXNotSupportedException();
      }
    }

    // etc. for setHandler, set, and get
  }

All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Mar 12 15:51:59 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:09:56 2004 Subject: Namespaces and DTDs References: Message-ID: <36E94F45.46D0E403@prescod.net> Didier PH Martin wrote: > > a) a Lisp document could be made SGML compliant because SGML can let you > define begin and end tag's delimiters (Ex: dsssl). I let this claim pass a couple of times because I didn't consider it important but now I feel the need to scratch that itch. DSSSL does not actually use parentheses as tags. If you use nsgmls to look at the SGML structure of a DSSSL document you will find that all of DSSSL's structure is actually in omitted tags. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Fri Mar 12 16:01:22 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:56 2004 Subject: ModSAX: Proposed Core Features In-Reply-To: <14053.51113.676945.877507@localhost.localdomain> Message-ID: <199903121543.KAA13555@hesketh.net> After a few months of intense busy-ness (and business), plus a trip to XTech that I'm still recovering from, I'm finally catching up to ModSAX. Printed it all out, looked it all over, and mostly I'm very pleased. I think it'll make implementing my Layered Model proposal as a a Layered Parser much easier overall. One key thing is missing at this stage, and that's a feature. At 08:16 PM 3/9/99 -0500, David Megginson wrote: >Here's my revised version of the core feature list, based on recent >discussions: > > >ModSAX Core Features >-------------------- > >http://xml.org/sax/features/validation > Validate (true) or don't validate (false). > >http://xml.org/sax/features/external-general-entities > Expand external general entities (true) or don't expand (false). > >http://xml.org/sax/features/external-parameter-entities > Expand external parameter entities (true) or don't expand (false). > >http://xml.org/sax/features/namespaces > Preprocess namespaces (true) or don't preprocess (false). See also > the http://xml.org/sax/properties/namespace-sep property. > >http://xml.org/sax/features/normalize-text > Ensure that all consecutive text is returned in a single callback to > DocumentHandler.characters or DocumentHandler.ignorableWhitespace > (true) or explicitly do not require it (false). 
> We need: http://xml.org/sax/features/external-subset Requires the parser to load the external subset of the DTD and process it. (External parameter entities remain a separate issue referenced by a separate feature.) This is critically important for attribute defaulting, which makes things like XLink much much simpler. At one point I switched parsers, only to find that my attribute values in the external subset had all disappeared. I promptly jumped back. The spec (5.1) allows non-validating parsers to skip the external subset; I'd very much like to have a way to tell the parser not to skip it, or at least know that they are in fact being skipped. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Fri Mar 12 16:45:04 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:09:56 2004 Subject: Basic Question Message-ID: <009601be6ca6$cbe05440$0300000a@othniel.cygnus.uwa.edu.au> -----Original Message----- From: Dan Rudman >With the wealth of XML libraries available, I am more and more inclined to >make use of these libraries to help me create, parse, and utilize my own tag >markup language to be embedded within an HTML document. My understanding of >XML at this point is that it must be well-formed or a fatal error occurs. Yes, this is correct. >If this is the case, how can I deal with the fact that most HTML documents >are NOT well-formed and that most HTML design tools do not enforce, require, >or even sometimes support, well-formedness in a document? 
You might try Tidy as the initial step. Tidy can take bad HTML and spit out XML that could then be parsed by any XML parser. See http://www.w3.org/People/Raggett/tidy/ Hope this helps. James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Full-day XML Tutorial @ WWW8 : http://www8.org/ Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Fri Mar 12 16:51:41 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:09:56 2004 Subject: Basic Question Message-ID: <008201be6ca8$3b56c380$32acdccf@ix.netcom.com> Dan, use XHTML which is well formed xml Frank ----- Original Message ----- From: Dan Rudman To: 'XML-DEV' Sent: Friday, March 12, 1999 8:59 AM Subject: Basic Question >I apologize for the basic question in advance :) > > >With the wealth of XML libraries available, I am more and more inclined to >make use of these libraries to help me create, parse, and utilize my own tag >markup language to be embedded within an HTML document. My understanding of >XML at this point is that it must be well-formed or a fatal error occurs. >If this is the case, how can I deal with the fact that most HTML documents >are NOT well-formed and that most HTML design tools do not enforce, require, >or even sometimes support, well-formedness in a document? > >Things would be rosy if I didn't have to rely on HTML, but my application >requires it. > >Thanks. > >-- Dan > > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Fri Mar 12 16:56:18 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:09:56 2004 Subject: Mod??SAX: Revised Proposed Core Features 1999-03-12 In-Reply-To: <14057.7534.612787.424789@localhost.localdomain> References: <14057.7534.612787.424789@localhost.localdomain> Message-ID: * David Megginson | | I've added one new feature -- http://xml.org/sax/features/use-locator | [...] | What does everyone else think? Good one! I'm in favour. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Mar 12 17:13:12 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:56 2004 Subject: ModSAX: Proposed Core Features References: <199903121543.KAA13555@hesketh.net> Message-ID: <36E94AF4.8A154C60@locke.ccil.org> Simon St.Laurent wrote: > Requires the parser to load the external subset of the DTD and process > it. (External parameter entities remain a separate issue referenced by a > separate feature.) I don't see why it should be. I think that parsers will either process just the internal subset, or will load all external DTD parts, including both the external subset and the external parameter entities. 
(Ignoring the internal subset is *not* an option, of course, except for DPH parsers.)

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From elharo at metalab.unc.edu Fri Mar 12 17:40:28 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:09:57 2004
Subject: empty tags and the XML 1.0 spec
Message-ID: <36E97AED.3F44F7D1@metalab.unc.edu>

From the XML spec, section 3.1:

"Empty-element tags may be used for any element which has no content, whether or not it is declared using the keyword EMPTY. For interoperability, the empty-element tag must be used, and can only be used, for elements which are declared EMPTY."

1. The "can only be used" part of the second sentence seems to contradict the first sentence.

2. "the empty-element tag must be used...for elements which are declared EMPTY" seems to contradict the assertion that an empty-element tag and a start-tag immediately followed by an end-tag are the same thing.

Is there any way out of this conundrum?

--
Elliotte Rusty Harold
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From indiketr at churchill.co.uk Fri Mar 12 17:48:21 1999 From: indiketr at churchill.co.uk (Rajeeva Indiketiya) Date: Mon Jun 7 17:09:57 2004 Subject: unsubscribe xml-dev In-Reply-To: <003b01be6705$d5e059a0$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: unsubscribe xml-dev xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 12 18:03:06 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:57 2004 Subject: ModSAX: Proposed Core Features In-Reply-To: <199903121543.KAA13555@hesketh.net> References: <14053.51113.676945.877507@localhost.localdomain> <199903121543.KAA13555@hesketh.net> Message-ID: <14057.21849.286930.749213@localhost.localdomain> Simon St.Laurent writes: > We need: > > http://xml.org/sax/features/external-subset I agree that this functionality is required. The question is whether there is a strong case for making inclusion of the external DTD subset separately configurable from the inclusion of external parameter entities in general. I'd suggest not. 
Consider the following: ]> and ]>

Except for the extra "%doc" entry in the entity name table, these two document type declarations look to me to be exactly equivalent; as a matter of fact, I've always considered the second to be simply a short-hand for the first.

James Clark made a convincing case for separating the inclusion of external general entities from the inclusion of external parameter entities. Can anyone make a convincing case for separating the inclusion of external parameter entities from the inclusion of the external DTD subset?

Thanks, and all the best,

David

--
David Megginson david@megginson.com http://www.megginson.com/

From david at megginson.com Fri Mar 12 18:09:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:57 2004
Subject: Basic Question
In-Reply-To: <009601be6ca6$cbe05440$0300000a@othniel.cygnus.uwa.edu.au>
References: <009601be6ca6$cbe05440$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <14057.22147.177213.132967@localhost.localdomain>

Dan Rudman writes:

[on XML's well-formedness constraints]

> If this is the case, how can I deal with the fact that most HTML
> documents are NOT well-formed and that most HTML design tools do
> not enforce, require, or even sometimes support, well-formedness in
> a document?

You'd best keep the two separate. Try including the following in the HTML:

Now, the HTML can stay as it is, and the XML can be properly well-formed. This approach is already best practice for including CSS stylesheets (using ) and ECMA scripts (using
  <xdoc>some simple text</xdoc>
----- Original Message -----
From: Lippmann, Jens
To:
Sent: Monday, March 22, 1999 9:01 AM
Subject: Tree view from IE5

>Is there a way to "borrow" the stylesheet that creates the XML tree in IE5
>for XML files without an attached stylesheet, or is the tree hardcoded into
>the msxml.dll?
>
>Jens

From Matthew.Sergeant at eml.ericsson.se Mon Mar 22 17:10:16 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:10:18 2004
Subject: Article on IE5 support of web standards...
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A16E1@EUKBANT101>

Out of interest along these lines, for those that don't already know - mozilla.org have released M3 (milestone 3) of mozilla. I'd consider this pre-alpha now (whereas gecko wasn't even that), and it's got some really neat features - like the whole UI for navigator and mail/news and editor is built in XML - and you can edit it and change the layout completely. Nice.

However the disappointment is that it appears that they aren't using expat yet (why???). I tried editing my .xul file to contain this:

Hello World!!!

Obviously totally wrongo. But it parsed it and displayed a

just fine. No error messages. I'm submitting a bug report now. Matt. -- http://come.to/fastnet Perl on Win32, PerlScript, ASP, Database, XML GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ > -----Original Message----- > From: Buss, Jason A [SMTP:jabuss@cessna.textron.com] > Sent: Monday, March 22, 1999 3:49 PM > To: 'xml-dev@ic.ac.uk' > Subject: Article on IE5 support of web standards... > > I just got this in the mail, and this article seems to hit the whole > problem > with bringing XML mainstream: Vendors who pledge their allegiance to any > particular web standard, and then release products with that "close, but > not > quite; eventually..." support that just turns people (me included). I had > hopes for IE5, tried out the beta, and hoped for the best with the final > release. Oh, well.... > > http://www.computerworld.com/home/news.nsf/CWFlash/9903195web > > Netscape's turn..... > > Thanks to all... > > Jason A. Buss > Single Engine Technical Publications > Cessna Aircraft Co. > jabuss@cessna.textron.com > "Webstandards.... eventually.... *sigh*..." > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
From Tim.Shaw at wdr.com Mon Mar 22 17:18:07 1999
From: Tim.Shaw at wdr.com (Tim.Shaw@wdr.com)
Date: Mon Jun 7 17:10:18 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <001b01be7482$0e756e20$c8a8a8c0@thing1>
Message-ID:

Perilously close to 'Design Time' flag (a la Beans) - neat idea, but not quite as simple as it at first appears. Remember I (a 'writer') may want to see what my 'reader' sees in my 'IDE'.

As I say, neat idea

Gluck

tim

______________________________ Reply Separator _________________________________
Subject: Re: SAX2 RFD: LexicalHandler draft v.1.1
Author: b.laforge (b.laforge@jxml.com) at unix,mime
Date: 22/03/99 16:35

From: Kay Michael
>It seems we are trying to provide two views of a document, the reader's view
>and the writer's view. The reader's view needs to present roughly what's in
>Snip<
>Response snipped<

Perhaps we should have a writer feature that we can turn on or off, which will give us two broadly different modes of operation. Other features may be turned on or off individually, but the default for those features may well depend on the use of the parser by a reader or a writer.

This also gives us a way to partition events--an interface for a set of events should not include both reader and writer events.

Bill
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.Kay at icl.com Mon Mar 22 17:32:42 1999 From: Michael.Kay at icl.com (Kay Michael) Date: Mon Jun 7 17:10:18 2004 Subject: SQL queries expressed in XML Message-ID: <93CB64052F94D211BC5D0010A80013310EB3B8@WWMESS3.172.19.125.2> > > we recently had the idea to use XML to express SQL-like > queries (so this is > not about querying XML -- it is about using XML to express > queries). It > seems to me that we might not be the first ones; so has > anybody defined an > XML document type for expressing SQL queries? > I've thought about the question and some of my thoughts are implemented in SAXON's SQLStyleSheet, which is the beginnings of an XSL extension to allow a stylesheet to update an RDBMS with data from an XML source document. As always in this area the first problem is deciding how much of the syntax should be "angle brackets" and how much should be rules for the content of elements/attributes. The answer to that depends on tradeoffs between different modes of use. So the question is, who is going to use it, and what for? In particular if you are interested in queries, what are you planning to do with the results? Print them out, merge them into the DOM representation of the document, or what? Mike Kay -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990322/e429f26a/attachment.htm From srn at techno.com Mon Mar 22 18:01:39 1999 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:10:18 2004 Subject: Is this invalid? 
In-Reply-To: <001d01be7482$34a37290$31f96d8c@NT.JELLIFFE.COM.AU> (ricko@allette.com.au) References: <001d01be7482$34a37290$31f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <199903221750.LAA06301@bruno.techno.com> [Rick Jelliffe:] > From: Didier PH Martin > >Is this markup an invalid XML element? > > > > > > In which document is this listed? I know that these are > > all valid SGML markup and there is ISO documents on it, > > but where can I find W3C documents giving that > > information? > It is not valid XML. The AFDR markup type was invented to > signify that the file was *not* SGML or XML. (What a > strange thing to do: it is the kiss of death. ) Correction: The purpose of the XSL stylesheets? Is there some software out there (other than notepad) for creating XSL stylesheets that I haven't come across? I have used Arbortext's XML styler, but they aren't supporting or revising it anymore, so if anyone knows of a tool that aids in creating XSL stylesheets (the transformation part of the spec seems fairly well supported, looking for formatting only, or something that supports both) please point me towards it. I had heard something a while back about the XSL WG splitting the XSL draft, to separate the formatting and the transformation parts of the draft. Anyone hear anything likewise? Thanks... Jason A. Buss Single Engine Technical Publications Cessna Aircraft Co. jabuss@cessna.textron.com > -----Original Message----- > From: Mark Birbeck [SMTP:Mark.Birbeck@iedigital.net] > Sent: Monday, March 22, 1999 10:49 AM > To: 'xml-dev@ic.ac.uk' > Subject: RE: Article on IE5 support of web standards... > > Of course you could just as easily present it the other way round. We > have been working with MS XML software for about eight months now, but > have yet to work with a NS implementation. We've had to change our XSL > stylesheets twice (not a great problem), and our stuff keeps getting > better (of course I would say that, but I'll let you cast your verdicts > next week). 
I wish I was able to get that sort of experience and > exposure with other tools and products. > > No-one says you have to use non-finalised features. There's no shame in > waiting (no fun either). > > Regards, > > Mark > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrew at squiz.co.nz Mon Mar 22 18:27:39 1999 From: andrew at squiz.co.nz (Andrew McNaughton) Date: Mon Jun 7 17:10:18 2004 Subject: SQL queries expressed in XML In-Reply-To: Your message of "Mon, 22 Mar 1999 16:16:53 GMT." <002001be747f$6613e100$ab20268a@pc-lrd.bath.ac.uk> Message-ID: <199903221826.SAA08052@aniwa.sky> > > we recently had the idea to use XML to express SQL-like queries > > (so this is > > not about querying XML -- it is about using XML to express queries). It > > seems to me that we might not be the first ones; so has anybody defined an > > XML document type for expressing SQL queries? > > And just to widen this question slightly - assuming I do have an XML > representation > of a language construct - whats the best way to do the conversion from > the XML representation to the 'correct' language representation. > > Could I use XSL to do this - or would this be going against the grain? > > (Just to qualify this I'm relatively new to XML, and *extremely* new to > XSL). XSL doesn't seem to do very well where the desired output is not well formed. If your SQL queries have '"', '<', '>' or '&' in them, then you're going to start getting into kludges. perl or DSSSL would be better suited to the task. *Why* do you want to put your queries into XML? Do you need access to the structure of your queries? 
Perhaps you just need something that can be embedded comfortably in your XML documents. What you are trying to achieve is likely to affect how you approach the problem. I've got a problem to tackle soon which provides an example of a reason one might want to have queries in an XML format, and the implications it has for encoding of my queries. It may be that others are doing similar stuff - if so I'd like to hear about it. I have a steady flow of news material coming through my site. I have subscribers who receive material filtered from this according to custom preferences. Whenever a story comes through I need my system to turn around several thousand queries within a few minutes at worst (while not unduly slowing my web server). I want to offer more flexible customization than I have at present. Basically what I need to do is to invert the problem and define a query based on the story data which can be applied to the stored queries to find the set of queries which the story matches. (Did that make sense?) XML expression of queries appeals since it facilitates interchanging of queries and data. The XML query languages I'm aware of don't seem helpful though, as they tend to store query expressions as CDATA and don't expose the query structure. The sort of queries I want to do are boolean logic queries. Primitives I need are literal specification of element content or attribute content, or containment of particular words within the element contents. Extensions of this boolean model might include stemming (reasonably likely) and use of term weighting (probably not). These are amply discussed in the Information Retrieval literature for those who don't know about them. I figure any boolean query can be expressed as a decision tree terminating in true or false leaf nodes, that this maps well into XML, and that it should be able to be used to search for queries matching a given document using existing tools (eg sgrep). 
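
The decision-tree idea can be sketched roughly as follows. This is only an illustration in Python, not an existing query language: the element and attribute names (query, test, then, else, true, false, element, equals, contains, subscriber) are invented for the example, and word matching is a plain case-insensitive split with no stemming or weighting.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary, made up for this sketch:
#   <test element="..." equals="..."> or <test element="..." contains="...">
#   branches to its <then> or <else> child;
#   <true/> and <false/> are the leaves of the decision tree.
QUERY = """
<query subscriber="reader-1">
  <test element="body" contains="xml">
    <then>
      <test element="section" equals="sport">
        <then><false/></then>
        <else><true/></else>
      </test>
    </then>
    <else><false/></else>
  </test>
</query>
"""

STORY = """
<story>
  <section>technology</section>
  <body>Mozilla builds its interface in XML</body>
</story>
"""


def evaluate(node, story):
    """Walk the decision tree under `node`; return True if `story` matches."""
    if node.tag == "true":
        return True
    if node.tag == "false":
        return False
    if node.tag == "query":
        return evaluate(node[0], story)
    # Otherwise this is a <test>: look up the named child element of the story.
    text = story.findtext(node.get("element")) or ""
    if node.get("equals") is not None:
        passed = text.strip() == node.get("equals")
    else:
        passed = node.get("contains").lower() in text.lower().split()
    branch = node.find("then" if passed else "else")
    return evaluate(branch[0], story)


def matching_subscribers(queries, story):
    """Invert the problem: given one story, list the stored queries it satisfies."""
    return [q.get("subscriber") for q in queries if evaluate(q, story)]


queries = [ET.fromstring(QUERY)]
story = ET.fromstring(STORY)
print(matching_subscribers(queries, story))
```

A linear scan over every stored query, as here, is the naive form. Searching the stored queries with an indexing tool such as sgrep, as suggested above, would be a different and probably faster route that this sketch does not attempt.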
I believe this could lead to a relatively simple processing model, but it remains to be seen how efficient it will be.

If anyone is aware of any relevant work that is being or has been done I'd appreciate hearing about it. XML or otherwise.

Andrew McNaughton

--
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/

From chris at w3.org Mon Mar 22 18:30:11 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:10:18 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au>
Message-ID: <36F68B71.5967A029@w3.org>

Marcus Carr wrote:
>
> Chris Lilley wrote:
>
> [a number of sideways kicks at SGML, then:]

(Generally deserved, I thought)

> > There are significant portions of the old SGML community working to
> > improve XML and to help build the missing parts which are needed. I have
> > a lot of respect for that portion. There are, as you say, other parts
> > which are merely trying to save their own highly paid jobs as priests of
> > complex, low-powered technology. One can usually tell the difference by
> > noting that the former portion have their eyes open.

> Spare me. The biggest driving factor behind people working in SGML
> is the fact that there are clients who want work done.

Uh, this is actually a fairly big driver for people working in XML too.
> SGML is neither complex nor low-powered, as numerous defence,
> telcos, legal publishers, stock exchanges, aircraft manufacturers, automotive companies, etc.
> can attest.

I'm not saying that it's impossible to get value from it, or that it is without power. But it is significantly underpowered in some ways, and pays too big a price in parsing complexity for minor keystroke savings, and the original design constraints don't necessarily apply to today's applications, which is why I see XML as more powerful than SGML, not less, in spite of being (now) a subset of SGML.

> Generalisations of the participants such as those above, create friction between
> the XML and SGML camps and reveal an innate lack of understanding about the relationship
> between the two. I will thank you not to categorise me as either a "good XML groupie" or a
> "garden gnome". ;-()

Well if you are an SGML user who is not

a) involved in furthering the XML effort, or
b) involved in slowing down the XML effort

then I didn't categorise you at all, since I was speaking of only two particular portions of the "old SGML community". There are, of course other portions; and there are, of course, other communities.

--
Chris

From SMUENCH at us.oracle.com Mon Mar 22 18:43:29 1999
From: SMUENCH at us.oracle.com (Steve Muench)
Date: Mon Jun 7 17:10:18 2004
Subject: Tree view from IE5
Message-ID: <199903221843.KAA00573@usmail04>

IE5 uses a built-in stylesheet for this. It's not something that's meant to be encrypted in any way or hard to "capture".
:-)

From Microsoft Reference material on IE5, you find:

| When using an XSL style sheet, you can access the
| XML source through the XML Document Object Model
| (DOM). Two additional properties are exposed on
| the document object from DHTML:
|
| document.XMLDocument
| document.XSLDocument
|
| The XMLDocument property returns the root of the
| XML source tree, and the XSLDocument property
| returns the root of the XSL style sheet.

By creating a two-frame frameset with an XML file being browsed in the left frame and some Javascript in the right frame, you can write out the value of:

parent.leftframe.document.XSLDocument.xml

to a text file. A little cryptic, but definitely a learning tool. See below...

Have fun.

____________________________________________________________
Steve Muench, Consulting Product Manager & XML Evangelist
Java Business Objects Dev't Team - http://www.oracle.com/xml

=/ Include /=
[Included IE5 stylesheet text: the archive stripped the markup, and only literal fragments of the doctype, processing-instruction, XML declaration, attribute, comment, CDATA and element templates survive.]
-------------- next part -------------- An embedded message was scrubbed... From: "Frank Boumphrey" Subject: Re: Tree view from IE5 Date: 22 Mar 99 09:04:11 Size: 6082 Url: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990322/dce58ff7/attachment.eml From tgraham at mulberrytech.com Mon Mar 22 19:08:40 1999 From: tgraham at mulberrytech.com (Tony Graham) Date: Mon Jun 7 17:10:18 2004 Subject: XSL stylesheets (was Article on IE5 support of web standards.) In-Reply-To: References: Message-ID: <14070.20236.520000.566990@menteith.com> At 22 Mar 1999 12:25 -0600, Buss, Jason A wrote: > XSL stylesheets? Is there some software out there (other than notepad) for > creating XSL stylesheets that I haven't come across? See http://www.mulberrytech.com/xsl/xslide/ for information on my XSL mode for Emacs. I haven't finished updating it to match the current working draft, but it's still going to be better than using notepad. Regards, Tony Graham ====================================================================== Tony Graham mailto:tgraham@mulberrytech.com Mulberry Technologies, Inc. http://www.mulberrytech.com 17 West Jefferson Street Direct Phone: 301/315-9632 Suite 207 Phone: 301/315-9631 Rockville, MD 20850 Fax: 301/315-8285 ---------------------------------------------------------------------- Mulberry Technologies: A Consultancy Specializing in SGML and XML ====================================================================== xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 22 19:22:40 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:19 2004 Subject: Expat and Mozilla (was RE: Article on IE5 support of web standards...) In-Reply-To: <5F052F2A01FBD11184F00008C7A4A800022A16E1@EUKBANT101> References: <5F052F2A01FBD11184F00008C7A4A800022A16E1@EUKBANT101> Message-ID: <14070.38971.971587.407210@localhost.localdomain> Matthew Sergeant (EML) writes: > However the disappointment is that it appears that they aren't using expat > yet (why???). If that's the case, it's probably because Expat would (correctly) throw out half of the pseudo-XML that the other Mozilla modules generate. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Mon Mar 22 20:57:01 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:10:19 2004 Subject: SAX2: LexicalHandler draft v.1.1 Message-ID: <8725673C.0072EEF5.00@d53mta03h.boulder.ibm.com> >public interface LexicalHandler >{ > public abstract void xmlDecl (String version, > String encoding, > String standalone) > throws SAXException; > Some of this stuff I've already dealt with in the internal event APIs of the new IBM parser, so I'd like to throw in a couple of points here.... and hopefully some of this is not off the actual topic, since I've been too busy to follow this thread as closely as I should have. If some of this really applies to another thread, then assume I really wrote it there :-) 1) The xmlDecl() needs another parameter. In addition to the encoding string, which is the exact text of the string in the document, some customers need to know what the actual encoding is (which might have been auto-sensed.) They need this in some cases to get the document back to the original encoding. So there should be an 'actualEncoding' parameter which is either the same as encoding (if there was an encoding string in the document) or the actual encoding used if not (probably in some canonical format, since there are only about 6 auto-sensed encodings right?) 2) I made the names for the comment, PI, and whitespace call backs on the DTD handler have different names from those of the ones on the document handler. This is somewhat safer in C++ since it means not having a single method override two pure virtuals from a mixin. 
It also allows the handler to be less stateful in the situation where the same object is implementing the handler for both document and DTD (since they then know that it's for one or the other without having to keep flags for that stuff, which is not really a biggie but I thought it was worth it.)

3) I report whitespace in the DTD, so that it can also be pretty much exactly recreated. I only report this if I'm asked to (by an 'advanced callbacks' flag, which also controls comments and PIs being reported from the DTD.)

4) I have events for the begin/end of the internal subset.

5) I have a callback for notation decl, attlist decls, and attdefs, which are important.

6) I have a flag on each entity, element, etc... decl callback called 'isIgnored'. This lets the caller know that this one was ignored because it was a subsequent instance of a previously declared decl. So they don't need to keep it if they just care about actual content, but they do if they want to recreate the original document (which is extremely important to some folks.)

7) I haven't done this yet, but some customers are insisting that any event callback that reports a quoted string indicate whether single or double quotes were used (again for recreation of the original document.) This seems a bit over the top to me, since they are equivalent, but I guess the customer is always right even when he's wrong.

That's all I can think of right now. It would really be nice if we could map all of the information that we go through the trouble (and overhead) of parsing to public APIs. Otherwise, customers end up using our internal event API in order to get the information that they require. This locks down our internal API more than we'd like, but there is little we can do about it if they *have* to have this extra info to do what they do.
From roddey at us.ibm.com Mon Mar 22 21:01:06 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:10:19 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <8725673C.007350D8.00@d53mta03h.boulder.ibm.com>

>From: Lars Marius Garshol
>Date: 21 Mar 1999 18:14:13 +0100
>Subject: Re: SAX2 RFD: LexicalHandler draft v.1.1
>
>* David Megginson
>|
>| public abstract void xmlDecl (String version,
>|                               String encoding,
>|                               String standalone)
>|   throws SAXException;
>
>Should we perhaps make standalone a boolean instead? It can only have
>two values anyway, and this will spare us a lot of
>standalone.equals(this or that).
>

I did that at first with my internal event APIs, but it didn't work out. There is then no way of knowing whether the document *really* said yes or no, or whether it was just not there at all and the default was used. This prevents the recreation of the original document.
From donpark at quake.net Mon Mar 22 21:49:36 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:10:19 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <00ff01be74ad$c71eeed0$2ee044c6@arcot-main>

> public interface AttributeValueHandler
> {
>   public abstract void startEntity (String name)
>     throws SAXException;
>   public abstract void endEntity (String name)
>     throws SAXException;
>   public abstract void characters (char ch[], int start, int length)
>     throws SAXException;
> }
>
> public interface AttributeValue2 extends AttributeValue
> {
>   public abstract boolean isSpecified (String name);
>   public abstract void accept (AttributeValueHandler handler)
>     throws SAXException;
> }

David,

I don't think an event-based interface is appropriate for this purpose. Why not introduce an iterator or an array-like interface?

Don
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Mon Mar 22 22:06:07 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:10:19 2004 Subject: About Tim's article on XML References: <30649320C177D111ADEC00A024E9F297169FA7@exchange-server.dega.com> Message-ID: <36F6B38F.E51161C7@w3.org> Ed Howland wrote: > > Everyone, > > The correct link for this article should be: > http://www.xml.com/xml/pub/1999/03/ie5/first-x.html Well, it is no more correct than the other link, but it does reference a resource variant written in HTML rather than in XML. I guess Didier felt that, to this list in particular, it was reasonable to point to the XML resource variant. > From: Didier PH Martin [mailto:martind@netfolder.com] > I read Tim's article in XML.com with interest (Ref: > http://www.xml.com/1999/03/ie5/first-x.xml). -- Chris xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Mon Mar 22 22:07:38 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:19 2004 Subject: XSA client kit released Message-ID: A kit with a Java client for automatically monitoring XSA documents has now been released at The client that comes with the kit can be used to automatically discover changes to a set of XSA documents (addresses, new versions, new products etc). The kit also contains an API that can be used to build custom clients (or other kinds of XSA-aware software). The kit has already been used for a while by the maintainers of and . See for information about XSA. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Mon Mar 22 22:17:09 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:10:19 2004 Subject: eyes open (was XML complexity, namespaces) References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F19AC0.B5B40B20@goon.stg.brown.edu> <36F1AFF5.2DF948A2@allette.com.au> <36F26A53.E4395E9C@goon.stg.brown.edu> Message-ID: <36F6C03B.1222E5DE@w3.org> "Richard L. Goerwitz" wrote: > > Marcus Carr wrote: > > > > So... 
how do we get back to SGML people not having their eyes open?
>
> Just for the record, I never said they were closed. I believe it was
> Chris Lilley.

Yes.

> And when he said this, he wasn't characterizing the entire SGML
> community.

Correct.

> In fact, he was, overall, defending SGML.

Glad someone noticed. It's not something I do often ;-)

We now return to our regular programming.

--
Chris

From bckman at ix.netcom.com Mon Mar 22 22:30:40 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:10:19 2004
Subject: Oops! was Re: Transformation tool for windows
Message-ID: <002801be74b3$37c9e9a0$a4addccf@ix.netcom.com>

I put a bad version of the program on my site. It refuses to open because it references a file that I am sure is not on your computers!

I have put up a corrected version. I apologise to everyone who tried to run the old version.

Please go to www.hypermedic.com/style to download the zip file (20K). Look under transform XML.

Frank

----- Original Message -----
From: Frank Boumphrey
To: xml mailing list
Sent: Thursday, March 18, 1999 1:03 AM
Subject: Transformation tool for windows

>At the suggestion of several people I am making generally available a simple
>tool that carries out batch transformations of XML files under Windows 95,
>98, or NT. Although stable, it is very much alpha ware and is still a 'work
>in process'. I would be glad of any feedback from members of this list.
>
>It was written for an undergraduate class and requires no more skills to
>run than basic Windows skills, but in spite of that it is quite
>powerful and can easily handle documents up to 2M in size. (I haven't tested
>it on anything larger.)
>
>This tool is excerpted from a larger editing tool which uses the MSXML
>parser. However, as the latter is in flux and the MSXML dll has not been
>released or licensed for general use, I have split the transformation tool
>off from the editing and DOM tool.
>
>'TransformXML' allows the following processes to be automated.
>
> 1. Creating a list of xml files for processing.
> 2. Running a list of commands on each file.
> 3. Transforming one xml nametag to another.
>
>It has not yet been optimized for speed. For example, on a middle-of-the-road
>platform it takes about 1 minute to convert an XML file marked up by Jon
>Bosak into HTML. It took 20 minutes to transform the complete works of
>Shakespeare from xml to xhtml.
>
>Please go to www.hypermedic.com/style to download the zip file (20K). Look
>under transform XML.
>
>It uses the VB5 dlls, which are also available if needed.
>
>
>Frank Boumphrey
>
>XML and style sheet info at Http://www.hypermedic.com/style/index.htm
>Author: - Professional Style Sheets for HTML and XML http://www.wrox.com
>CoAuthor: XML applications from Wrox Press, www.wrox.com
>Author: Using XML on the Web (March)
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sawhneya at ms.com Mon Mar 22 23:01:56 1999 From: sawhneya at ms.com (Avneet Sawhney) Date: Mon Jun 7 17:10:19 2004 Subject: XML and Sybase Message-ID: <36F6CBA3.3BD1734A@ms.com> Hi, I would like to know what other people are doing with respect to using XML in a Sybase environment. Will Sybase have any XML support in their SQL server? Are there other products which could be used? Most of the products I have seen for database integration seem to be Windows centric, or they are tied to other database servers. I want to leverage XML in the middle tier with the current Sybase environment, but I prefer to use products(besides parsers, etc.) that lend themselves to this work. I'm thinking i should not have to start from scratch. as others would have come up against this as well. BTW, with respect to other thread, I also thought one thing would be to use XML to express all interaction with the database. With some more detail, I guess this could be extended to also abstract the data model as well. I have started working on this, but I am looking for better ways to use XML in the middle tier. I know the "why's" of doing this, and I would like to get info on some better "how's" in a C++/UNIX environment. Thanks, -Avneet xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Mon Mar 22 23:17:04 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:10:19 2004 Subject: Is this invalid? References: Message-ID: <36F6CDD6.72D1F8CE@w3.org> Didier PH Martin wrote: > I am trying to find something in W3C specs about it but with no sucess. By > the way, what is the official list of valid markups: > > ???, ???, both ???? > ????, ???, both ??? > others ???? > > In which document is this listed? Unless I am missing something here, the answer is really obvious - in the XML specification. http://www.w3.org/TR/REC-xml I presume you mean something more complicated? -- Chris xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Mon Mar 22 23:39:03 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:10:19 2004 Subject: XML complexity, namespaces (was WG) References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org> Message-ID: <36F6D46A.FB33D473@allette.com.au> Chris Lilley wrote on SGML: > Uh, this is actually a fairly big driver for people working in XML too. So... 
if I was taking potshots at it, it might be indicative that I'm missing something, right? The reason that people have been marking data up as SGML for many years now is because there was no XML. We were anticipating its arrival, but until it came we knew we had to mark up the data in some way that would ensure that it was useful for many years to come. We didn't know what future technology would hold and to some extent, still don't. What shape will the web be in ten years? Will loosleafing paper documents survive, or be forced out by smarter data handling and delivery? Two valid questions ten years apart. There is what I believe to be a misconception amonst some portions of the XML community that XML and SGML are locked in some sort of competition, but I don't see any of the same feeling from the SGML community. The conclusion that I draw from this is that there is some sort of insecurity, perhaps due to the fact that XML feels that it must replace SGML in order to ensure its survival. The SGML community sees XML as a great boon - a truly sweet way of using the data and realising the long-term effort that they have put into their datasets. Not only does XML address the long-term storage issues that motivated people to move to SGML (despite the absence of a proliferation of tools), it also addresses putting the data to work. The SGML community isn't bitter about this - it is exactly what we want. Many organisations now have huge SGML datasets that can be XML compliant in a couple of days. XML vindicates those of us who have been pushing SGML for years. > I'm not saying that its impossible to get value from it, or that it is > without power. But it is significantly underpowered in some ways, and > pays too big a price in parsing complexity for minor keystroke savings, > and the original design constraints don't necessarily apply to todays > applications, which is why I see XML as more powerful than SGML, not > less, in spite of being (now) a subset of SGML. 
One reason that SGML came about was that organisations were starting to amass large datasets that were being locked into proprietary applications. Two issues that (I suspect) drove features like tag omitability were the conversion of these large legacy sets and the perception that in the absence of SGML tools, politically, markup would have to as simple as possible. The standard may well have been skewed toward the user and away from the application developer, but I don't think it's fair to retrospectively bag this decision just because the methods that we now use to collect and tag data have evolved. Any general comparison of which is "more powerful" is invalid - XML is capable of very much more than SGML is, but someone with a huge repositiry of SGML documents that can be valid XML in a week is surely in a "more powerful" position than someone just starting to collect XML data. > Well if you are an SGML user who is not > > a) involved in furthering the XML effort, or > b) involved in slowing down the XML effort > > then I didn't categorise you at all, since I was speaking of only two > particular portions of the "old SGML community". There are, of course > other portions; and there are, of course, other communities. Is there a "new SGML community" as well? Now I'm not even sure what suburb I live in. I've put a down payment on a flash new house in XML, but I want to hold on to my familial house in SGML. It would be foolish to sell it now, when it continues to provide solid, long-term gains. Besides, I have friends there. -- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein xml-dev: A list for W3C XML Developers. 
From andrewl at microsoft.com Tue Mar 23 00:10:37 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:10:19 2004
Subject: IE5 Stylesheet
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF16A@RED-MSG-08>

The style sheet used by the IE5 tree view is available at
http://www.microsoft.com/xml/xsl/tutorials/transform-defaultss.asp

From incze at mail.matav.hu Tue Mar 23 01:21:12 1999
From: incze at mail.matav.hu (Incze Lajos)
Date: Mon Jun 7 17:10:19 2004
Subject: Mozilla/milestone3
Message-ID: <36F6ECCC.97FA872@mail.matav.hu>

If anybody is interested: I just checked the new Mozilla browser on Tim Bray's Explorer5 article in XML. I'm running Linux, so I don't really know what it would look like in IE5. In Mozilla it has a grey background with white-background, red-bordered boxes in it, red section headers and a green anchor color. (These may be the defaults.) The rendering is acceptable.

Incze
From david at megginson.com Tue Mar 23 01:45:38 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F6D46A.FB33D473@allette.com.au>
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org> <36F6D46A.FB33D473@allette.com.au>
Message-ID: <14070.56970.50161.169467@localhost.localdomain>

Marcus Carr writes:

> There is what I believe to be a misconception amongst some portions
> of the XML community that XML and SGML are locked in some sort of
> competition, but I don't see any of the same feeling from the SGML
> community. The conclusion that I draw from this is that there is
> some sort of insecurity, perhaps due to the fact that XML feels
> that it must replace SGML in order to ensure its survival. The
> SGML community sees XML as a great boon - a truly sweet way of
> using the data and realising the long-term effort that they have
> put into their datasets.

XML does nothing that SGML cannot do. SGML does nothing that XML cannot do. There are some differences in the ways that XML and SGML accomplish the same thing, but those differences are trivial and unimportant from an architectural perspective.

XML benefited from (at the time) 12 years of SGML industry experience by eliminating a lot of original SGML features (such as the ability to vary the delimiter set or to omit tags) that turned out to be obfuscatory design mistakes.
SGML benefits from 13 years of industry experience in the form of a small base of stable, production-quality COTS and OSS. The question, however, is whether there is a real benefit to supporting two slightly-variant standards that, in the view of a system architect, accomplish exactly the same thing in pretty much the same way. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Mar 23 02:16:35 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:20 2004 Subject: RDF DTD ? Message-ID: <005a01be74d3$741139c0$11f96d8c@NT.JELLIFFE.COM.AU> I have written an XML DTD fragment for RDF and put it online at http://xml.ascc.net/xml/en/utf-8/resource-index.html I don't see why RDF WG didn't put XML declarations in: perhaps they were tired--certainly it was not difficult to make. A DTD would help many users, and also allow more informed commentary on the comparative virtues of RDF-schema and the various schema proposals. A combination of this DTD and an XSL-based structure validator should be enough to check all the structural constraints in RDF. Rick Jelliffe xml-dev: A list for W3C XML Developers. 
From cowan at locke.ccil.org Tue Mar 23 02:25:00 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:10:20 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <36F62D81.A623C0A2@w3.org> from "Chris Lilley" at Mar 22, 99 12:46:09 pm
Message-ID: <199903230328.WAA19702@locke.ccil.org>

Chris Lilley scripsit:

> Okay. But does RFC 2376 conflict with the XML 1.0 Recommendation?

The Recommendation basically says "We yield to the RFC once it is published".

> > When the charset parameter is not specified, it is assumed as US-ASCII.
>
> Wow. So, what this RFC says is that, when used in email and on HTTP, the
> encoding declaration is *always ignored*.

Unfortunately this is a side effect of the rules for the media type "text/*", which say that the default value of "charset" is always US-ASCII. The alternative is to use "application/xml", which has no such obnoxious rule.

--
John Cowan cowan@ccil.org
e'osai ko sarji la lojban.
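The difference between the two media types can be sketched as a small decision function. This is an illustration only: the class and method names are hypothetical, the Content-Type parsing is deliberately simplified (it does not handle every quoted-string case), and it relies on the RFC 2376 rule that an explicit charset parameter, when present, takes precedence for both types.

```java
import java.util.Locale;

/** Hypothetical helper illustrating the text/xml versus application/xml defaults. */
final class XmlMediaTypeCharset {
    /**
     * Returns the charset a receiver should assume, or null when the XML entity
     * itself has to be inspected (byte order mark, encoding declaration).
     */
    static String effectiveCharset(String contentType) {
        String[] parts = contentType.split(";");
        String type = parts[0].trim().toLowerCase(Locale.ROOT);

        // An explicit charset parameter wins for both media types.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                return param.substring(eq + 1).trim().replace("\"", "");
            }
        }

        // No parameter: text/xml inherits the text/* default of US-ASCII,
        // so the document's own encoding declaration is never consulted.
        if (type.equals("text/xml")) {
            return "US-ASCII";
        }

        // application/xml has no such default; fall back to the document itself.
        return null;
    }
}
```

Under these assumptions, `effectiveCharset("text/xml")` yields `US-ASCII` regardless of what the document declares, which is the side effect complained about in the message above.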
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Mar 23 02:29:14 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:20 2004 Subject: XML complexity, namespaces (was WG) Message-ID: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU> From: David Megginson SGML does nothing that XML cannot do. I don't know how Dave can say that. For example, many asian documents use user-defined characters (East Asian character sets have a special code space reserved for these, and East Asian word processing applications come bundled with font editors to allow definition of user-defined characters). In SGML I can short-reference these codepoints to entity which points to the appropriate glyphs and which has other data attributes to describe character properties. In XML, to do this I have to write a special program to simulate this behaviour. And if the program just inserts elements rather than entity references (because XML has no attributes on entities, so I have to use elements), my element structure is made more complicated. Furthermore I cannot use elements inside attribute values, while I can use entity references. The lack of this kind in XML has closed off the obvious and simple solution to private-use area (PUA) characters: East Asians and MathML could each have found it useful. Rick Jelliffe xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Mar 23 02:38:29 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:20 2004 Subject: XML complexity, namespaces (was WG) Message-ID: <007201be74d6$87ed6ce0$11f96d8c@NT.JELLIFFE.COM.AU> From: Chris Lilley Well if you are an SGML user who is not > >a) involved in furthering the XML effort, or >b) involved in slowing down the XML effort > >then I didn't categorise you at all, since I was speaking of only two >particular portions of the "old SGML community". There are, of course >other portions; and there are, of course, other communities. It would be interesting if Chris would name names and give examples. Who are these portions of the "old SGML community"? Is it Steve Newcomb or Dave Megginson or Paul Prescod or Dave Peterson or even me? I think it is dishonest argument to allude to sinister forces without naming them or their particular views. Frankly, it makes it sound like Chris is inventing bogus boogymen as an argument for moving XML in non-standard directions. Who are these people from the community formerly known as SGML who are involved in "slowing down the XML effort"? I don't believe they exist. In fact, it sounds like "slowing down the XML effort" is synonymous in Chris' mind with "wanting XML to be standard", which has certainly not been demonstrated: in fact, there are calls for greater layering, not for increased divergence. 
It should be plain to everyone by now that W3C specifications are not treated by vendors as standards which should be strictly adhered to: they are treated as sources of APIs which can be embraced and extended, or partially implemented. W3C does not have the authority to check, demand or expect conformance: either moral or legal. A W3C spec needs all the help it can get to ensure complete implementation: being an ISO standard helps. One can see the same attitude at work with the MIME RFC: it was thoroughly debated by the XML WG and SIG, with input from other major goups such as WebDAV, and has been out for quite a while. But if someone doesn't agree they are quite happy to be non-conforming. Standards are a discipline: it is easy to diverge and difficult to interoperate. Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 23 02:42:53 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:20 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU> References: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <14070.64712.690500.339308@localhost.localdomain> Rick Jelliffe writes: > From: David Megginson >SGML does nothing that XML cannot do. > > I don't know how Dave can say that. >From a system-architecture perspective, my statement is true -- what we're discussing here are simply implementation details. 
I agree that having to use PUA characters rather than special entities is a mild annoyance (in the past, I have dealt with similar problems trying to represent specialised characters in early medieval English manuscripts, including variant graphemes of the same graph).

> In SGML I can short-reference these codepoints to entity which
> points to the appropriate glyphs and which has other data
> attributes to describe character properties.
>
> In XML, to do this I have to write a special program to simulate
> this behaviour.

In SGML, you have to write a special program to act on the information in the data attributes (nothing does this out of the box); in XML, you have to write a special program to act on the PUA. I'd say that SGML scores 5.2 out of 6 on non-canonical characters (because its approach is slightly more modular and maintainable), while XML scores 5.0 (because it still works).

But again, you *can* represent non-canonical characters in both, and the difference is too trivial to interest anyone but hard-core SGML wonks like Rick and me -- it certainly wouldn't be worth spending time on at a large project-management meeting.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
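As a rough idea of what "a special program to act on the PUA" involves on the XML side, the sketch below only locates private-use code points in a string; what to do with them (look up a glyph, attach properties) is left out. The class name `PrivateUseScanner` is hypothetical, and the check relies on the standard `Character.getType` classification.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that finds private-use characters in character data. */
final class PrivateUseScanner {
    /** Returns the private-use code points in the text, in order of appearance. */
    static List<Integer> privateUseCodePoints(String text) {
        List<Integer> found = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            // PRIVATE_USE covers the BMP private-use area and the supplementary planes 15 and 16.
            if (Character.getType(cp) == Character.PRIVATE_USE) {
                found.add(cp);
            }
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return found;
    }
}
```

The application-specific part, mapping each code point to a glyph and its properties, is exactly the information that the SGML approach keeps in entity declarations and data attributes.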
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Tue Mar 23 02:45:11 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:10:20 2004 Subject: XML complexity, namespaces (was WG) References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org> <36F6D46A.FB33D473@allette.com.au> <14070.56970.50161.169467@localhost.localdomain> Message-ID: <36F70005.3D1F76D2@allette.com.au> David Megginson wrote: [... some excellent points about SGML and XML that I completely agree with, then:] > The question, however, is whether there is a real benefit to > supporting two slightly-variant standards that, in the view of a > system architect, accomplish exactly the same thing in pretty much the > same way. No question - it would be better if there was a single standard, but the demise of SGML should be natural, driven by nothing other than natural attrition. If it is to go, it will go because organisations finish mapping datasets across and start using some of the sexy new tools that we're currently waiting for, obviating the need for SGML. It may well eventuate that SGML ceases to be required, but until that time, we have a responsibility to ensure that discussion of the relative positions of the two should be predominately free of passion and politics. (Yes, that should apply to both sides and no, the previous comment was not directed at David - I may not agree with all of his opinions, but I believe them to be well-considered.) 
-- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Tue Mar 23 03:24:48 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:10:20 2004 Subject: RDF DTD ? References: <005a01be74d3$741139c0$11f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <36F7095A.F9744939@allette.com.au> Rick Jelliffe wrote: > I have written an XML DTD fragment for RDF and put it online at > http://xml.ascc.net/xml/en/utf-8/resource-index.html I think this should be http://xml.ascc.net/xml/en/utf-8/resource_index.html. -- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein xml-dev: A list for W3C XML Developers. 
From martind at netfolder.com Tue Mar 23 03:28:25 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:10:20 2004
Subject: Is this invalid?
In-Reply-To: <36F6CDD6.72D1F8CE@w3.org>
Message-ID:

Hi Chris,

Unless I am missing something here, the answer is really obvious - in the XML specification. http://www.w3.org/TR/REC-xml I presume you mean something more complicated?

Steve Newcomb gave me a good answer about the markup, and I should thank him for giving this information. I didn't know the origin of this markup and encountered it often in HyTime documents. I thought that it had been transferred to XML because XML is supposed to be a subset of SGML. But Steve pointed out that this markup is not even an SGML standard, because it is not yet part of the new SGML spec. So let's put it in perspective: it is a common practice but not yet part of a published standard.

About uppercase versus lowercase in prolog declarations: I finally found that the spec specifies uppercase by example, as in the following productions:

[52] AttlistDecl ::= '<!ATTLIST' S Name AttDef* S? '>'
[53] AttDef ::= S Name S AttType S DefaultDecl

I had some doubts after it was argued that these reserved keywords could be written in either uppercase or lowercase; my parser only accepts uppercase. I am reassured now: the parser is OK.

Regards
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From martind at netfolder.com Tue Mar 23 03:38:13 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:10:20 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU> Message-ID: Hi From: David Megginson SGML does nothing that XML cannot do. Out of simple curiosity: is it possible to declare an architectural instance from an architectural form in XML by strictly following the XML 1.0 spec? I do not mean simply having the architectural elements as our element properties, but declaring in the prolog the correspondence between each markup and each architectural element. Regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Mar 23 03:47:47 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:10:20 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU> from "Rick Jelliffe" at Mar 23, 99 01:31:12 pm Message-ID: <199903230450.XAA22944@locke.ccil.org> Rick Jelliffe scripsit: > In SGML I can short-reference these codepoints to entity which points to > the appropriate glyphs and which has other data attributes to describe > character properties. > > In XML, to do this I have to write a special program to simulate this > behaviour. At last, someone who wants the LocalMarkupFilter (level 1) I specced out but never implemented because everybody pooh-poohed it. This is a SAX filter that detects some PIs and processes character data. Each properly declared character in the content of a specified element is transformed into an empty element. Here are the PIs: says that any characters in the content of the element "elementname" are transformed according to "mapname". says that when map "mapname" is in effect, the character "x" is changed into an empty element named "elementname", with an attribute "char" saying what the character was. This is not as flexible as shortrefs, assuming I understand them correctly (can be more than one character long, and are transformed into an arbitrary entity, not a fixed element name) but is easily layered over SAX. Are you interested? -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. 
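The filter John Cowan describes above can be sketched roughly as follows. This is a minimal sketch only, using Python's standard xml.sax.saxutils.XMLFilterBase. The syntax of the two PIs is not shown in the message, so this version takes the two tables as constructor arguments instead of reading them from PIs; the class layout, the argument names element_maps and char_maps, and the helper apply_filter are all assumptions made for illustration. It also assumes the map applies only to character data directly inside the named element, not to its descendants.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class LocalMarkupFilter(XMLFilterBase):
    """Hypothetical sketch: turn declared characters into empty elements."""

    def __init__(self, parent, element_maps, char_maps):
        super().__init__(parent)
        self._element_maps = element_maps  # element name -> map name
        self._char_maps = char_maps        # map name -> {char: element name}
        self._stack = []                   # active map (or None) per open element

    def startElement(self, name, attrs):
        # Look up the map in effect for this element, if any.
        self._stack.append(self._char_maps.get(self._element_maps.get(name)))
        super().startElement(name, attrs)

    def endElement(self, name):
        self._stack.pop()
        super().endElement(name)

    def characters(self, content):
        active = self._stack[-1] if self._stack else None
        if not active:
            super().characters(content)
            return
        run = []
        for ch in content:
            if ch in active:
                if run:  # flush ordinary text collected so far
                    super().characters("".join(run))
                    run = []
                # Emit an empty element whose "char" attribute records
                # the character that was replaced.
                super().startElement(active[ch], AttributesImpl({"char": ch}))
                super().endElement(active[ch])
            else:
                run.append(ch)
        if run:
            super().characters("".join(run))


def apply_filter(xml_text, element_maps, char_maps):
    """Parse xml_text through the filter and return the serialised result."""
    out = io.StringIO()
    filt = LocalMarkupFilter(xml.sax.make_parser(), element_maps, char_maps)
    filt.setContentHandler(
        XMLGenerator(out, encoding="utf-8", short_empty_elements=True))
    filt.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()
```

For example, apply_filter('<doc><p>a*b</p></doc>', {"p": "m1"}, {"m1": {"*": "star"}}) rewrites the asterisk inside "p" as an empty "star" element carrying char="*". Because the filter is itself a SAX reader, any downstream handler can be attached in place of XMLGenerator.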
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Tue Mar 23 04:31:13 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:10:20 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: <14070.56970.50161.169467@localhost.localdomain>; from David Megginson on Mon, Mar 22, 1999 at 08:45:53PM -0500 References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org> <36F6D46A.FB33D473@allette.com.au> <14070.56970.50161.169467@localhost.localdomain> Message-ID: <19990323153036.A9794@io.mds.rmit.edu.au> On Mon, Mar 22, 1999 at 08:45:53PM -0500, David Megginson wrote: > Marcus Carr writes: > > > There is what I believe to be a misconception amonst some portions > > of the XML community that XML and SGML are locked in some sort of > > competition, but I don't see any of the same feeling from the SGML > > community. The conclusion that I draw from this is that there is > > some sort of insecurity, perhaps due to the fact that XML feels > > that it must replace SGML in order to ensure its survival. The > > SGML community sees XML as a great boon - a truly sweet way of > > using the data and realising the long-term effort that they have > > put into their datasets. > > XML does nothing that SGML cannot do. When developing the TOC management system for our document fragmenting toolkit, we chose XML to represent the TOC. 
SGML was not an option, because we didn't know the content model in advance and couldn't build it automatically from the DTDs of the individual documents. Also, we couldn't use a homogeneous element tree with attributes, because we actually extracted structured content from the documents for insertion into the TOC (sure, we could have serialised the content into an SGML attribute, but that would have been a perverse and painful alternative to simply using XML). > SGML does nothing that XML cannot do. On several occasions I have had to import textual information, and have been able to treat the data as SGML with an appropriate choice of shortrefs. With XML I would have been forced to write an intermediate translation layer and would have consequently lost the originals (or been forced to store both the original and the transformed document, or add the extra layer to every access). True, shortrefs are not always adequate for the job, but I certainly would not have happily forgone them in my project because they wouldn't have been useful in someone else's project! > There are some differences in the ways that XML and SGML accomplish > the same thing, but those differences are trivial and unimportant from > an architectural perspective. Whether the differences are trivial is a matter for the requirements spec. to decide, rather than something you can decree a priori. There will often be cases where such "trivial" differences can have a profound impact on the cost and complexity of a project. > XML benefited from (at the time) 12 years of SGML industry experience > by eliminating a lot of original SGML features (such as the ability to > vary the delimiter set or to omit tags) that turned out to be > obfuscatory design mistakes. I couldn't disagree more, for all the above reasons. One man's design mistakes are another man's salvation. > SGML benefits from 13 years of industry experience in the form of a > small base of stable, production-quality COTS and OSS.
> > The question, however, is whether there is a real benefit to > supporting two slightly-variant standards that, in the view of a > system architect, accomplish exactly the same thing in pretty much the > same way. If they did, there might be an issue to resolve. But they don't, so there isn't. Both will continue to be used and developed, and this is as it should be. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ From ricko at allette.com.au Tue Mar 23 04:49:06 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:20 2004 Subject: XML complexity, namespaces (was WG) Message-ID: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU> From: David Megginson >In SGML, you have to write a special program to act on the information >in the data attributes (nothing does this out of the box); in XML, you >have to write a special program to act on the PUA. Huh? OmniMark allows access to data attributes just as easily as element attributes (http://www.omnimark.com/develop/om40/doc/concept/646.htm), out of the box. Several CALS-aware tools understand the notations used in data attributes, e.g., when used for graphics. And I don't agree that elements and characters and attributes and entities should be thought of as interconvertible: search routines look for character codes--I don't know of any search routines which allow grepping on data and elements. Rick Jelliffe xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Tue Mar 23 05:33:04 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:10:20 2004 Subject: LocalMarkupFilter (was Re: XML complexity, namespaces (was WG)) References: <199903230450.XAA22944@locke.ccil.org> Message-ID: <007f01be74ee$14b07660$0300000a@cygnus.uwa.edu.au> ----- Original Message ----- From: John Cowan > At last, someone who wants the LocalMarkupFilter (level 1) I specced > out but never implemented because everybody pooh-poohed it. [...] > Are you interested? I like the idea of it. And on a different problem but, I think, similar solution, could you do local character data mapping the same way? I'd like a nice way of being able to use some transliteration when hand-editing XML and have it mapped to the appropriate Unicode code points (eg I'd like to say "in this element, map B to β") Mind you, I need to be able to map more than one character. [...] > Here are the PIs: > > says that any characters in the > content of the element "elementname" are transformed according to > "mapname". This sounds like a job for notations, where each mapname is a notation. James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Mar 23 05:38:02 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:10:20 2004 Subject: LocalMarkupFilter (was Re: XML complexity, namespaces (was WG)) In-Reply-To: <007f01be74ee$14b07660$0300000a@cygnus.uwa.edu.au> from "James Tauber" at Mar 23, 99 01:21:31 pm Message-ID: <199903230642.BAA26432@locke.ccil.org> James Tauber scripsit: > And on a different problem but, I think, similar solution, could you do > local character data mapping the same way? I'd like a nice way of being able > to use some transliteration when hand-editing XML and have it mapped to the > appropriate Unicode code points (eg I'd like to say "in this element, map B > to β") > > Mind you, I need to be able to map more than one character. But still a 1-1 mapping? That would be easy to incorporate. > This sounds like a job for notations, where each mapname is a notation. But XML notations don't have attributes, so what is gained? -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Tue Mar 23 06:06:51 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:10:21 2004 Subject: LocalMarkupFilter (was Re: XML complexity, namespaces (was WG)) References: <199903230642.BAA26432@locke.ccil.org> Message-ID: <00c701be74f2$ceb18280$0300000a@cygnus.uwa.edu.au> > > Mind you, I need to be able to map more than one character. > > But still a 1-1 mapping? That would be easy to incorporate. I was initially concerned that what I wanted involved context-sensitive mapping but I no longer think it does, so, yes, it should be easy. > > This sounds like a job for notations, where each mapname is a notation. > > But XML notations don't have attributes, so what is gained? I was thinking of just the association of map with element (ie the first PI). If I understand correctly, your first PI associates the mapping with all elements of a named type. I would like the flexibility of being able to control that on an element-by-element basis. An attribute seems a good way of doing this (if all elements of a type have the same mapping, you can have an attribute default). So what you end up with is an attribute that, in effect, is saying how to process the character data in the content (sounds like a notation attribute right?) It would be useful, I think, to make such a specification independent of the particular usage by the LocalMarkupFilter. Other applications might want to know about it to. So a more general solution, IMHO, would be to have the mapping triggered by notation. 
In fact, rather than *replacing* your first PI, that PI could remain but instead map a mapname to a notation. To me that is more in the spirit of descriptive/generic markup. James From grove at infotek.no Tue Mar 23 07:48:31 1999 From: grove at infotek.no (Geir Ove Grønmo) Date: Mon Jun 7 17:10:21 2004 Subject: ANN: tmproc 0.10, a Topic Map implementation Message-ID: Hello, I'm pleased to announce the first release of tmproc, a Topic Map processor. This release is meant to be a technology preview. Enjoy! Geir O. -------------------------------------------------------------------------- Title: tmproc Version: 0.10 Released: March 23rd 1999 Author: Geir O. Grønmo, grove@infotek.no Homepage: http://www.infotek.no/~grove/software/tmproc/index.html Requirements: - Python 1.5.1 or newer [1] - An SGML/XML parser with a SAX driver - SAX for Python [2] - xmlarch 0.25, optional unless architectural processing is needed [3] - -- >>> What is tmproc? tmproc is an implementation of the new international standard ISO/IEC 13250 Topic Maps[4]. tmproc is written in Python, and it should work on any platform to which Python has been ported[1]. tmproc is a set of classes that represents a framework for doing topic map processing in Python. The current release includes the following set of classes: o classes for representing topic map objects like TopicMap, Topic, TopicName, Occurrence, Locator, Association, AssociationRole, Facet and FacetValue. o a factory class for creating topic map objects. o a class for importing topic maps, TMImporter.
It listens to SAX events and uses a factory class and interfaces to build a Topic Map. o an export class, TMExporter, that emits SAX events in the topic map interchange format so that any SAX document handler may be used for export. o statistical and information printing classes, TMUtils and TMStats. A command line utility is also included in the distribution. The implementation is currently based on a draft released some time before the final ballot. Some deviations from the - soon to be released - final standard are expected. Currently only an in-memory implementation is available. A relational database implementation has also been written, but is not available in the distribution because it is a bit crude at the moment. Fortunately tmproc has been written in a way that makes it easy to do additional implementations. - -- >>> Some of the features are: o Import, export, query and manipulation of topic maps. o Full set of extensible topic map classes with clearly defined interfaces. Association, AssociationRole, Facet, FacetValue, Locator, Occurrence, Topic, TopicMap, TopicMapFactory and TopicName. o Access to data in topic map objects using getter and setter methods. o Get types including transitive types of topics, associations and facets. o Get objects [e.g. topics, associations and facets] that are of given types or more specific types. o Get objects [e.g. associations] that exist in a scope or in any of the scopes' subscopes. o Optional architectural processing [requires xmlarch]. o Introduction and reference documentation. Suggestions and bug reports should be sent to: grove@infotek.no - -- [1] http://www.python.org/ [2] http://www.stud.ifi.uio.no/~larsga/download/python/xml/saxlib.html [3] http://www.infotek.no/~grove/software/xmlarch/index.html [4] Final CD Text for ISO/IEC 13250, Topic Navigation Maps, http://www.ornl.gov/sgml/sc34/document/0008.htm
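The importer-plus-factory arrangement described in the announcement can be illustrated with a small sketch. This is not tmproc's code or API: the classes Topic, TopicFactory and TopicImporter, the function import_topics, and the element names "topic" and "name" are all invented here for illustration, and the real interchange format is considerably richer. It only shows the general shape of a SAX handler that delegates object creation to a replaceable factory, which is what would let an in-memory and a database-backed implementation share one importer.

```python
import xml.sax


class Topic:
    """Hypothetical in-memory topic object."""

    def __init__(self, topic_id):
        self.id = topic_id
        self.names = []


class TopicFactory:
    """Hypothetical factory; a database-backed variant could replace it."""

    def create_topic(self, topic_id):
        return Topic(topic_id)


class TopicImporter(xml.sax.ContentHandler):
    """Listens to SAX events and builds topics through the factory."""

    def __init__(self, factory):
        super().__init__()
        self.factory = factory
        self.topics = {}
        self._current = None   # topic being built, if any
        self._text = None      # character buffer while inside <name>

    def startElement(self, name, attrs):
        if name == "topic":
            self._current = self.factory.create_topic(attrs.get("id"))
        elif name == "name" and self._current is not None:
            self._text = []

    def characters(self, content):
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "name" and self._text is not None:
            self._current.names.append("".join(self._text).strip())
            self._text = None
        elif name == "topic" and self._current is not None:
            self.topics[self._current.id] = self._current
            self._current = None


def import_topics(xml_text, factory=None):
    """Parse xml_text and return a dict of topic id -> Topic."""
    importer = TopicImporter(factory or TopicFactory())
    xml.sax.parseString(xml_text.encode("utf-8"), importer)
    return importer.topics
```

Since the importer only sees SAX events, the same handler would work behind any SAX driver, or behind an architectural forms processor that itself emits SAX events.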

tmproc 0.10 - an implementation of the new international standard ISO/IEC 13250 Topic Maps. (22-Mar-99) -- ================== Geir Ove Grønmo ================== | STEP Infotek as, Gjerdrumsvei 12, 0486 Oslo, Norway | | grove@infotek.no http://www.infotek.no/ | ------------------------------------------------------- From grove at infotek.no Tue Mar 23 07:57:44 1999 From: grove at infotek.no (Geir Ove Grønmo) Date: Mon Jun 7 17:10:21 2004 Subject: ANN: xmlarch 0.25, an XML architectural forms processor Message-ID: xmlarch: An XML architectural forms processor written in Python Version: 0.25 Released: March 23rd 1999 Author: Geir Ove Grønmo Email: grove@infotek.no Homepage: http://www.infotek.no/~grove/software/xmlarch/index.html --- What is xmlarch? The xmlarch module contains an XML architectural forms processor written in Python. It allows you to process XML architectural forms using any parser that uses the SAX interfaces. The module allows you to process several architectures in one parse-pass. Architectural document events for an architecture can even be broadcast to multiple DocumentHandlers. The main reason for releasing this version is to be able to support architectural processing with tmproc[1]. Topic Map processing relies heavily on the existence of the #GI mapping. What's new? - Added support for the new #GI mapping token. - Added a method called get_current_element_name() to the ArchDocHandler class, so that you can easily keep track of the original generic identifier. Fixes: - Bug related to the mapping between attributes and content.
- Some minor ones. [1] http://www.infotek.no/~grove/software/tmproc/index.html --- Enjoy! Geir Ove Grønmo -- ================== Geir Ove Grønmo ================== | STEP Infotek as, Gjerdrumsvei 12, 0486 Oslo, Norway | | grove@infotek.no http://www.infotek.no/ | ------------------------------------------------------- From chris at w3.org Tue Mar 23 08:08:01 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:10:21 2004 Subject: IE5.0 does not conform to RFC2376 References: <199903230328.WAA19702@locke.ccil.org> Message-ID: <36F74B26.21CF46EC@w3.org> John Cowan wrote: > > Chris Lilley scripsit: > > > Okay. But does RFC 2376 conflict with the XML 1.0 Recommendation? > > The Recommendation basically says "We yield to the RFC once it is > published". > > > When the charset parameter is not specified, it is assumed as US-ASCII. > > > > Wow. So, what this RFC says is that, when used in email and on HTTP, the > > encoding declaration is *always ignored*. > > Unfortunately this is a side effect of the rules for the media type > "text/*", which says that the default value of "charset" is always US-ASCII. The default rules only if no other rule is in place for a specific media type. The registration for text/xml can override this behaviour if it wishes to. -- Chris xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Tue Mar 23 08:48:12 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:10:21 2004 Subject: Mozilla/milestone3 Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A16E3@EUKBANT101> The rendering was exactly correct on Win32 M3. More than that - I could do a "view source" and get the XML (but not the XSL - but I think that's expected - you can't do that with css either). This 6 months is going to be a long but fun wait... Matt. -- http://come.to/fastnet Perl on Win32, PerlScript, ASP, Database, XML GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ > -----Original Message----- > From: Incze Lajos [SMTP:incze@mail.matav.hu] > Sent: Tuesday, March 23, 1999 1:22 AM > To: xml-dev@ic.ac.uk > Subject: Mozilla/milestone3 > > If anybody is interested - I just checked the new > mozilla browser on Tim Bray's Explorer5 article in XML. > I'm running Linux, so don't really know whait would be > it look like on IE5. In the Mozilla it has a grey background with white > background / red bordered boxes > in it, red section headers and green anchor color. (They can be the > defaults.) The rendering is acceptable. > Incze > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Tue Mar 23 08:53:15 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:10:21 2004 Subject: IE5.0 does not conform to RFC2376 References: <199903230328.WAA19702@locke.ccil.org> Message-ID: <36F755CB.C996CE2D@w3.org> John Cowan wrote: > Chris Lilley scripsit: > > Wow. So, what this RFC says is that, when used in email and on HTTP, the > > encoding declaration is *always ignored*. > > Unfortunately this is a side effect of the rules for the media type > "text/*", which says that the default value of "charset" is always US-ASCII. > The alternative is to use "application/xml", which has no such > obnoxious rule. So, in consequence: example file such as the Chinese XML examples at http://xml.ascc.net/xml/test/index.html (where each example is available in UTF-8, Big5 and GB2312, all correctly labelled in the XML encoding declaration) are now sets of invalid XML files which are required to produce a critical error because of the invalid byte sequences in what is now described as a US-ASCII file? This is deeply counterproductive, and could have been avoided. -- Chris xml-dev: A list for W3C XML Developers. 
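The consequence Chris Lilley describes above can be seen with a few lines of Python. This is only an illustration of the byte-level effect, not of any particular XML processor: the helper try_decode and the sample document are made up for the example. It shows that bytes which are valid under the declared Big5 encoding cannot be read at all once a US-ASCII default is imposed from outside.

```python
def try_decode(data, charset):
    """Return the decoded text, or None if the bytes are invalid in charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# A document whose encoding declaration says Big5 and whose bytes are Big5.
DOC = '<?xml version="1.0" encoding="Big5"?><t>\u4e2d\u6587</t>'.encode("big5")

# Honouring the encoding declaration: the document decodes cleanly.
as_declared = try_decode(DOC, "big5")

# Imposing the text/* default: the two-byte characters are not valid US-ASCII.
as_default = try_decode(DOC, "us-ascii")
```

Here as_declared holds the full document text, while as_default is None, which corresponds to the fatal error a conforming processor would have to report.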
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Tue Mar 23 08:56:37 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:10:21 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU>; from Rick Jelliffe on Tue, Mar 23, 1999 at 03:51:11PM +1100 References: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <19990323195615.C9794@io.mds.rmit.edu.au> On Tue, Mar 23, 1999 at 03:51:11PM +1100, Rick Jelliffe wrote: > > From: David Megginson > >In SGML, you have to write a special program to act on the information > >in the data attributes (nothing does this out of the box); in XML, you > >have to write a special program to act on the PUA. > > Huh? OmniMark allows access to data attributes just as easily as element > attributes (http://www.omnimark.com/develop/om40/doc/concept/646.htm), > out of the box. Several CALS-aware tools understand the notations used > in data attributes, e.g., when used for graphics. > > And I dont agree that elements and characters and attributes and > entities should be thought of as interconvertable: search routines look > for character codes--I don't know of any search routines which allow > grepping on data and elements. SIM builds indexes on arbitrary expressions. This allows you to index content, attributes, and even processing instructions if you want. When doing path indexing, a search engine can treat attributes as nodes of the tree rather than special things attached to nodes (one possibility is to treat them as child elements with an '@' in front of the element name). 
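The idea of indexing attributes as if they were child nodes named with a leading '@' can be sketched as below. This is not how SIM is implemented; it is a minimal illustration using Python's standard xml.etree.ElementTree, and the function name build_path_index and the choice of a path-to-values dictionary are assumptions made for the example. Tail text after child elements is ignored to keep the sketch short.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(xml_text):
    """Map each element path, and each '@attribute' path, to its values."""
    index = defaultdict(list)

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        # Attributes are indexed like child elements named '@name'.
        for name, value in elem.attrib.items():
            index[f"{here}/@{name}"].append(value)
        # Leading text content is indexed under the element's own path.
        text = (elem.text or "").strip()
        if text:
            index[here].append(text)
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return dict(index)
```

With this layout a search for "/doc/title/@lang" uses exactly the same lookup as a search for "/doc/title", so content and attributes are queried uniformly.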
Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Tue Mar 23 09:49:09 1999 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 17:10:21 2004 Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG)) In-Reply-To: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie> This is an interesting thread. Many non-tag-minimization reliables can be put forth as things that SGML "can do" that XML cannot. Things like data attributes, exclusion exceptions, internal SDATA entities and so on. I see SGML and XML at opposite ends of a balanced lever. On one side we have SGML - high on declarative syntax, low on home grown code. On the other side we have XML - low on declarative syntax, high on home grown code. SGML gives you declarative syntax that can obviate the need for coding around certain types of data modelling, content authoring problems. XML is light on the declarative syntax, leaving more in the realm of "application specific" implementation in a programming language. Ultimately, both views have their place and both may be "correct" for a given problem domain. For me, I favour the XML side of the lever. Any declarative syntax has its limits. It has been my experience that the limits of SGML's declarative syntax are quickly reached.[1] Any SGML system I have ever worked on has a large collection of ancilliary software to perform validation, data aggregation, authoring short-cuts that are not possible with pure SGML syntax. XML fills a nice 80/20 niche here. 
20% of SGML's declarative syntax is used 80% of the time. XML draws a line in the sand saying "here is the most useful 80% in an all-round cheaper package. You will need to write processing software on top of this but hey! You would need to do that with SGML anyway." Analogies abound. What does it mean to say you have your data in third normal form in a relational database? It means that you have a base data model that is interchangeable amongst relational database systems. *But*, and it is a big *but*, the rest of the stuff that makes up the solution is in some application-specific 4GL. Declarative syntax does not put bread on my table. Solutions to business problems using the beautiful ideas of SGML put bread on my table. XML gives me a nice package that gives me most of what I want in terms of a robust, simple implementation of the SGML philosophy. I will build software around this package all day long without ever once missing an SGML feature. What's more, I'll do it in an open, standardized, cheap programming language that gets the job done fast.[2] When I go under the bus, I believe my customers are in a better state than they would have been if I'd pulled every obscure SGML declarative syntax trick in the book. [1] Notations are, for me, the classic example of the limitations of a declarative syntax and how a declarative syntax feature can subtly fool you into thinking you have solved a problem when all you have done is defer it. You hit, say, a data validation problem that cannot be solved with SGML syntax alone so you invent a notation for it. Knock up the declarative syntax for it. Lovely. It all parses. *However* the declarative syntax does not do anything. You still need to implement it as a processing layer above SGML. [2] Python http://www.python.org xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Tue Mar 23 11:08:04 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:21 2004 Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG)) In-Reply-To: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie> References: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie> Message-ID: * Sean Mc Grath | | For me, I favour the XML side of the lever. Any declarative syntax | has its limits. It has been my experience that the limits of SGML's | declarative syntax are quickly reached.[1] Any SGML system I have | ever worked on has a large collection of ancilliary software to | perform validation, data aggregation, authoring short-cuts that are | not possible with pure SGML syntax. I tend to favour the XML side myself (unless I have to write the documents manually), and I think most people will do so. To me, XML and SGML are a perfect example of what happens when the worse-is-better and the-right-thing philosophies collide. (Even though SGML doesn't really qualify as the-right-thing.) The main problem with SGML is the complexity of the syntax, which means that you need a large and complex application to get hold of your data, and as Gabriel prophesied this means that you have few choices of applications. For XML we are beginning to see what we never saw with SGML: a plethora of pluggable processing components. Much of this is due to SAX, I think, but much is also due to the simpler nature of XML syntax. 
I'm pretty sure that SAX2 will only reinforce this trend by making it easier to develop and plug together parser filters and other such components. Better design of XSL processors to allow the introduction of SAX components at various points (of which 4XSL seems to be a good example) would also help. Likewise with toolkits like SAXON. In fact, the only downside is that most of this is happening in a language as awkward as Java. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 23 11:15:16 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:21 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: <36F70005.3D1F76D2@allette.com.au> References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org> <36F6D46A.FB33D473@allette.com.au> <14070.56970.50161.169467@localhost.localdomain> <36F70005.3D1F76D2@allette.com.au> Message-ID: <14071.29413.900398.832442@localhost.localdomain> Marcus Carr writes: > No question - it would be better if there was a single standard, > but the demise of SGML should be natural, driven by nothing other > than natural attrition. I agree, and in fact, it's not really a question of demise at all -- XML is just another iteration of SGML, and SGML is still the International Standard that provides its foundation. 
Everything that we learned in SGML is there in XML, and all the careful thought and person-years of work from Charles Goldfarb and the other members of the ISO subcommittees is the fundamental reason for XML's success. Essentially, the W3C just did what ISO was too slow at doing, and gave SGML a proper 12-year review; without the ISO baggage and the emotional attachment to the minutiae of ISO 8879:1986 esoterica, the W3C's SGML ERB cum XML WG was able to wield a sharp knife and cut away a lot of fat (though still not all of it). Very soon, I expect that ISO 8879 will pass the flag to XML and move to a legacy position (no one will be implementing new systems that use it), but that won't happen until the rest of the XML enterprise-level software support stabilises. Even then, there will be major SGML systems running for decades -- it is a credit to both the SGML and XML designers that cross-translation between SGML and XML for import/export is trivially simple, and that there will be few interop problems. As for standards bodies, I don't know. Perhaps XML will eventually migrate to an International Standards body of some sort -- who knows if the W3C will even exist in five years? -- or (and this might be preferable) the torch will pass to a new, better-constituted body that takes over both the W3C and IETF standards. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 23 11:28:43 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:21 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: References: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <14071.30766.638574.493702@localhost.localdomain> Didier PH Martin writes: > By simple curiosity: Is it possible to declare an architectural > instance from an architectural form in XML by strictly following > the XML 1.0 spec? I do not mean here to simply have the > architectural elements as our element properties but to declare in > the prolog the correspondance between each markup and each > architectural element. Yes -- this works in both SGML and XML: in XML, the architectural declarations use alternatives to data attributes. Please, everyone, remember that my statement was that there is nothing that SGML does that XML cannot do (and vice-versa), not that they always do them in the same way. Please step back and take the perspective of a system architect, who is not concerned with the minutiae of tag omission, data attributes, or ignorable whitespace: XML and SGML both provide a clear-text serialisation format for a single-rooted hierarchical tree, with the ability to impose arbitrary directed graphs on top of that tree. Nodes are named and have named properties as well as children, and a node's children can contain both data and other nodes. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 23 11:34:41 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:21 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU> References: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <14071.31214.638556.368976@localhost.localdomain> Rick Jelliffe writes: > > From: David Megginson > >In SGML, you have to write a special program to act on the information > >in the data attributes (nothing does this out of the box); in XML, you > >have to write a special program to act on the PUA. > > Huh? OmniMark allows access to data attributes just as easily as element > attributes (http://www.omnimark.com/develop/om40/doc/concept/646.htm), Yes, so does SP. But (with the exception you note below) you still have to write an Omnimark or Perl or C++ program to act on the information in the data attributes. > out of the box. Several CALS-aware tools understand the notations used > in data attributes, e.g., when used for graphics. I agree that there are some tools already written that understand specific data attributes in specific cases, but the general case, you still have to write a specialised program (using Omnimark, Perl, or whatever) to do something useful with the data attributes, just as you have to write a specialised program (using Java, Perl, or whatever) to do something useful with PUA characters in XML. 
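For illustration only, the kind of specialised program described above for private-use-area (PUA) characters might look like the following Python sketch. Everything specific in it is an assumption: the code points and descriptions in PUA_TABLE, the function names, and the sample "form" attribute used in the usage note are invented, and a real project would take its table from whatever private agreement the sender and receiver share.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical private agreement between sender and receiver: these code
# points and descriptions are invented for this sketch.
PUA_TABLE = {
    "\uE000": "variant d, form 1",
    "\uE001": "variant d, form 2",
}


def is_private_use(ch):
    """True for any Unicode private-use character (general category Co)."""
    return unicodedata.category(ch) == "Co"


def expand_pua(text):
    """Replace private-use characters with a readable description."""
    out = []
    for ch in text:
        if not is_private_use(ch):
            out.append(ch)
        elif ch in PUA_TABLE:
            out.append("{" + PUA_TABLE[ch] + "}")
        else:
            # The document used a character the software was never told about.
            out.append("{unknown U+%04X}" % ord(ch))
    return "".join(out)


def expand_attributes(xml_text):
    """Parse a document and expand PUA characters in every attribute value.

    Returns a list of (element name, attribute name, expanded value) tuples
    in document order.
    """
    root = ET.fromstring(xml_text)
    return [
        (elem.tag, name, expand_pua(value))
        for elem in root.iter()
        for name, value in elem.attrib.items()
    ]
```

Calling expand_attributes('<line><w form="&#xE000;eor"/></line>') walks the one attribute and substitutes the table entry for U+E000. The XML itself carries no information about what the character means; all of that knowledge sits in PUA_TABLE inside the program.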
> And I dont agree that elements and characters and attributes and > entities should be thought of as interconvertable: search routines > look for character codes--I don't know of any search routines which > allow grepping on data and elements. Perhaps I misunderstood -- I thought that you were talking about the problem of including specialised, non-canonical characters in attribute values (say, to represent three variant 'd' graphemes in a 10th-century English manuscript or a customised Han character). I think that PUA characters provide a good solution for that problem -- the only difficulty is that all of the knowledge about those characters has to be encoded in the processing software using a lookup table, while the SGML data-attribute solution is slightly more modular since you can pass on extra generic information. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From reschke at medicaldataservice.de Tue Mar 23 11:50:51 1999 From: reschke at medicaldataservice.de (Julian Reschke) Date: Mon Jun 7 17:10:21 2004 Subject: SQL queries expressed in XML Message-ID: <001201be7523$5fb1b720$2e00a8c0@julian> Andrew McNaughton wrote: > > > we recently had the idea to use XML to express SQL-like queries > > > (so this is > > > not about querying XML -- it is about using XML to express queries). It > > > seems to me that we might not be the first ones; so has anybody defined an > > > XML document type for expressing SQL queries? 
> > > > And just to widen this question slightly - assuming I do have an XML > > representation > > of a language construct - whats the best way to do the conversion from > > the XML representation to the 'correct' language representation. > > > > Could I use XSL to do this - or would this be going against the grain? > > > > (Just to qualify this I'm relatively new to XML, and *extremely* new to > > XSL). > > XSL doesn't seem to do very well where the desired output is not well formed. > If your SQL queries have '"', '<', '>' or '&' in them, then you're going to > start getting into kludges. perl or DSSSL would be better suited to the task. > > *Why* do you want to put your queries into XML? Do you need access to the > structure of your queries? Perhaps you just need something that can be The idea was to reuse XML tools in a project which is XML related anyway. Expressing a query in XML instead of using a "proprietary" representation would allow us to use a standard parser to transform it into a object representation (DOM), and it would also have the benefit that standard tools could be used to actually enter or render a query string. > ... > I figure any boolean query can be expressed as a decision tree terminating in > true or false leaf nodes, that this maps well into XML, and that it should be > able to be used to search for queries matching a given document using existing > tools (eg sgrep). I believe this could lead to a relatively simple processing > model, but it remains to be seen how efficient it will be. Basically this is similar to our thinking... > If anyone is aware of any relevant work that is being or has been done I'd > appreciate hearing about it. XML or otherwise. This is precisely why I asked :-) -- Julian Reschke MedicalData Service GmbH (http://www.medicaldataservice.de) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From reschke at medicaldataservice.de Tue Mar 23 12:17:41 1999 From: reschke at medicaldataservice.de (Julian Reschke) Date: Mon Jun 7 17:10:22 2004 Subject: SQL queries expressed in XML Message-ID: <001001be7522$aa5aae40$2e00a8c0@julian> Kay Michael wrote: >> we recently had the idea to use XML to express SQL-like >> queries (so this is >> not about querying XML -- it is about using XML to express >> queries). It >> seems to me that we might not be the first ones; so has >> anybody defined an >> XML document type for expressing SQL queries? >> >I've thought about the question and some of my thoughts are implemented in >SAXON's SQLStyleSheet, which is the beginnings of an XSL extension to allow >a stylesheet to update an RDBMS with data from an XML source document. > >As always in this area the first problem is deciding how much of the syntax >should be "angle brackets" and how much should be rules for the content of >elements/attributes. The answer to that depends on tradeoffs between >different modes of use. So the question is, who is going to use it, and what >for? > >In particular if you are interested in queries, what are you planning to do >with the results? Print them out, merge them into the DOM representation of >the document, or what? I don't think that this is really relevant, because one might want to talk to a storage which doesn't even do anything XML related. However, to answer the question, I would expect to get the results either in a DOM or in an XML string. -- Julian Reschke MedicalData Service GmbH (http://www.medicaldataservice.de) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bhall at merrillhall.com Tue Mar 23 12:59:19 1999 From: bhall at merrillhall.com (Ben Hall) Date: Mon Jun 7 17:10:22 2004 Subject: MS XML 2.0 book cancelled Message-ID: I received the following email from Amazon.Com. At 09:23 PM 3/18/99 -0800, you wrote: > >Hello from Amazon.com! > >We have contacted the supplier by phone and are sorry >to report that the release of the following title has been >cancelled: > > Microsoft Corporation "Microsoft XML 2.0 Programmer's > Guide and Software Development Kit With CDROM" > >This unavailable item has been cancelled from your order. > >Your credit card will NOT BE CHARGED for this item. > >Your order has been cancelled. > >Thanks for shopping at Amazon.com, and we hope to see you again! > >Sincerely, > >Customer Service Department >Amazon.com >http://www.amazon.com >Earth's Biggest Selection > =================================== benjamin hall merrill-hall new media, inc. bhall@merrillhall.com =================================== xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From kurt.donath at lmco.com Tue Mar 23 14:08:01 1999 From: kurt.donath at lmco.com (Kurt Donath) Date: Mon Jun 7 17:10:22 2004 Subject: Microsoft XML 2.0? 
Message-ID: <36F79E80.84A9E617@lmco.com> Simon, You had posted a message to xml-dev about the Microsoft XML 2.0 book on sale at Amazon. I went and placed an order for it, then was informed today: "We have contacted the supplier by phone and are sorry to report that the release of the following title has been cancelled: Microsoft Corporation "Microsoft XML 2.0 Programmer's Guide and Software Development Kit With CDROM" This unavailable item has been cancelled from your order." Hmmm. Is this YOUR doing? Kurt Donath -- Kurt Donath 315.456.6276 Staff Systems Engineer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lockheed Martin - Enterprise Information Systems Systems Engineering / Webserv xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Tue Mar 23 14:26:26 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:10:22 2004 Subject: small problem Message-ID: <005a01be7538$b5d4e7c0$91acdccf@ix.netcom.com> Can we make single XML file which contains the data and also style of >that data( How to display in the browser ) with out having another XSL Yes it is possible, use CSS Frank ----- Original Message ----- From: Jayadeva Babu Gali To: ; Sent: Tuesday, March 23, 1999 4:43 AM Subject: small problem >Hi, > >Can we make single XML file which contains the data and also style of >that data( How to display in the browser ) with out having another XSL >if its possible can u please correct the attaching file with this mail. 
> >/***** xml file with style sheet *****/ > > xmlns:xsl="http://www.w3.org/TR/WD-xsl" > xmlns="http://www.w3.org/TR/REC-html40" > result-ns=""> > > > > > Test > > > > > > > > > > > > >

>

> > > > > > > > > > > > jayadev > gali > > > shekar > ksirsagar > > > > > > > > XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at oreilly.com Tue Mar 23 14:55:36 1999 From: crism at oreilly.com (Chris Maden) Date: Mon Jun 7 17:10:22 2004 Subject: IE5.0 does not conform to RFC2376 In-Reply-To: <36F74B26.21CF46EC@w3.org> (message from Chris Lilley on Tue, 23 Mar 1999 09:04:54 +0100) Message-ID: <199903231453.JAA00219@ruby.ora.com> > Date: Tue, 23 Mar 1999 09:04:54 +0100 > From: Chris Lilley > > The default rules if no other rule is in place for a specific Media > type. The registration for text/xml can overridfe this behaviour if > it wishes to. In theory, but not in practice. A processor that understands text/plain but not text/xml is allowed to use the rules for text/plain when encountering text/xml. So although text/xml can say, "Do X," a processor that doesn't know text/xml from text/adam may well do Y instead. Mandating that people who can't hear you must listen is not particularly effective. This is why application/xml exists: to avoid fallback text/* rules. > Date: Tue, 23 Mar 1999 09:50:19 +0100 > From: Chris Lilley > > So, in consequence: example file such as the Chinese XML examples at > http://xml.ascc.net/xml/test/index.html (where each example is > available in UTF-8, Big5 and GB2312, all correctly labelled in the > XML encoding declaration) are now sets of invalid XML files which > are required to produce a critical error because of the invalid byte > sequences in what is now described as a US-ASCII file? 
Describing files in encodings other than US-ASCII or ISO 8859-1 (or maybe other ISO 8859s) as text/anything is not a very good idea. The rules for text/* allow many unhealthy things; 8-bit data is not even a safe assumption, and line-end normalization can be a killer. The fallback rules for MIME's two-level hierarchy is only the final straw; for non-European encodings, I would use application/xml. -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 23 17:41:42 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:22 2004 Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG)) In-Reply-To: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie> References: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU> <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie> Message-ID: <14071.33102.817935.724149@localhost.localdomain> Sean Mc Grath writes: > This is an interesting thread. Many non-tag-minimization > reliables can be put forth as things that SGML "can do" that > XML cannot. Things like data attributes, exclusion exceptions, > internal SDATA entities and so on. I think that I agree with what Sean is saying here and later in the message -- think of *what* you can represent rather than *how* you represent it. For instance, let's take a graphic where we want to provide the width, height, and colour depth to the processor. 
Here's a typical, declaration-heavy (Sean's term) SGML way to do it (except that a hard-core SGMLie would use public IDs): Here's a typical XML way to do it (also works in SGML): You're modelling exactly the same information about the picture in both -- data attributes provide an alternative mechanism for modelling the information, but they do not allow you to represent anything that you could not represent without them. > I see SGML and XML at opposite ends of a balanced lever. > On one side we have SGML - high on declarative syntax, low > on home grown code. On the other side we have XML - low on > declarative syntax, high on home grown code. > > SGML gives you declarative syntax that can obviate the > need for coding around certain types of data modelling, > content authoring problems. > > XML is light on the declarative syntax, leaving more > in the realm of "application specific" implementation > in a programming language. > > Ultimately, both views have their place and both > may be "correct" for a given problem domain. Right -- the question is not whether there is a benefit to continuing to develop the two in parallel, but whether the benefit will outweigh the cost. We'll see what the market decides over the next few years. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 23 17:41:59 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:22 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: <19990323153036.A9794@io.mds.rmit.edu.au> References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org> <36F6D46A.FB33D473@allette.com.au> <14070.56970.50161.169467@localhost.localdomain> <19990323153036.A9794@io.mds.rmit.edu.au> Message-ID: <14071.31774.199352.713952@localhost.localdomain> In his message, in a part that I'm not quoting (I do respond to specific details below), Marcelo Cantos argues that it's not for us to decide whether both full SGML and XML can co-exist, and I agree -- I am simply predicting that the market might not find it worthwhile to continue developing two standards that are architecturally identical and differ even in the implementation details only in nit-picky ways. Choose an arbitrary number for the cost of continuing to develop two standards rather than one -- say, US$100M/year (if all of the big enterprise vendors have to develop, test, debug, document, support, and maintain both full SGML and XML versions of their software, as well as donate employees' time to committee work) and uncounted additional hours of free time donated by OSS writers. Do SGML-specific features like SHORTREFs, data attributes, and omissible tags sometimes make life simpler for implementors? Of course they do.
Are the differences worth US$100M/year (or whatever number you pick)? I don't know, and the decision is not ours to make, but the market will figure it out soon enough. Whatever happens, there will certainly be money to be made from supporting the existing SGML installations, so there will be good justification for backwards-compatibility in some major tools. Now, on to the specific points... Marcelo Cantos writes: > > XML does nothing that SGML cannot do. > > When developing the TOC management system for our document > fragmenting toolkit, we chose XML to represent the TOC. SGML was > not an option, because we didn't know the content model in advance > and couldn't build it automatically from the DTD's of the > individual documents. > > Also, we couldn't use a homogeneous element tree with attributes, > because we actually extracted structured content from the documents > for insertion into the TOC (sure, we could have serialised the content > into an SGML attribute, but that would have a been perverse and > painful alternative to simply using XML). There are work-arounds that you could have used in SGML, such as synthesised DTDs using ANY. Both SGML and XML *can* do this, but in your case, XML makes it a little easier (as would WebSGML). The differences are important to us, as SGML/XML implementors, but would not really concern the architect of a large system except to the point that they affected maintainability. > > SGML does nothing that XML cannot do. > > On several occasions I have had to import textual information, and > have been able treat the data as SGML with appropriate choice of > shortrefs. > > With XML I would have been forced to write an intermediate > translation layer and would have consequently lost the originals > (or been forced to store the original and transformed document, or > add the extra layer to every access). 
> > True, they are not always adequate for the job, but I certainly would > not have happily forgone them in my project because they wouldn't have > been useful in someone else's project! Or you could simply have defined a round-trip mapping -- tab-delimited fields map to elements map back to tab-delimited fields. You could also, with XML or SGML, point into the original without altering it (HyTime provides good mechanisms for doing that in SGML or XML). Again, however, Marcello is writing about implementation details, not about what SGML and XML are capable of representing in the abstract. In this case, SGML makes life a little easier for a *very* experienced designer under high-specialised circumstances. Lexically, SGML and XML differ in minor ways; logically, they are essentially identical. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eric at hellman.net Tue Mar 23 21:03:41 1999 From: eric at hellman.net (Eric Hellman) Date: Mon Jun 7 17:10:22 2004 Subject: IE5 and iso entities In-Reply-To: <004201be754e$00ddd730$3af96d8c@NT.JELLIFFE.COM.AU> Message-ID: At 3:55 AM +1100 3/24/99, Rick Jelliffe wrote: > >>>A test document (a technical article describing blue semiconductor >>>lasers, >>>>if anyone cares) is at http://nsr.mij.mrs.org/4/1/article.xml > >I have put the latest versions at http://www.ascc.net/xml/ >under the resources page. > >When I looked at your DTD I got a load error too. 
But I notice that your >version of ISOnum seems to be incomplete (at least, when I download it >to here it ends with a ------------------^ I replaced "%" with % then I got: The replacement text for a parameter entity must be properly nested with parenthesized groups. Line 43, Position 9 %ISOnum I removed the entities for [,],{,},(,) my new error was : An invalid character was found inside an entity reference. Line 191, Position 19 I tried changing it to and got the same error message. So I ask the list again: has anyone, anywhere, gotten IE5 to read ISO entity tables, or are we going to have to do entity substitution on the server side? Eric Eric Hellman Openly Informatics, Inc. http://www.openly.com/ Tools for 21st Century Scholarly Publishing xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Tue Mar 23 22:17:10 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:10:22 2004 Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG)) References: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie> Message-ID: <36F812B6.F46B5249@allette.com.au> Sean Mc Grath wrote: > SGML gives you declarative syntax that can obviate the > need for coding around certain types of data modelling, > content authoring problems. > > XML is light on the declarative syntax, leaving more > in the realm of "application specific" implementation > in a programming language. > > Ultimately, both views have their place and both > may be "correct" for a given problem domain.
That was the topic of my presentation at the XML/SGML Asia Pacific conference last year (call for papers soon to be issued). If the deliverable is simply documents that conform to a certain structure, the most flexible approach would allow you to use an SGML or XML processor depending on the task. Provided the cost of this isn't excessive (sometimes it's nothing), it can be handy to use one processor or the other. Perhaps this is partly due to the dynamics of our organisation; typically we have data delivered to us in any format and we're expected to deliver back *ML. The clients usually want this to be as "black box" as possible, so we're free to implement whatever methods and tools we see fit. During conversion, we may use an SGML parser to aid with tag omitability, but increasingly our clients want valid XML data, so it must finally be parsed with an XML parser, as well as any stages that benefit from a well-formedness check. As David Megginson mentioned the other day, this may be difficult across an organisation, but it's not difficult across a conversion team. Although I know this won't work for everyone, I prefer to consider SGML and XML as two arrows in a quiver, not two quivers. -- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 23 22:23:07 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:22 2004 Subject: SAX2: LexicalHandler draft v.1.1 In-Reply-To: <8725673C.0072EEF5.00@d53mta03h.boulder.ibm.com> References: <8725673C.0072EEF5.00@d53mta03h.boulder.ibm.com> Message-ID: <14072.4772.577599.352783@localhost.localdomain> roddey@us.ibm.com writes: > >public interface LexicalHandler > >{ > > public abstract void xmlDecl (String version, > > String encoding, > > String standalone) > > throws SAXException; > > > 1) The xmlDecl() needs another parameter. In addition to the encoding > string, which is the exact text of the string in the document, some > customers need to know what the actual encoding is (which might have been > auto-sensed.) They need this in some cases to get the document back to the > original encoding. So there should be an 'actualEncoding' parameter which > is either the same as encoding (if there was an encoding string in the > document) or the actual encoding used if not (probably in some canonical > format, since there are only about 6 auto-sensed encodings right?) With the new SAX2 modular setup, it will be possible for people to create handlers that provide this level of detail if they want. I'm still wavering about including the XML Declaration at all. > 2) I made the names for the comment, PI, and whitespace call backs > on the DTD handler have different names from those of the ones on > the document handler. This is somewhat safer in C++ since it means > not having a single method override two pure virtuals from a > mixin. 
It also allows the handler to be less stateful in the
> situation where the same object is implementing the handler for
> both document and DTD (since they then know that it's for one or the
> other without having to keep flags for that stuff, which is not
> really a biggie but I thought it was worth it.)

That's an interesting suggestion -- I don't think that the state information is too much of a burden, but we can watch closely. There's also an interop problem, since SAX 1.0 parsers already use DocumentHandler.processingInstruction() to report PIs in the DTD as well.

> 3) I report whitespace in the DTD, so that it can also be pretty
> much exactly recreated. I only report this if I'm asked to (by an
> 'advanced callbacks' flag, which also controls comments and PIs
> being reported from the DTD.)

This goes too far for the SAX core, but I'd encourage others to develop handlers like this (a crowded market is a healthy market).

> 4) I have events for the begin/end of the internal subset.

This information is available in the current lexical handler in a slightly different form: the start/endDTD() handler gives the overall boundaries, and the start/endEntity() call for "[dtd]" will delimit the external subset (if any); everything inside the DTD but outside the external subset (or other external parameter entities) is in the internal subset by default.

> 5) I have a callback for notation decl, attlist decls, and attdefs,
> which are important.

Notations are already in SAX 1.0 (as required by the XML REC). The remainder will appear in DTDDeclHandler as soon as I have a chance to draft a proposal for it.

> 6) I have a flag on each entity, element, etc... decl callback
> called 'isIgnored'. This lets the caller know that this one was
> ignored because it was a subsequent instance of a previously
> declared decl.
So they don't need to keep it if they just care > about actual content, but they do if they want to recreate the > original document (which is extremely important to some folks.) Yes, this is still an open question for DTDDeclHandler. > 7) I haven't done this yet, but some customers are insisting that > any event callback that reports a quoted string indicate whether > single or double quotes were used (again for recreation of the > original document.) This seems a bit over the top to me, since they > are equivalent, but I guess the customer is always right even when > he's wrong. That's precisely why SAX2 (I almost typed "ModSAX" -- sniff, sniff) is designed for easy extensibility and feature discovery. Business requirements will demand different types of support for different situations, and SAX2 provides a clean way to do that. I don't imagine that we'd put this kind of thing in the core, though. > That's all I can think of right now. It would really be nice if we > could map all of the information that we go through the trouble > (and overhead) of parsing to public APIs. Otherwise, customers end > up using our internal event API in order to get the information > that they require. This locks down our internal API more than we'd > like, but there is little we can do about it if they *have* to have > this extra info to do what they do. See my comments above on extensibility. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
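The DTD-boundary reporting discussed in this message can be sketched in Java. This is only an illustration under assumptions: it uses the `org.xml.sax.ext` API as it later shipped with SAX2 (`DefaultHandler2`, the `lexical-handler` property), which may differ from the v.1.1 draft under discussion, and the class name `DtdAwareCommentCollector` is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical handler: records each comment and whether it occurred inside the DTD. */
class DtdAwareCommentCollector extends DefaultHandler2 {
    final List<String> comments = new ArrayList<>();
    private boolean inDtd = false;

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        inDtd = true;   // overall DTD boundary opens here
    }

    @Override
    public void endDTD() {
        inDtd = false;  // everything reported after this is document content
    }

    @Override
    public void comment(char[] ch, int start, int length) {
        String text = new String(ch, start, length).trim();
        comments.add((inDtd ? "dtd:" : "body:") + text);
    }

    /** Parses the given document text and returns the labelled comments. */
    static List<String> collect(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            DtdAwareCommentCollector handler = new DtdAwareCommentCollector();
            reader.setContentHandler(handler);
            // Standard SAX2 property id for registering a LexicalHandler.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.comments;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

A single flag is enough here because startDTD/endDTD bracket the whole declaration; a handler that also needs to separate the external subset would additionally watch startEntity/endEntity for the name "[dtd]".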
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 23 22:33:21 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:22 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 In-Reply-To: <00ff01be74ad$c71eeed0$2ee044c6@arcot-main> References: <00ff01be74ad$c71eeed0$2ee044c6@arcot-main> Message-ID: <14072.5617.960838.731783@localhost.localdomain> Don Park writes: [dpm] > > public interface AttributeValueHandler > > { > > public abstract void startEntity (String name) > > throws SAXException; > > public abstract void endEntity (String name) > > throws SAXException; > > public abstract void characters (char ch[], int start, int length) > > throws SAXException; > > } > > > > public interface AttributeValue2 extends AttributeValue > > { > > public abstract boolean isSpecified (String name); > > public abstract void accept (AttributeValueHandler handler) > > throws SAXException; > > } [Don] > I don't think event-based interface is appropriate for this > purpose. Why not introduce an interator or an array-like > interface? Perhaps -- personally, I'm a little annoyed at having to do this at all. XML messed up a little here by making attribute values too difficult to process. The problem is that even if you don't care about entity boundaries, the XML 1.0 REC requires reporting of any entities that are not expanded (in the case, for example, of a non-validating parser that hasn't read the declaration in the external DTD subset). As a result, in a literal reading of the spec, a fully-conformant XML 1.0 API can *never* treat attribute values simply as strings. 
SAX 1.0 does so, and no one has ever minded, but conformance is conformance...

All the best,

David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/

From simpson at polaris.net Wed Mar 24 00:39:06 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun 7 17:10:22 2004
Subject: IE5 and iso entities
In-Reply-To:
References: <004201be754e$00ddd730$3af96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <3.0.5.32.19990323193820.01539d60@nexus.polaris.net>

At 04:05 PM 3/23/99 -0500, Eric Hellman wrote:
>So I ask the list again: has anyone, anywhere, gotten IE5 to read ISO
>entity tables, or are we going to have to do entity substitution on the server
>side?

Yesterday on the XML-L list, John Robert Gardner (mailto: jgardner@blue.weeg.uiowa.edu) announced a demo that does that (among other things). His DTD is at:

http://www.uiowa.edu/~etd/tdm.dtd

The entities declared all point to Rick's .pen files stored at James Tauber's schema.net site. (A couple of the files had minor typos until last week, but Rick has since cleaned them up.)

The demonstration is at:

http://www.uiowa.edu/~etd/front.xml

and there's a discussion of concepts at:

http://www.uiowa.edu/~etd/

As far as I know, JohnG is not actually referencing the entities anywhere in the instance -- he included them in the DTD simply for completeness and possible future use. Nevertheless, IE5 doesn't choke on their declarations.

==========================================================
John E. Simpson          | The secret of eternal youth
simpson@polaris.net      | is arrested development.
http://www.flixml.org | -- Alice Roosevelt Longworth xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simpson at polaris.net Wed Mar 24 00:49:21 1999 From: simpson at polaris.net (John E. Simpson) Date: Mon Jun 7 17:10:22 2004 Subject: FWD: Announcement - World Wide Web Wrapper Factory (W4F) Message-ID: <3.0.5.32.19990323194855.01539b40@nexus.polaris.net> I received this announcement via e-mail yesterday. It may (or may not :) be of interest to xml-dev and xml-l subscribers. Contact information is at the foot of the announcement. [Disclaimer: I have no affiliation with the W4F product development group. My correspondent, previously unknown to me, just happened on my website. Apologies for the cross-posting to subscribers of both lists.] >----- Looking at the Web through XML glasses, using W4F ----- > >The World Wide Web Wrapper Factory (W4F) is a Java toolkit to >generate wrappers for HTML data sources. > >Version 1.03 offers a built-in declarative mapping to XML. >Using W4F it is now possible to easily specify the translation >of HTML pages into XML documents. Moreover, the specification >gives for free the DTD. > >W4F consists of a retrieval language to identify Web sources, a >declarative extraction language (HEL: HTML Extraction Language) >to express robust extraction rules and a mapping interface to >export the extracted information into some user-defined data- >structures (text, Java objects, XML, etc.). >The wrappers are generated as Java classes that can be used as is >or integrated into higher-level applications. 
>
>Version 1.03 provides some improved visual support to make the
>creation of wrappers easier and faster. In particular, the
>extraction of HTML can be done via a wysiwyg interface.
>
>The W4F toolkit comes as a Java package and can be downloaded from
>the W4F web site. It is free for non-commercial use.
>Various examples of running wrappers are also available for download
>from the web site.
>
>Web site:
>http://db.cis.upenn.edu/W4F
>
>Contacts:
>Arnaud Sahuguet
>Database Research Group, Univ. of Pennsylvania, PA, USA
>sahuguet@gradient.cis.upenn.edu
>http://www.cis.upenn.edu/~sahuguet
>
>Fabien Azavant
>École Nationale Supérieure des Télécommunications, Paris, France
>Fabien.Azavant@enst.fr
>http://www.stud.enst.fr/~azavant

==========================================================
John E. Simpson          | The secret of eternal youth
simpson@polaris.net      | is arrested development.
http://www.flixml.org    | -- Alice Roosevelt Longworth

From rja at arpsolutions.demon.co.uk Wed Mar 24 00:54:27 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun 7 17:10:22 2004
Subject: Keeping space XML -> XSL -> HTML
Message-ID: <001201be7590$916c3200$4a5eedc1@arp01>

Hi.

If an element within an XML file is marked to preserve spaces, would one expect the spaces to be lost during an XSL transformation to HTML? This seems to be the behaviour of IE5. Any ideas how to maintain the spaces?

Regards,

Richard.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Wed Mar 24 01:17:20 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:10:23 2004 Subject: XML complexity, namespaces (was WG) In-Reply-To: <14071.31774.199352.713952@localhost.localdomain>; from David Megginson on Tue, Mar 23, 1999 at 06:53:00AM -0500 References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org> <36F6D46A.FB33D473@allette.com.au> <14070.56970.50161.169467@localhost.localdomain> <19990323153036.A9794@io.mds.rmit.edu.au> <14071.31774.199352.713952@localhost.localdomain> Message-ID: <19990324121655.A29837@io.mds.rmit.edu.au> On Tue, Mar 23, 1999 at 06:53:00AM -0500, David Megginson wrote: > In his message, in a part that I'm not quoting (I do respond to > specific details below), Marcelo Cantos argues that it's not for us > to decide whether both full SGML and XML can co-exist, and I agree > -- I am simply predicting that the market might not find it > worthwhile to continue developing two standards that are > architecturally identical and differ even in the implementation > details only in nit-picky ways. Well, I guess we are all entitled to prognosticate. My own personal view is that SGML is useful enough (over and above XML) in enough serious systems that it will not go away in the foreseeable future (which, admittedly, isn't that long in this industry). 
> Choose an arbitrary number for the cost of continuing to develop two
> standards rather than one -- say, US$100M/year (if all of the big
> enterprise vendors have to develop, test, debug, document, support,
> and maintain both full SGML and XML versions of their software, as
> well as donate employees' time to committee work) and unaccountable
> additional hours of free time donated by OSS writers.

I personally doubt that the maintenance of two standards will have any noticeable impact on implementors. Our internal libraries are, for the most part, built and work nicely with either format. Furthermore, the major implementation effort involves the commonality, not the variability between the standards.

As for the perspective of the standards architect, I can't make any real judgements on how much work is involved there. I would, however, speculate that standards are driven by demand more than by economics.

> Do SGML-specific features like SHORTREFs, data attributes, and
> omissible tags sometimes make life simpler for implementors? Of
> course they do.
>
> Are the differences worth US$100M/year (or whatever number you
> pick)? I don't know, and the decision is not ours to make, but the
> market will figure it out soon enough. Whatever happens, there will
> certainly be money to be made from supporting the existing SGML
> installations, so there will be good justification for
> backwards-compatibility in some major tools.

And we still encounter new clients with new projects that are opting for SGML because XML doesn't satisfy their needs. The usual reason is having to deal with legacy data. But then one must ask: how soon do you think legacy data will go away?

I should point out, however, that I am not arguing that SGML will continue to dominate the market. I believe that XML will increase dramatically in use and will ultimately become the dominant player by a wide margin. What I disagree with is the notion that SGML has no future role to play and will not be supported.
> Now, on to the specific points... > > Marcelo Cantos writes: > > > > XML does nothing that SGML cannot do. > > > > When developing the TOC management system for our document > > fragmenting toolkit, we chose XML to represent the TOC. SGML was > > not an option, because we didn't know the content model in > > advance and couldn't build it automatically from the DTD's of the > > individual documents. > > > > Also, we couldn't use a homogeneous element tree with attributes, > > because we actually extracted structured content from the > > documents for insertion into the TOC (sure, we could have > > serialised the content into an SGML attribute, but that would > > have a been perverse and painful alternative to simply using > > XML). > > There are work-arounds that you could have used in SGML, such as > synthesised DTDs using ANY. Both SGML and XML *can* do this, but in > your case, XML makes it a little easier (as would WebSGML). The > differences are important to us, as SGML/XML implementors, but would > not really concern the architect of a large system except to the > point that they affected maintainability. But would you then seriously suggest that maintenance is not a significant component of a project's cost? Of course SGML can do it, but the question boils down to whether it's worth it. We, as implementors, consider it far more cost effective to maintain two standards (the cost is really quite minimal, IMHO) than to insist on one or the other. To say that SGML does everything XML does is ignoring the fact that implementation details really do matter. It is like saying that a spreadsheet can do everything a word processor can. Of course it can, but that's not the point. In any event, since the issue is whether XML will replace SGML, not vice-versa, the "XML does nothing that SGML cannot do" comment is a bit of a red herring. The latter statement is far more pertinent. > > > SGML does nothing that XML cannot do. 
> > > > On several occasions I have had to import textual information, > > and have been able treat the data as SGML with appropriate choice > > of shortrefs. > > > > With XML I would have been forced to write an intermediate > > translation layer and would have consequently lost the originals > > (or been forced to store the original and transformed document, > > or add the extra layer to every access). > > > > True, they are not always adequate for the job, but I certainly > > would not have happily forgone them in my project because they > > wouldn't have been useful in someone else's project! > > Or you could simply have defined a round-trip mapping -- > tab-delimited fields map to elements map back to > tab-delimited fields. You could also, with XML or SGML, point into > the original without altering it (HyTime provides good mechanisms > for doing that in SGML or XML). So what you are saying, effectively, is, why not add an extra layer, and use it on every access? I guess the simple answer is, I'd rather not. You are suggesting complicated solutions to something that was inherently simple to solve! Sure, we could have done all those things, and it would have dramatically increased the workload. We would have had to add that extra layer, or bring in additional technologies. Even from an abstract perspective, the solutions you are offering cannot, by any stretch of the imagination, be considered to fall under the "SGML does nothing that XML cannot do" premise. In reality they involve drawing on additional tools and technologies to make up a very real shortfall in XML's capabilities. This merely emphasises the fact that SGML and XML are _not_ the same thing. > Again, however, Marcello is writing about implementation details, > not about what SGML and XML are capable of representing in the > abstract. In this case, SGML makes life a little easier for a > *very* experienced designer under high-specialised circumstances. 
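The round-trip mapping suggested in the quoted text (tab-delimited fields to elements and back) might look roughly like the following Java sketch. The element names `record` and `field` and the class `TabRoundTrip` are hypothetical, chosen only for illustration; neither correspondent describes an actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical round-trip mapping between a tab-delimited line and a flat element form. */
class TabRoundTrip {
    // Escaped field content never contains '<', so [^<]* matches exactly one field.
    private static final Pattern FIELD = Pattern.compile("<field>([^<]*)</field>");

    /** Maps one tab-delimited line to <record><field>...</field>...</record>. */
    static String toXml(String line) {
        StringBuilder out = new StringBuilder("<record>");
        // Limit -1 keeps trailing empty fields so the mapping stays reversible.
        for (String field : line.split("\t", -1)) {
            out.append("<field>").append(escape(field)).append("</field>");
        }
        return out.append("</record>").toString();
    }

    /** Maps the element form produced by toXml back to the original line. */
    static String fromXml(String xml) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(xml);
        while (m.find()) {
            fields.add(unescape(m.group(1)));
        }
        return String.join("\t", fields);
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    private static String unescape(String s) {
        // &amp; is restored last so that "&amp;lt;" comes back as "&lt;", not "<".
        return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
    }
}
```

Even in this small form the sketch is the "extra layer" that the reply objects to: every access to the original data has to pass through toXml or fromXml.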
Actually, the tab-delimited stuff was one of the first problems I encountered when starting to use SGML. But such an answer would be something of a diversion. The real point is that SGML _is_ for experienced designers under highly specialised circumstances. If they aren't working under such circumstances, then by all means use XML (which is what most of our clients are, in fact, doing)! > Lexically, SGML and XML differ in minor ways; logically, they are > essentially identical. And I reiterate, implementation matters. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Wed Mar 24 01:23:20 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:23 2004 Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG)) Message-ID: <001c01be7595$36a14c70$64f96d8c@NT.JELLIFFE.COM.AU> From: David Megginson >You're modelling exactly the same information about the picture in >both -- data attributes provide an alternative mechanism for modelling >the information, but they do not allow you to represent anything that >you could not represent without them. Except that the SGML example gives the attributes as belonging to the thing pointed to (the entity) and not the particular invocation. In the XML version, there is nothing to say that the attributes belong to the thing pointed to rather than at the invocation. For example, take the common case of where the entity has a size (natural size) and the element also has size attributes (scaled size). 
Surely the equivalent XML to the SGML examples given is really:

(and perhaps the photo element should not be a simple link)

The information modelled does not only include the elements and attributes but also the structure, and the fact that an entity is labelled as an entity, which have different addressing rules. In the absence of XML having conventions for the last three attributes, I don't think one can say that one can model everything that SGML models using XML.

Rick Jelliffe

From murata at apsdc.ksp.fujixerox.co.jp Wed Mar 24 02:04:26 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:10:23 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <36F62D81.A623C0A2@w3.org>
Message-ID: <199903240153.AA00022@archlute.apsdc.ksp.fujixerox.co.jp>

XML requires a draconian approach. 100% interoperability for conformant implementations is most important. Very low interoperability for non-conformant implementations is acceptable.

As for HTML, users see corrupted documents when the browser chooses an incorrect encoding. Then, they can tell the correct encoding to the browser. Thus, it might not be a bad idea to provide 80% interoperability for conformant implementations and 50% interoperability for non-conformant implementations. The heuristics in HTML 4.0 are based on such an assumption, as I see it.

As for XML, recipients of XML might be programs or database systems. In the worst case, corrupted documents will contaminate the entire database. A single XML document on the WWW may destroy XML-aware search engines.
Hence, I believe that we need a draconian approach; we have to ensure 100% interoperability for conformant implementations. Ideally, it should be possible to point out non-conformant data and implementations. expat sometimes detects incorrect charsets.

HTTP/1.1 quite clearly says that the charset parameter is authoritative. If RFC 2376 had said something different, interoperability for conformant implementations would have been destroyed.

Chris Lilley wrote:
>
>
> MURATA Makoto wrote:
> >
> > I believe that IE 5.0 does not conform to RFC2376 (XML Media Types),
> > of which I am a co-author.
> >
> > As for the XML media type "text/xml", the charset parameter in the
> > MIME header is authoritative. Encoding declarations have to be ignored
> > so that transcoding is possible.
>
> So, if the file is saved to some local browser cache and then re-read,
> it may have no MIME header so the encoding declaration is then
> authoritative.

The same thing applies to HTML. The cache must have MIME headers as well.

> Why can't the transcoding proxy also rewrite the encoding declaration,
> since it is rewriting the file anyway? It is trivially easy to find,
> process, and change.

For security reasons, transcoding proxies should not rewrite documents. Moreover, if we mandate embedded encoding signatures for HTML, XML, CSS, etc., I18N of flat text will become impossible.

I had believed that there was a consensus in the W3C team and I am quite puzzled by your response. You might want to speak with Martin Duerst.

> I imagine that someone could take some generic charset-converting code
> and make an XML-aware transcoding servlet that rewrote the encoding
> declaration in about what, an hour? If someone does this, I will see
> about getting it included in the next Jigsaw version.

Please don't do that.

> > However, IE 5.0 appears to always ignore the charset parameter and use
> > the BOM or encoding declaration only. Therefore, IE 5.0 does not conform to
> > RFC 2376.
>
> Okay.
But does RFC 2376 conflict with the XML 1.0 Recommendation?

As John Cowan pointed out, it does not.

> > When the charset parameter is not specified, it is assumed to be US-ASCII.
>
> Wow. So, what this RFC says is that, when used in email and on HTTP, the
> encoding declaration is *always ignored*.

If the media type is text/xml, yes. As for application/xml, we use the procedure in Appendix F of XML 1.0.

> That is a pretty big change and, frankly IMHO, ill-advised.

Frankly, I am quite surprised that a W3C team member says such a thing in a public place after an RFC is published.

Chris Lilley wrote:
>
> Correction: if you are the *administrator* of an Apache server. One of
> the ways in which the Web has changed over the last 5 years is that the
> percentage of Web authors who also administer the site that they serve
> from has dropped from a substantial majority to an insignificant
> minority.

Are you aware of the "AddCharset" patch developed by W3C Keio? It allows casual users to configure Apache. Please contact Koga-san at W3C Keio (y-koga@ccs.mt.nec.co.jp).

Chris Lilley wrote:
> Please consider points 1 and 2 to be a defect report on RFC2376

These points are clearly in conflict with HTTP 1.1.

Cheers,

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

xml-dev: A list for W3C XML Developers.
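The charset rules argued over in this message can be summarised in a short Java sketch. It reflects my reading of the thread and of RFC 2376 rather than any normative code: an explicit charset parameter wins, a text/xml entity without one is treated as US-ASCII, and application/xml without one falls back to the autodetection procedure of XML 1.0 Appendix F. The class and method names are hypothetical, and the Content-Type parsing is deliberately simplistic.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: which charset governs an XML entity, following the rules discussed above. */
class XmlCharsetRule {
    /**
     * Returns the charset that must be used, or an empty Optional when the
     * receiver should apply XML 1.0 Appendix F (BOM / encoding declaration).
     */
    static Optional<String> effectiveCharset(String contentType) {
        String[] parts = contentType.split(";");
        String mediaType = parts[0].trim().toLowerCase(Locale.ROOT);

        // Look for an explicit charset parameter; when present it is authoritative.
        for (int i = 1; i < parts.length; i++) {
            String[] kv = parts[i].split("=", 2);
            if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("charset")) {
                return Optional.of(kv[1].trim().replace("\"", ""));
            }
        }
        // No parameter: text/xml defaults to US-ASCII, ignoring the encoding declaration.
        if (mediaType.equals("text/xml")) {
            return Optional.of("US-ASCII");
        }
        // application/xml (and anything else here): defer to in-document detection.
        return Optional.empty();
    }
}
```

Returning an empty Optional rather than a guessed name keeps the two procedures separate: the caller either has an externally imposed charset or must run the in-document detection itself.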
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From murata at apsdc.ksp.fujixerox.co.jp Wed Mar 24 02:28:30 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:10:23 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <36F74B26.21CF46EC@w3.org>
Message-ID: <199903240217.AA00023@archlute.apsdc.ksp.fujixerox.co.jp>

Chris Lilley wrote:
> > Unfortunately this is a side effect of the rules for the media type
> > "text/*", which says that the default value of "charset" is always US-ASCII.
>
> The default rules if no other rule is in place for a specific Media
> type. The registration for text/xml can override this behaviour if it
> wishes to.

HTTP/1.1 (RFC2068 and the latest "Draft Standard") quite clearly says:

   The "charset" parameter is used with some media types to define the character set (section 3.4) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character sets other than "ISO-8859-1" or its subsets MUST be labeled with an appropriate charset value.

Here, the default is 8859-1 ;-(

The latest I-D for RFC2376 also said that the default is 8859-1 when the XML document is being transmitted by HTTP. However, the IESG requested US-ASCII as the default.

> > IESG discussed the document today that defines the text/xml media type.
> > We note that it continues the practice of text/plain where the default
> > charset is iso-8859-1 if transported over HTTP, but us-ascii if
> > transported over SMTP.
> > > > This inconsistency was a result of a wide deployment of HTTP > > implementations that did not properly following the MIME spec. > > Having one media type which is used inconsistently between HTTP > > and SMTP is bad enough, but we don't want to continue this practice > > for new media types. Inconsistencies between HTTP and SMTP > > usage make it more difficult to gateway between HTTP and email, > > or to use HTTP to access email contents. > > > > We suggest to have the charset parameter default to US-ASCII regardless > > of transport, and strongly recommend that the parameter always be > > supplied by senders. (If the sender is unsure whether the charset > > is US-ASCII or ISO-8859-1, it can safely label it as ISO-8859-1, > > since the former is a subset of the latter). Chris Lilley wrote: > So, in consequence: example file such as the Chinese XML examples at > http://xml.ascc.net/xml/test/index.html (where each example is available > in > UTF-8, Big5 and GB2312, all correctly labelled in the XML encoding > declaration) are now sets of invalid XML files which are required to > produce a critical error because of the invalid byte sequences in what > is now described as a US-ASCII file? Yes. Conformant XML parsers must report a fatal error. This is great since non-conformant data can always be detected. Examples of conformant XML documents are available at: http://www.fxis.co.jp/DMS/sgml/xml/charset/ Cheers, Makoto Fuji Xerox Information Systems Tel: +81-44-812-7230 Fax: +81-44-812-7231 E-mail: murata@apsdc.ksp.fujixerox.co.jp xml-dev: A list for W3C XML Developers. 
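The consequence described above (a document labelled, by default, as US-ASCII must be rejected as soon as it contains other byte sequences) can be demonstrated with a strict decoder. The Java sketch below is an illustration only: `StrictDecode` and `decodesCleanly` are hypothetical names, and a real XML parser would raise a fatal error during parsing rather than run a separate check like this.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

/** Hypothetical check: do these bytes form a legal sequence in the declared charset? */
class StrictDecode {
    static boolean decodesCleanly(byte[] data, String charsetName) {
        try {
            Charset.forName(charsetName).newDecoder()
                    // REPORT makes the decoder throw instead of silently substituting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;   // this is the case a conformant parser must treat as fatal
        }
    }
}
```

Under this check a UTF-8, Big5 or GB2312 document containing Chinese text fails as US-ASCII, which is exactly why senders are urged always to supply the charset parameter.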
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Wed Mar 24 02:44:24 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:10:23 2004 Subject: DOM Implemetation in C? In-Reply-To: <004801be739f$4b5a5c80$17f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <000101be759e$f1c25540$0100007f@eps.inso.com> > There is a technical problem that CORBA IDL mappings do not > (as far as I can see) provide C mappings to let us know how to create > objects, but it seems that DOM (or, at least, DOM users) require object creation and > finalization. The DOM includes factory methods... xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Wed Mar 24 02:44:29 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:10:23 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 In-Reply-To: <005101be73a6$325a59e0$c8a8a8c0@thing1> Message-ID: <000201be759e$f318bd80$0100007f@eps.inso.com> > >Do we really need to know about CDATA sections > > Debatable perhaps, but supported by the DOM. (Anyone know why?) > But I'd really like to see better SAX/DOM integration, so Yes! CDATA sections *are* different from normal text, even if only because the author used them. 
Note the interface inheritance in the DOM that tries to hide the distinction for those that need not see it. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crey at dcd.abk.nec.co.jp Wed Mar 24 02:51:44 1999 From: crey at dcd.abk.nec.co.jp (Charlemagne L. Rey) Date: Mon Jun 7 17:10:23 2004 Subject: null document value Message-ID: <36F852B4.770BA4D4@dcd.abk.nec.co.jp> I got a small problem which bothers me a lot. I'm trying to parse an XML file and access its objects using tagname through a DOMParser and yield a null value for the Document. As of now, I got no idea. I attached the codes as well as the xml and dtd file for you to help me find out why. -- Charlemagne L. Rey +81-0471-856713 +81-0471-838227 crey@software.ntep.nec.co.jp NEC Corporation -------------- next part -------------- A non-text attachment was scrubbed... Name: XMLDOMParser.java Type: application/x-unknown-content-type-java_auto_file Size: 1441 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990324/753eaab5/XMLDOMParser.bin -------------- next part -------------- babeth BABETH ABREA babeth@software null Y archie ARCHIE YAP archie@ntep.nec.co.jp Hardware Design Y abijay FRED ABIJAY abijay@software null N nolan NOLAN BATHAN nolan@software null Y ana ANACORINA M. CAVITE ana@software SW Design Clerk Y glennj GLENN OGAPONG glennj@software null Y jbarba JORGE BARBA jbarba@software null Y din JAMES DIN din@hardware.ntep.nec.co.jp Hardware Design Y petchie PETCHIE ABADINAS abadinas@ntep.nec.co.jp Ang guwapang tisay. 
Y msdiez MADONNA SALOME DIEZ msdiez@software null Y fecerico ELLAMAE CERICO fecerico@software null Y rachel RACHEL AGNES ALFAFARA rachel@software null Y felix FELIX CANTAY felix@software null Y mike MICHAEL CO MANABAT manabat@software The Dojo Master Y arvin ARVIN SAGARINO arvin@software null Y jdala JOEY DALA jdala@software null Y esolis EDUARDO SOLIS esolis@software null Y ariadne ALYWIN POSTRERO ariadne@software null Y bautista NEIL BAUTISTA bautista@software null Y nelanie NELANIE LUTH NIMIS nelanie@software null Y jleones JEREMY LEONES jleones@software null Y jlauta JENNIFER LAUTA jlauta@software null Y epalen EMILIO PALEN epalen@software null Y alvin ALVIN FERNANDEZ alvin@software null Y noel NOEL CHING ALLOSA noel@software.ntep.nec.co.jp null Y ronnelm RONNEL MAGLASANG ronnelm@software null Y baumelt BAUMEL TANDOGON baumelt@software null Y ali AZALEAH TIANERO ali@software null N giovanni GIOVANNI BAUTISTA giovanni@ntep.nec.co.jp null Y emeree EMEREE CLAIRE SANCHEZ emeree@software null Y ajuan JUAN F. ABULENCIA, JR. ajuan@software gwapo, uyab ni Cecille Y rubie RUBIE LIM rubie@software null Y joseph JOSEPH ONG joseph@ntep.nec.co.jp null Y donna DONNA MARIE FRADES donna@software null Y neil NEIL AGUILAR neil@ntep.nec.co.jp Production Engineering Dep't. N vrocales VICTOR ROCALES vrocales@software null Y nlongjas NOEMI LONGJAS nlongjas@software null Y josephus JOSEPHUS PESIRLA josephus@software null Y jlumapas JASON LUMAPAS jlumapas@software null Y alegaspi ANGELITO LEGASPI alegaspi@software null Y chingie CHINGIE TANCAWAN chingie@software null Y lotlot LUTHGARDA PAYLADO lotlot@software null Y gladys GLADYS ZALDIVAR gladys@software null Y andrew ANDREW LACAYA andrew@software null Y fred FRED KINTANAR fred@software Assistant Manager Y gbaguia GLICERIO BAGUIA gbaguia@software null Y may MARIA GERMAINE GERMAN may@software null Y oflas FELIPE JUN OFLAS JR. 
oflas@ntep.nec.co.jp Hardware Design Department Y dodgie DODGIE DANOSOS dans@ntep.nec.co.jp Hardware Design Y jopee JOPEE CAMIGUE jopee@ntep.nec.co.jp Hardware Design Clerk Y guday NOEL GUDAY guday@ntep.nec.co.jp Hardware Design Y dondon DONDON MATARANAS dondon@ntep.nec.co.jp Hardware Design Department Y tina CRISTINA D. FAELDONEA cristy@ntep.nec.co.jp Hardware Design Clerk Y inad IRWIN ENAD inad@ntep.nec.co.jp Hardware Design Department Y mar MAR ARES mar@ntep.nec.co.jp Hardware Design Y ars RAYMUND ARCILLA ars@ntep.nec.co.jp Hardware Design Y nelson NELSON BRIONES nelson@ntep.nec.co.jp Hardware Design Department Y reneb RENE BAJARIAS reneb@ntep.nec.co.jp EDP Y mojal LEOPOLDO MOJAL mojal@ntep.nec.co.jp Hardware Design Y orson ORSON YU orson@ntep.nec.co.jp Hardware Design Department Y allan ALLAN FABIANA alanmf@ntep.nec.co.jp Production Control Y romy ROMEO HEYRANA romy@ntep.nec.co.jp Hardware Design Y maluenda ANTONIO MALUENDA maluenda@ntep.nec.co.jp null Y john JOHN DEXTER OMOLON john@software null Y robert ROBERT DALE MONTESCLAROS robert@software Intake date: 11/11/96 Y imaceda IVAN MACEDA imaceda@software Intake date: 11/12/96 Y pierre PIERRE ENRIQUEZ pierre@software Wala ko'y sure sa iyang e-mail Y jeff JEFFREY ALBARRACIN jeff@ntep.nec.co.jp Hardware Design Department Y salazar RICHARD SALAZAR salazar@ntep.nec.co.jp Hardware Design Y jerry JERMANDO RODRIGUEZ jerry@software null Y nonon LEANDRO FAELDONEA nonon@ntep.nec.co.jp Hardware Design Department Y markh MARK AGUILAR markh@hardware.ntep.nec.co.jp Hardware Design Y rsardon RINO SARDON rsardon@ntep.nec.co.jp Hardware Design Y chris CHRISTOPHER IDO chris@ntep.nec.co.jp Production Control Dep't. 
Y edward EDWARD edward@ntep.nec.co.jp Manufacturing Office N bobby ROMEO LONOY bobby@ntep.nec.co.jp Hardware Design Y jlim JOEL LIM jlim@software Intake date: April 7, 1997 Y idongon IKE DONGON idongon@software Intake date: April 7, 1997 Y jgumaroy JONATHAN GUMAROY jgumaroy@software intake date: April 7, 1997 Y etaladua EARL WILLIAM KHO TALADUA etaladua@software intake date: April 7, 1997 Y troldan ROLDAN TORIBIO troldan@software intake date: April 7, 1997 Y vjangus VICTOR JESUS ANGUS vjangus@software intake date: April 7, 1997 Y johnd JOHN JUN DORMITORIO johnd@software intake date: April 7, 1997 Y cabella CHRISTIAN ABELLA cabella@software intake date: April 7, 1997 Y henrison HENRISON SIA henrison@software intake date: April 7, 1997 Y sdeanna DEANNA SALMERON sdeanna@software intake date: April 7, 1997 Y njclark JOHN CLARK NALDOZA njclark@software intake date: April 7, 1997 Y ericp ERIC PIZON ericp@software.ntep.nec.co.jp intake date: April 7, 1997 Y mmedina MICHELLE MEDINA mmedina@software Intake date: April 7, 1997 Y josephb JOSEPH BENAVIDES josephb@software Intake date: April 7, 1997 Y jecal JOSE ELIAS CALDERON jecal@software Intake date: April 7, 1997 Y rvchua R. VICTORIA CHUA rvchua@ntep.nec.co.jp QC Engineer Y crey CHARLEMAGNE REY crey@software Intake date: April 7, 1997 Y gboston GERSON BOSTON gboston@ntep.nec.co.jp Hardware Design Engineer Y benl BENJAMIN LAHOY benl@ntep.nec.co.jp null Y ivy IVY PASCUA ipascua@ntep.nec.co.jp null N russel RUSSEL OBERIO roberio@software Intake date: November 4, 1997 Y acreer ALLAN CREER acreer@software Intake date: Nov. 4, 1997 Y ken KEN ERICK SARMAGO ken@software Intake date: Nov. 4, 1997 Y tanj JERALDINE TAN tanj@software.ntep.nec.co.jp Intake date: 11/11/97 Y garry GARRY M. (PC) na PC N beths MARIBETH SUICO beths@ntep.nec.co.jp Finance Department Y puppy EDMUND MOLINA emolina@ntep.nec.co.jp PE Department Y vincent VINCENT LADLAD vincentl@software Intake date: May 4, 1998 Y jun LEONARDO ARTIAGA JR. 
leonardo@software Intake date: May 4, 1998 Y grace GRACE CAGULANGAN gracec@software Intake date: May 4, 1998 Y jay JAY CAMINERO jayc@software Intake date: May 4, 1998 Y rcamus RIZZA CAMUS rcamus@software Intake date: May 4, 1998 Y cvanessa VANESSA CASTILLO cvanessa@software Intake date: May 4, 1998 Y ccayacap CELESTE CAYACAP ccayacap@software Intake date: May 4, 1998 Y mjdultra MARK ELJUNNE DULTRA mjdultra@software Intake date: May 4, 1998 Y fendaya FRANKLIN ENDAYA fendaya@software Intake date: May 4, 1998 Y nickson NICKSON IAN LEGASPI nicksonl@software Intake date: May 4, 1998 Y william WILLIAM GEORGE GO william@software Intake date: May 4, 1998 Y adrianr ADRIAN P. RESTAURO adrianr@software Intake date: May 4, 1998 Y charles CHARLES TORREJOS tcharles@software Intake date: May 4, 1998 Y jlee JENNYLEE UY jlee@software.ntep.nec.co.jp Intake date: May 4, 1998 Y emmanuel EMMANUEL VILLACERAN emmanv@software Intake date: May 4, 1998 Y sherwin SHERWIN GARCIA garcia@ntep.nec.co.jp MFG. Department N georgec GEORGE L. CORDERO georgec@software June 3, 1998 Y jubalde JERICO UBALDE jubalde@mailman.ntep.nec.co.jp Hardware Design Department Y fretsie FRETSIE SALAZAR fsalazar@ntep.nec.co.jp Production Engineering Dep't. Y reina HEIDI RAMAS reina@ntep.nec.co.jp EDP Y caudie CAUDI DISCIPULO caudie@ntep.nec.co.jp EDP Manager Y jmocol JOSEPH MARIE OCOL jmocol@ntps10 Intake date: October 1, 1998 Y llim LEO C. 
LIM llim@ntps10 Intake date: October 1, 1998 Y jjambata JAY JOHN AMBATA jjambata@ntps10 Intake date: November 6, 1998 Y daparici DEXTER APARICIO daparici@ntps10 Intake date: November 6, 1998 Y mguibone MIRIAM GUIBONE mguibone@ntps10 Intake date: November 06, 1998 Y jpllosa JOEL PATRICK LLOSA jpllosa@ntps10 Intake date: November 6, 1998 Y lamartin LILY ANN MARTIN lamartin@ntps0504 Intake date: November 6, 1998 Y mdmejora MARK DAVE MEJORADA mdmejora@ntps0504 Intake date: November 6, 1998 Y rroleda RYAN ROLEDA rroleda@ntps10 Intake date: November 6, 1998 Y christine CHRISTINE PENA chetpena@hotmail.com null Y rita RITA C. DEBALUCOS rita@software.ntep.nec.co.jp Software Design Clerk N -------------- next part -------------- From cbullard at hiwaay.net Wed Mar 24 04:08:06 1999 From: cbullard at hiwaay.net (Len Bullard) Date: Mon Jun 7 17:10:24 2004 Subject: Validation References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F15372.3FF36ABB@prescod.net> <19990321133612.D29582@io.mds.rmit.edu.au> Message-ID: <36F86472.7D1D@hiwaay.net> Marcelo Cantos wrote: > > Of course, none of the above discourse will eliminate the need for > discussion on what, exactly, is needed and how that need is to be > satisfied. As one colleague astutely pointed out to me, I am really > transforming the issue from "real validation" to "sufficient > validation". It would be a mistake, however, to conclude that this is > a trivial transformation in the statement of the problem. It diverts > the emphasis of the search markedly away from completeness and towards > practicality and useability (of course, completeness remains > desirable, it merely ceases to be a central goal). Not in disagreement. Still, DTDs play a role in expressing constraints that in some way, must be implementable and to some degree must be validated for a particular piece of content. 
Here is a different kind of schema from the VRML language. How would any/all of the DTDs/schemas proposed for XML be used to define this? Which if any are better?

Transform {
  eventIn      MFNode      addChildren
  eventIn      MFNode      removeChildren
  exposedField SFVec3f     center           0 0 0
  exposedField MFNode      children         [ ]
  exposedField SFRotation  rotation         0 0 1 0
  exposedField SFVec3f     scale            1 1 1
  exposedField SFRotation  scaleOrientation 0 0 1 0
  exposedField SFVec3f     translation      0 0 0
  field        SFVec3f     bboxCenter       0 0 0
  field        SFVec3f     bboxSize         -1 -1 -1
}

It isn't a trick question. Serious people are currently evaluating the suitability of XML for this. In this form, the declaration is quite compact. Right now, without datatypes and without event models, this *simple* node may be more than XML can describe.

Any takers?

len bullard

From cbullard at hiwaay.net Wed Mar 24 04:29:28 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun 7 17:10:24 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
 <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu>
 <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au>
 <36F68B71.5967A029@w3.org> <36F6D46A.FB33D473@allette.com.au>
 <14070.56970.50161.169467@localhost.localdomain>
 <36F70005.3D1F76D2@allette.com.au>
 <14071.29413.900398.832442@localhost.localdomain>
Message-ID: <36F86974.5593@hiwaay.net>

David Megginson wrote:
>
> As for standards bodies, I don't know.
> Perhaps XML will eventually
> migrate to an International Standards body of some sort -- who knows if
> the W3C will even exist in five years? -- or (and this might be
> preferable) the torch will pass to a new, better-constituted body that
> takes over both the W3C and IETF standards.

I think also that trend is already in motion as evidenced by the formal working agreements between ISO and various consortia including the W3C and Web3D. The ISO VRML97 standard started as a consortium standard which, when mature enough and for which working implementations could be demonstrated reliably, was forwarded to ISO for international standardization. That is a very healthy way to do this business. The W3C can stick closer to its charter of promoting technologies and specifications and spend less time on *standardization*.

This is not to say the W3C work is not worthy, but the focus of standardization often has legal tangles. When engineers practice law, you get poor law. When lawyers engineer, airplanes fall out of the sky. It's a matter of practice and focus.

The working agreements are like a wheel inside a wheel. The inner wheels (the consortia) can turn fast. The outer wheel (ISO) turns slower. In concert, events are notated smoothly.

XML won't supplant SGML. It won't have to. The same people I met building SGML are building XML. The community matters. The specs and standards are what we implement and agree on. Nothing more.

I miss Yuri. He understood that.

len

From lucio.piccoli at one2one.co.uk Wed Mar 24 08:34:11 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun 7 17:10:24 2004
Subject: XML conference
Message-ID: <360209e3.240299@smtpgate1.ONE2ONE.CO.UK>

Hi all,
I am seeking info on the 'XML One' conference planned for May 24 at Austin, TX. The web site is under construction so details are very limited.
Does anyone on this list intend to do something interesting at the conference?

adios

-lucio

---------------------------------------------------------------------
One2One                 LUCIO.PICCOLI@one2one.co.uk
Elstree Tower           tel : +44 181 214 3847
Elstree Way
Borehamwood             fax :+44 181 214 2325
LONDON WD6 1DT
__________ http://www.one2one.co.uk _____________

From lucio.piccoli at one2one.co.uk Wed Mar 24 10:39:05 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun 7 17:10:24 2004
Subject: XML conference
Message-ID: <36020bcc.240299@smtpgate1.ONE2ONE.CO.UK>

There seems to be confusion about my previous email. I'll attempt to rephrase my questions.

I am considering attending the 'XML One' conference planned for May 24 at Austin, TX. I am searching for details so I can convince my manager to pay for the conference fees.
However, the official conference web site is under construction. I guessed that most of the speakers would come from this interest group. So if anyone has info about the conference please let me know.

Thanks

-lucio

> Hi all,
> I am seeking info on the 'XML One' conference planned for May 24 at
> Austin, TX. The web site is under construction so details are
> very limited.
> Does anyone on this list intend to do something interesting at the
> conference?
>
>
> adios
>
> -lucio
>
> ---------------------------------------------------------------------
> One2One                 LUCIO.PICCOLI@one2one.co.uk
> Elstree Tower           tel : +44 181 214 3847
> Elstree Way
> Borehamwood             fax :+44 181 214 2325
> LONDON WD6 1DT
> __________ http://www.one2one.co.uk _____________

From david at megginson.com Wed Mar 24 13:59:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:24 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG))
In-Reply-To: <001c01be7595$36a14c70$64f96d8c@NT.JELLIFFE.COM.AU>
References: <001c01be7595$36a14c70$64f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <14072.59101.373961.418429@localhost.localdomain>

Rick Jelliffe writes:

 > Surely the equivalent XML to the SGML examples given is really:
 >
 >
 > height="200"
 > depth="16"
 > type="I point to some object called an NDATA entity"
 > content-model="I must be empty"
 > addressing="don't count me as an element when doing treeloc" />
 >
 >
 >
 > (and perhaps the photo element should not be simple link)
 >
 > The information modelled does not only include the elements and
 > attributes but also the structure, and the fact that an entity is
 > labelled as an entity, which have different addressing rules. In the
 > absence of XML having conventions for the last three attributes, I don't
 > think one can say that one can model everything that SGML models using
 > XML.

Rick, you're still pointing to implementation details rather than abstract modelling. Try to express the question in terms of the thing being modelled -- for example, at a project meeting, the system architect might ask the following question:

  Can SGML and XML both model a reference to a photograph, providing the width, height, and colour depth?

The answer, of course, is 'yes'. At this point, the system architect drops out of the discussion and starts playing Tetris on her Palm Pilot.
Next, someone asks whether there's a substantial difference in the time and cost for implementation and maintenance. Both SGML and XML can declare the object in a single place as an external NDATA entity or upon each reference as an HREF attribute, and both SGML and XML can provide the type information explicitly in a single place through a notation or upon each reference through a MIMETYPE attribute, or allow the application to determine the type through the transfer protocol (i.e. HTTP), file extension, magic patterns at the start, etc. However, the SGML guru points out that information about the graphic's size (if needed) can be expressed in SGML in a single place using data attributes when the entity is declared, while in XML it needs to be repeated in attributes for each reference. The XML zealot mentions that you could use a single XML element to model the photograph as an independent object, and a short and confusing debate ensues with the XML specialist and a couple of data-modelling specialists taking up one side, and the SGML guru and a couple of document-modelling specialists taking up the other, while the rest of the room falls into a stupor. Suddenly, the project manager jolts himself awake and asks what the disadvantage is to giving the size information for each reference rather than once in the declaration, and whether doing so will delay the project or cause serious headaches when the project migrates to V2 in the fall. The SGML guru declares that it's always better to maintain the information in a single place rather than repeating it, because if the information changes, it's all in one place and can be accessed easily. The XML zealot argues that you can do the same thing by treating the picture as a first-class object, modelled with elements. 
The SGML zealot cuts in and says that that will mess up HyTime addressing, and at the mention of the word 'HyTime' a sudden panic grips the room, until the XML zealot kindly points out that the same thing might apply to XPointer.

There is a second, brief religious war between the SGML guru and the XML zealot about whether the photograph is a declaration that belongs in the prolog or an object that should be modelled in the document element (again, the data-modelling specialists and the document-modelling specialists take sides), but the meeting is going on too long and it's becoming obvious to everyone except the SGML and XML people that the differences aren't really important enough to have a measurable effect on the project.

Just to be certain, though, the project manager cuts off the debate by asking how often the same picture will appear in a single document. The graphic designer mentions repeated graphical elements like specialised bullets, icons in the page headers, etc., but the SGML guru and the XML zealot both shout her down by saying that that kind of thing is handled by the stylesheet.

The system architect (who has finished Tetris) then declares that information about the photograph's size, type, etc. should be stored in a relational database, where it can be easily maintained and updated, and that the SGML or XML will simply contain a unique identifier that can be used to generate a primary key for a database lookup. Everyone else at the meeting except for the markup specialists nods vigorous agreement, the meeting breaks up, and they all rush for the coffee machine or washrooms, except for the SGML guru and the XML zealot -- they stay at the table, arguing whether the unique identifier for the database lookup should be a formal public identifier or a URI.....

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Wed Mar 24 14:00:10 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:24 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <000201be759e$f318bd80$0100007f@eps.inso.com>
References: <005101be73a6$325a59e0$c8a8a8c0@thing1>
 <000201be759e$f318bd80$0100007f@eps.inso.com>
Message-ID: <14072.61385.105692.234306@localhost.localdomain>

Gavin Thomas Nicol writes:

 > > >Do we really need to know about CDATA sections
 > >
 > > Debatable perhaps, but supported by the DOM. (Anyone know why?)
 > > But I'd really like to see better SAX/DOM integration, so Yes!
 >
 > CDATA sections *are* different from normal text, even if only
 > because the author used them. Note the interface inheritance in
 > the DOM that tries to hide the distinction for those that need
 > not see it.

By the same argument,

and

are different, because the author used them.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From eriblair at mediom.qc.ca Wed Mar 24 14:23:44 1999
From: eriblair at mediom.qc.ca (Éric Riblair)
Date: Mon Jun 7 17:10:24 2004
Subject: How to convert an XML file to an Access database ...
Message-ID: <01ba01be7602$7c3e6a70$1f9ccb84@grr.ulaval.ca>

Hello,

I would like to know the simplest way to import the information contained in an XML file into an Access database...

Thank you for your answers,

Regards,

Éric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990324/eccc1e1e/attachment.htm

From jatkins at Bluestone.com Wed Mar 24 15:59:44 1999
From: jatkins at Bluestone.com (Atkins, Jon)
Date: Mon Jun 7 17:10:24 2004
Subject: XML conference
Message-ID: <9A4DF69E3C5ED211B86400A0C9D1776095F0E2@thor.operations.bluestone.com>

XML One is being positioned as a comprehensive XML solutions and technology forum with 3 tracks and 30 sessions on the latest XML technology.

Bob Bickel, Sr. Vice President Products for Bluestone Software, Inc., will be giving a presentation entitled: Developing and deploying applications with Dynamic XML Servers. The presentation will take place on 5/27/99 at 4:00 pm and will look at what a dynamic XML server is and how to develop and deploy applications.
----Original Message-----
From: LUCIO PICOLLI [mailto:lucio.piccoli@one2one.co.uk]
Sent: Wednesday, March 24, 1999 3:29 AM
To: xml-dev@ic.ac.uk
Subject: XML conference

Hi all,
I am seeking info on the 'XML One' conference planned for May 24 at Austin, TX. The web site is under construction so details are very limited.
Does anyone on this list intend to do something interesting at the conference?

adios

-lucio

---------------------------------------------------------------------
One2One                 LUCIO.PICCOLI@one2one.co.uk
Elstree Tower           tel : +44 181 214 3847
Elstree Way
Borehamwood             fax :+44 181 214 2325
LONDON WD6 1DT
__________ http://www.one2one.co.uk _____________

From b.laforge at jxml.com Wed Mar 24 16:46:57 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:10:24 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <001901be7616$8e4caba0$c8a8a8c0@thing1>

From: Gavin Thomas Nicol
>CDATA sections *are* different from normal text, even if only
>because the author used them.

Again, is anyone aware of why CDATA is preserved by the DOM? What was the reasoning behind this decision?
Other things, like whitespace within an element tag or even attribute order, are not preserved. Why then was CDATA?

Bill

From b.laforge at jxml.com Wed Mar 24 16:51:40 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:10:24 2004
Subject: DOM CDATA vs Normalization
Message-ID: <002001be7617$2fc33440$c8a8a8c0@thing1>

Normalization of an element combines various text objects into a single text object. Does it then merge text and CDATA objects to a single object? And what about ignorable whitespace?

Bill

From BPosert at filenet.com Wed Mar 24 17:08:15 1999
From: BPosert at filenet.com (Posert, Bob)
Date: Mon Jun 7 17:10:24 2004
Subject: SQL queries expressed in XML
Message-ID:

You might want to take a look at the WebDAV-related DASL "Distributed Searching And Locating" working group page at http://www.ics.uci.edu/pub/ietf/dasl

From their charter:

Working Group Scope

A generalized search mechanism is a broad problem space. It encompasses a variety of object models, typing schemes, and media.
By focusing on a subset of this space, the problem of locating resources based on property values and text content, the working group will leverage much of the existing work that has been done on querying under simple property and resource models.

In-Scope items include:
- typing
- comparisons (>, >=, <, <=, !=, ==)
- internationalized content
- text content matching
- dealing with arbitrary XML values

Out-of-scope items include:
- definitions of well-known properties
- server-to-server communication protocols
- cross-language comparisons
- searching for non-text content (images, video, audio, etc.)
- client control of server administration (e.g. indexing)

--Bob

From paul.janssens at skynet.be Wed Mar 24 17:10:28 1999
From: paul.janssens at skynet.be (JPA)
Date: Mon Jun 7 17:10:24 2004
Subject: XML convertor generator
Message-ID: <36F91BC9.4310@skynet.be>

Hello,

I'm currently working on an xml convertor-generator. When finished, the tool will, if you take the bother to type the structure of your input format and mappings on entities and attributes, construct a convertor.

There's no documentation as yet, and some stuff missing (escaping, for one thing), but if there's enough interest I'll put it on a website as is.
Paul Janssens - paul.janssens@skynet.be

Here's an example of what the tool actually does:

1) sample input (Your legacy data here)

ACCEPT x FROM y
ACCEPT t FROM z END_ACCEPT
ACCEPT t FROM z AT LINE 23
ACCEPT t FROM z AT COLUMN NUMBER 5
ACCEPT t FROM z AT COLUMN NUMBER col
ACCEPT t FROM z ON EXCEPTION BUMMER
ACCEPT t FROM z NOT ON EXCEPTION OK

2) syntax file (The stuff the user has to type.)

TOKEN identifier '[A-Za-z_][A-Za-z_0-9]*';
TOKEN number '[+-]?[0-9]+';

acceptstatements:
  (
    ACCEPT
    (identifier % )/acceptdestination
    (FROM identifier % )/acceptsource ?
    (
      AT (!LINE|!COL|!COLUMN) # measure
      NUMBER? (identifier|number) %
    )/acceptposition ?
    ? ?
    END_ACCEPT?
  )/acceptstatement *;

onexception: (ON EXCEPTION BUMMER);
notonexception: (NOT ON EXCEPTION OK);

3) convertor output

x y t z t z 23 t z 5 t z col t z t z

From rbourret at ito.tu-darmstadt.de Wed Mar 24 17:32:22 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:10:24 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <01BE7624.7A324E50@grappa.ito.tu-darmstadt.de>

Bill la Forge wrote:

> From: Gavin Thomas Nicol
> >CDATA sections *are* different from normal text, even if only
> >because the author used them.
>
> Again, is anyone aware of why CDATA is preserved by the DOM?
> What was the reasoning behind this decision? Other things, like
> whitespace within an element tag or even attribute order, are not preserved.
> Why then was CDATA?

I can't say why the DOM included CDATA, but I'll hazard a guess and agree with Gavin.
If I'm using a CDATA section, it means that I really, really, really don't want what's in the section to be parsed and it would be a royal pain for me if it was. (Think about writing an HTML tutorial.)

The obvious place where preservation of CDATA is important, then, is when I'm co-authoring a document with a friend who uses a DOM-based editor while I prefer a text editor. If every time my friend edits the document all the CDATA sections get wiped out, neither our friendship nor our co-authorship is going to last very long.

This is quite different from whitespace in element tags and attribute order, which are more aesthetic concerns than practical ones. I might be a bit annoyed if my friend's editor rearranges these, but I am unlikely to go looking for new partners because of it.

-- Ron Bourret

From lauren at sqwest.bc.ca Wed Mar 24 17:47:42 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <001901be7616$8e4caba0$c8a8a8c0@thing1>
Message-ID: <36F9250B.B4F747E0@sqwest.bc.ca>

Bill la Forge wrote:
>
> From: Gavin Thomas Nicol
> >CDATA sections *are* different from normal text, even if only
> >because the author used them.
>
> Again, is anyone aware of why CDATA is preserved by the DOM?
> What was the reasoning behind this decision?
Gavin summed it up quite well - the author used a CDATA Section and
may have attached some semantic meaning to it (I know that several
people disagree that CDATA sections can have semantic meaning;
others think they can) so the DOM doesn't throw away that
distinction, just in case.

Lauren


From lauren at sqwest.bc.ca  Wed Mar 24 17:52:01 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:10:25 2004
Subject: DOM CDATA vs Normalization
References: <002001be7617$2fc33440$c8a8a8c0@thing1>
Message-ID: <36F92629.1D989E9E@sqwest.bc.ca>

Bill la Forge wrote:
>
> Normalization of an element combines various text objects into a single
> text object. Does it then merge text and CDATA objects to a single object?
> And what about ignorable whitespace?

Normalization merges only adjoining Text nodes, regardless of their
content. It does not merge Text nodes with CDATA Section nodes, comments
or PIs.

You will notice in the latest draft of the DOM Level 2, at
http://www.w3.org/TR/WD-DOM-Level-2/, that one of the items on the list
of issues to be addressed in Level 2 is Conversion of a CDATASection
node to a TEXT node. This could include merging adjacent Text and CDATA
Section nodes.

Might I suggest this discussion take place on the public DOM mailing
list? www-dom@w3.org, to subscribe send email to www-dom-request@w3.org
with the subject line "subscribe".

regards,

Lauren


From lauren at sqwest.bc.ca  Wed Mar 24 17:55:09 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <005101be73a6$325a59e0$c8a8a8c0@thing1> <000201be759e$f318bd80$0100007f@eps.inso.com> <14072.61385.105692.234306@localhost.localdomain>
Message-ID: <36F926D5.D1CEE889@sqwest.bc.ca>

David Megginson wrote:
>
> Gavin Thomas Nicol writes:
>
>  > > >Do we really need to know about CDATA sections
>  > >
>  > > Debatable perhaps, but supported by the DOM. (Anyone know why?)
>  > > But I'd really like to see better SAX/DOM integration, so Yes!
>  >
>  > CDATA sections *are* different from normal text, even if only
>  > because the author used them.  Note the interface inheritance in
>  > the DOM that tries to hide the distinction for those that need
>  > not see it.
>
> By the same argument,
>
>   x="1">
>
> and
>
> are different, because the author used them.

I haven't heard anyone argue that the whitespace can have semantic
meaning, whereas I have heard it about CDATA sections. (Note that I do
not necessarily agree that CDATA sections can have semantic meaning,
simply that some people think they do.)

Lauren


From donpark at quake.net  Wed Mar 24 18:14:34 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <00dd01be7622$13368ba0$2ee044c6@arcot-main>

>The problem is that even if you don't care about entity boundaries,
>the XML 1.0 REC requires reporting of any entities that are not
>expanded (in the case, for example, of a non-validating parser that
>hasn't read the declaration in the external DTD subset).  As a result,
>in a literal reading of the spec, a fully-conformant XML 1.0 API can
>*never* treat attribute values simply as strings.  SAX 1.0 does so,
>and no one has ever minded, but conformance is conformance...

The XML REC uses the word 'report' a lot but wisely does get into what
reporting means.  I think that as long as the information is available
on-demand through one mechanism or another, we can consider the reporting
requirement met.

Don


From paul at prescod.net  Wed Mar 24 18:21:15 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:25 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity,namespaces (was WG))
References: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU> <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie> <14071.33102.817935.724149@localhost.localdomain>
Message-ID: <36F91A38.53F4C78@prescod.net>

David Megginson wrote:
>
> Sean Mc Grath writes:
>
>  > This is an interesting thread.  Many non-tag-minimization
>  > reliables can be put forth as things that SGML "can do" that
>  > XML cannot. Things like data attributes, exclusion exceptions,
>  > internal SDATA entities and so on.
>
> I think that I agree with what Sean is saying here and later in the
> message -- think of *what* you can represent rather than *how* you
> represent it.

That representation alone isn't good enough -- standardization is also
important. Here's what I heard Sean saying:

 * SGML favours globally standardized declarations over locally
   maintained custom code.

 * XML restricts the number of globally standardized declarations in
   favor of locally maintained custom code.

In other words: SGML favours standardization and XML favours one-off
system-specific ad-hocery. If I really believed that then I would drop
XML and advise my customers not to use it.

XML removed certain specific declarative features of SGML that were
either not used enough or could be added in at another level. But little
by little XML is becoming more and more declarative through other layers
like XLink, XSL, RDF and XML Schemas.
The move towards declarativeness and away from ad hoc code is precisely
XML's gift to the Web. Standardized declarativeness is the real XML
revolution. XML just happens to be the syntax.

Let me demonstrate that XML is also standard declaration-focused by
turning around the "notations" example. The SGML way was to declare a
notation and have a second level validate that the data adhered to the
notation. Unfortunately, we never really standardized a decent
declarative syntax for the second level. In other words, SGML was not
declarative *enough*.

XML, on the other hand, will likely have a mechanism where notations can
be declared in the schema (under the title of "user-defined data types").
So the XML family will have a more powerful, standardized, declarative
mechanism which will reduce the need for maintaining custom code. The
declarativeness baton has been passed from SGML to XML.

Custom code is the enemy. We will always need it but we must continue to
relegate it to more and more complex or esoteric problems.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html


From donpark at quake.net  Wed Mar 24 18:31:15 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <001401be7624$668dc5a0$2ee044c6@arcot-main>

>The XML REC uses the word 'report' a lot but wisely does get into what
>reporting means.  I think that as long as the information is available
>on-demand through one mechanism or another, we can consider the reporting
>requirement met.

OOPS.  I meant to say that REC does NOT explain what reporting means.

Don


From rbourret at ito.tu-darmstadt.de  Wed Mar 24 18:40:05 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de>

Lauren Wood wrote:

> Gavin summed it up quite well - the author used a CDATA Section and
> may have attached some semantic meaning to it (I know that several
> people disagree that CDATA sections can have semantic meaning;
> others think they can) so the DOM doesn't throw away that
> distinction, just in case.

I'm having trouble imagining how a CDATA section can have semantic
meaning in all but the most abusive ways.
(Hmmm, there's a CDATA section. Fire up the pizza delivery DLL.) Could
you give an example? Thanks.

-- Ron Bourret


From david at megginson.com  Wed Mar 24 19:27:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <001901be7616$8e4caba0$c8a8a8c0@thing1>
References: <001901be7616$8e4caba0$c8a8a8c0@thing1>
Message-ID: <14073.15473.885207.805699@localhost.localdomain>

Bill la Forge writes:

 > Again, is anyone aware of why CDATA is preserved by the DOM?
 > What was the reasoning behind this decision? Other things, like
 > whitespace within an element tag or even attribute order, are not preserved.
 > Why then was CDATA?

I would guess that the DOM WG believed that users of XML editors and
repositories would want to see CDATA section boundaries and comments
survive a round trip in and out of the tools.  Personally, I am
extremely skeptical, but I have heard this argument many times from
the employees of the vendors themselves.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/


From david at megginson.com  Wed Mar 24 19:37:28 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
In-Reply-To: <01BE7624.7A324E50@grappa.ito.tu-darmstadt.de>
References: <01BE7624.7A324E50@grappa.ito.tu-darmstadt.de>
Message-ID: <14073.15571.764196.447559@localhost.localdomain>

Ronald Bourret writes:

 > The obvious place where preservation of CDATA is important, then,
 > is when I'm co-authoring a document with a friend who uses a
 > DOM-based editor while I prefer a text editor. If every time my
 > friend edits the document all the CDATA sections get wiped out,
 > neither our friendship nor our co-authorship are going to last very
 > long.

Yes, but there would be easier ways to handle this.  Let's say, for
example, that you consistently use the following in your text editor:

  <example><![CDATA[<s>This is literal XML markup used as an example</s>]]></example>

Now, if CDATA boundaries were discarded, when your friend loaded this
into her DOM-based editor and then saved it again, you would see
something like the following:

  <example>&lt;s&gt;This is literal XML markup used as an example.&lt;/s&gt;</example>

If this kind of thing does matter (as it probably would to you),
perhaps your friend could configure her editor to select certain
element types that would always have their content CDATA escaped on
export (nearly every document type has only one or two candidates,
such as HTML ).
Even if your friend's editor didn't support that, nearly anyone on
this list could hack together a Perl or Java program in about 15
minutes that would allow you to do something like

  xml-cdata-escape mydoc.xml example > mydoc2.xml

Voila, your CDATA is back!  Of course, there are a few situations
where people use CDATA less predictably, but I hardly believe that the
requirement would survive a real cost-benefit analysis if the DOM WG
had made one.
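
[A minimal sketch of such a filter, written in Python with the standard
xml.dom.minidom module rather than the Perl or Java mentioned above. The
function name cdata_escape is hypothetical, and the sketch assumes the
named elements contain only character data; elements with child
elements, comments or PIs are left untouched.]

```python
from xml.dom import minidom


def cdata_escape(xml_text, element_name):
    """Re-wrap the text content of every <element_name> in CDATA sections."""
    doc = minidom.parseString(xml_text)
    text_types = (doc.TEXT_NODE, doc.CDATA_SECTION_NODE)
    for elem in doc.getElementsByTagName(element_name):
        children = list(elem.childNodes)
        # Skip empty elements and elements with non-text children.
        if not children or any(c.nodeType not in text_types for c in children):
            continue
        text = "".join(c.data for c in children)
        for child in children:
            elem.removeChild(child)
        # "]]>" cannot appear inside a CDATA section, so split it in two:
        # one section ends with "]]" and the next one starts with ">".
        pieces = text.split("]]>")
        for i, piece in enumerate(pieces):
            chunk = piece
            if i > 0:
                chunk = ">" + chunk
            if i < len(pieces) - 1:
                chunk = chunk + "]]"
            if chunk:
                elem.appendChild(doc.createCDATASection(chunk))
    return doc.toxml()


# Roughly what "xml-cdata-escape mydoc.xml example > mydoc2.xml" would do:
#   with open("mydoc.xml", "rb") as source:
#       print(cdata_escape(source.read(), "example"))
```

Note that this goes through a DOM itself, so anything else the parser
discards (attribute order, whitespace inside tags) is not preserved.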


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 24 19:40:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <00dd01be7622$13368ba0$2ee044c6@arcot-main>
References: <00dd01be7622$13368ba0$2ee044c6@arcot-main>
Message-ID: <14073.16162.923732.438041@localhost.localdomain>

Don Park writes:

 > >The problem is that even if you don't care about entity boundaries,
 > >the XML 1.0 REC requires reporting of any entities that are not
 > >expanded (in the case, for example, of a non-validating parser that
 > >hasn't read the declaration in the external DTD subset).  As a result,
 > >in a literal reading of the spec, a fully-conformant XML 1.0 API can
 > >*never* treat attribute values simply as strings.  SAX 1.0 does so,
 > >and no one has ever minded, but conformance is conformance...
 > 
 > The XML REC uses the word 'report' a lot but wisely does get into what
 > reporting means.  I think that as long as the information is available
 > on-demand through one mechanism or another, we can consider the reporting
 > requirement met.

Yes, I agree -- we *can* provide the attribute value as a string, but
we also have to make the alternative representation available in case
in 10 or 20 years someone actually needs it.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From pgrosso at arbortext.com  Wed Mar 24 19:48:17 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <3.0.32.19990324134646.00d152cc@pophost.arbortext.com>

At 14:28 1999 03 24 -0500, David Megginson wrote:
>Bill la Forge writes:
> > Again, is anyone aware of why CDATA is preserved by the DOM?
> > What was the reasoning behind this decision? Other things, like
> > whitespace within an element tag or even attribute order, are not preserved.
> > Why then was CDATA? 
>
>I would guess that the DOM WG believed that users of XML editors and
>repositories would want to see CDATA section boundaries and comments
>survive a round trip in and out of the tools.  Personally, I am
>extremely skeptical, but I have heard this argument many times from
>the employees of the vendors themselves.

As such a vendor, I hear this from our customers.  

When authoring a document, the user may want to know there
is a region into which s/he can paste stuff containing < and & 
characters and know they won't be interpreted as markup.  True,
the editing application can magically escape them (e.g., &lt;)
as part of the paste operation, but what if the user is using
Notepad to copy a parsable XML example into an XML document? 
Having to escape the special characters destroys the ability
to have that data remain parsable/validatable at the same time
as embedded in the larger document, and that destroys an important 
reuse/multipurpose feature otherwise available in XML.  (Think
of a dynamic XML document that allows you to "verify as well-formed"
the content of any  element in your tutorial document.)
 
The point is that the user-author inserted the CDATA section for 
a reason, and they might well want it to stay there.
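
[A minimal sketch of the "verify as well-formed" idea, assuming Python's
standard xml.dom.minidom module. The function name and the element name
"example" are hypothetical choices for illustration, not something any
particular editing product provides.]

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def examples_well_formed(document_text, element_name="example"):
    """Return one boolean per <element_name>: does its text parse as XML?"""
    doc = minidom.parseString(document_text)
    text_types = (doc.TEXT_NODE, doc.CDATA_SECTION_NODE)
    results = []
    for elem in doc.getElementsByTagName(element_name):
        # The CDATA section keeps the embedded markup as plain text, so it
        # can be handed to a second parse without any unescaping step.
        embedded = "".join(
            c.data for c in elem.childNodes if c.nodeType in text_types
        )
        try:
            minidom.parseString(embedded.strip())
            results.append(True)
        except ExpatError:
            results.append(False)
    return results
```

The same check works on entity-escaped content once it has been parsed;
what the CDATA section buys is that the author can paste the example in
and copy it out again unchanged.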



From rbourret at ito.tu-darmstadt.de  Wed Mar 24 20:04:30 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:25 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
Message-ID: <01BE7639.C333D500@grappa.ito.tu-darmstadt.de>

David Megginson writes:

> If this kind of thing does matter (as it probably would to you),
> perhaps your friend could configure her editor to select certain
> element types that would always have their content CDATA escaped on
> export (nearly every document type has only one or two candidates,
> such as HTML ).
>
> Even if your friend's editor didn't support that, nearly anyone on
> this list could hack together a Perl or Java program in about 15
> minutes that would allow you to do something like
>
>   xml-cdata-escape mydoc.xml example > mydoc2.xml

I buy the first argument (seems like a reasonable feature of an XML editor) 
but not the second. Most users of such systems are unlikely to be able to 
hack together such a program and they may or may not have a friendly 
programmer to whom they can turn.

-- Ron Bourret




From david at megginson.com  Wed Mar 24 20:33:15 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
In-Reply-To: <01BE7639.C333D500@grappa.ito.tu-darmstadt.de>
References: <01BE7639.C333D500@grappa.ito.tu-darmstadt.de>
Message-ID: <14073.19387.229257.98559@localhost.localdomain>

Ronald Bourret writes:

 [snip]

 > > Even if your friend's editor didn't support that, nearly anyone
 > > on this list could hack together a Perl or Java program in about
 > > 15 minutes that would allow you to do something like
 > >
 > >   xml-cdata-escape mydoc.xml example > mydoc2.xml
 > 
 > I buy the first argument (seems like a reasonable feature of an XML
 > editor) but not the second. Most users of such systems are unlikely
 > to be able to hack together such a program and they may or may not
 > have a friendly programmer to whom they can turn.

Ah, yes, but people who edit their XML in a text editor probably would 
be capable of starting a command-line application.  What I'm
suggesting is that *something* had to be written -- either the simple
filter or full CDATA support for all DOM applications.  Full CDATA
support won (love it or leave it), but the other might have been a
little easier.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From rbourret at ito.tu-darmstadt.de  Wed Mar 24 21:14:20 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:25 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
Message-ID: <01BE7643.888FE290@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> Ah, yes, but people who edit their XML in a text editor probably would
> be capable of starting a command-line application.  What I'm
> suggesting is that *something* had to be written -- either the simple
> filter or full CDATA support for all DOM applications.  Full CDATA
> support won (love it or leave it), but the other might have been a
> little easier.

Hmmm.  If those are the choices, I vote for putting it in.  Forcing all 
companies to write a quick little application and forcing all users to run 
a quick little application seems far more onerous than forcing DOM 
programmers to work around this.

I wasn't even going to reply, but then I remembered that the real question 
here is whether SAX (not the DOM) should tell people about CDATA sections.
I think the answer is yes.  Unlike the DOM, where people not interested in 
CDATA sections still have to work around them, SAX applications that are 
not interested in CDATA sections simply have null implementations of 
start/endCDATA.

The only drawback I see is that applications not interested in CDATA 
sections are forced to suffer through three calls to 
DocumentHandler.characters -- before, during, and after the CDATA section.
The application can use a filter to solve this, of course, but it's still 
likely to be a source of application errors.  (Depending on how parsers 
implement LexicalHandler callbacks, this could happen even if the 
application doesn't register a LexicalHandler implementation.  Does the 
property requesting a single call to characters() apply in this case?  It 
ought to.)
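
[A minimal sketch of the filter idea, using Python's standard xml.sax
module rather than the Java SAX draft under discussion; Python exposes
the same startCDATA/endCDATA callbacks through its lexical-handler
property. The class and function names are hypothetical. The handler
simply joins whatever chunks characters() delivers, so it makes no
assumption about how many calls the parser issues around a CDATA
section.]

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class TextCollector(ContentHandler):
    """Collects character data and, separately, notes CDATA boundaries."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.cdata_events = []

    def characters(self, content):
        # May be called several times for one run of text.
        self.chunks.append(content)

    # Lexical callbacks. An application that does not care about CDATA
    # sections can leave startCDATA/endCDATA empty, like the others here.
    def startCDATA(self):
        self.cdata_events.append("start")

    def endCDATA(self):
        self.cdata_events.append("end")

    def comment(self, content):
        pass

    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass

    @property
    def text(self):
        return "".join(self.chunks)


def collect_text(xml_bytes):
    handler = TextCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.setProperty(property_lexical_handler, handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler
```

For example, collect_text(b"<p>before <![CDATA[<raw>]]> after</p>").text
gives the merged string regardless of where the section boundaries fell.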

-- Ron




From david at megginson.com  Wed Mar 24 22:02:15 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: XML and (K)Office
Message-ID: <14073.23919.284210.386065@localhost.localdomain>

Now that the press is (prematurely) declaring Microsoft's imminent
demise [1], perhaps we can stop worrying about MS Office's XML support
(why hitch your cart to an allegedly dying horse?) and look at Linux.

For those of you who don't know, the current incarnation of the
emacs/vi or sh/csh religious battle (or perhaps SGML/XML) in the Linux
world is KDE vs. Gnome as the desktop manager.  I'm in the Gnome camp,
so it is with mixed feelings that I draw attention to the fact that
KOffice for KDE (see article [2]) uses XML-based save formats for
*all* of its applications (word processor, spreadsheet, formula
designer, presentation manager, etc. etc.).

To be fair, the main Gnome spreadsheet also uses a (no doubt
incompatible) XML-based save format.

There's also a hot rumour [3] that Microsoft has assigned 37
programmers to work on a Linux port of MS Office.

The place to go for this kind of stuff is slashdot.org, which is heavy 
on the hacker look-and-feel.


All the best,


David

[1] http://www.cnn.com/TECH/computing/9903/24/mslinux.html/ (and many
    others) 
[2] http://www.mieterra.com/article/koffice.html
[3] http://www.heise.de/newsticker/data/cp-19.03.99-000/ (auf Deutsch)

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jamesr at steptwo.com.au  Wed Mar 24 22:12:26 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:25 2004
Subject: How to convert an XML file to an Access database ...
In-Reply-To: <01ba01be7602$7c3e6a70$1f9ccb84@grr.ulaval.ca>
Message-ID: <4.1.19990325090956.00bda4c0@steptwo.com.au>

At 00:27 25/03/1999, Éric Riblair wrote:
>
> Hello,
>
> I would like to know the simplest way to import the information contained in
> an XML file into an Access database...
>
> Thank you for your answers,
>
> Regards,
>
> Éric


Omnimark and its ODBC libraries?

Cheers,

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From jamesr at steptwo.com.au  Wed Mar 24 22:16:02 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:25 2004
Subject: XML convertor generator
In-Reply-To: <36F91BC9.4310@skynet.be>
Message-ID: <4.1.19990325091311.00ba01c0@steptwo.com.au>

At 03:08 25/03/1999 , JPA wrote:
  | Hello,
  | 
  | I'm currently working on an xml convertor-generator. When finished, the
  | tool will, if you take the bother to type the structure of your input
  | format and mappings on entities and attributes, construct a convertor. 
  | There's no documentation as yet, and some stuff missing (escaping, for
  | one thing), but if there's enough interest I'll put it on a website as
  | is.
  | 
  | 
  | Paul Janssens - paul.janssens@skynet.be

Paul,

Not wishing to rain on your parade, but aren't
you re-inventing the wheel here?

Will your solution do anything that Perl or
Omnimark can't already do?

(Just trying to save you a lot of time.)

Cheers,

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From Daniel.Veillard at w3.org  Wed Mar 24 22:21:03 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun  7 17:10:26 2004
Subject: XML and (K)Office
In-Reply-To: <14073.23919.284210.386065@localhost.localdomain>; from David Megginson on Wed, Mar 24, 1999 at 05:02:34PM -0500
References: <14073.23919.284210.386065@localhost.localdomain>
Message-ID: <19990324172040.I21831@w3.org>

> For those of you who don't know, the current incarnation of the
> emacs/vi or sh/csh religious battle (or perhaps SGML/XML) in the Linux
> world is KDE vs. Gnome as the desktop manager.  I'm in the Gnome camp,

  No KDE/Gnome war here, please. However, I'm one of the Gnome developers.

> so it is with mixed feelings that I draw attention to the fact that
> KOffice for KDE (see article [2]) uses XML-based save formats for
> *all* of its applications (word processor, spreadsheet, formula
> designer, presentation manager, etc. etc.).

  A large number of Gnome apps are also using XML or moving to XML formats.
A very good example is glade, the GTK application builder, which saves
its state as an XML file.
  
> To be fair, the main Gnome spreadsheet also uses a (no doubt
> incompatible) XML-based save format.

  Yep, Gnumeric uses XML (actually gzipped XML on disk) and uses namespaces.
I'm pretty sure it's incompatible, since when I coded the XML export I didn't
know that KDE would do likewise. There is a virtual cookie reward for the first
person to send me a good example of a KDE spreadsheet XML file :-) . A clean DTD
would be even better.

Daniel

-- 
	    [Yes, I have moved back to France !]
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux, WWW, rpmfind,
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | rpm2html, XML,
http://www.w3.org/People/W3Cpeople.html#Veillard | badminton, and Kaffe.



From Williams.Arumainathan at fmr.com  Wed Mar 24 22:47:47 1999
From: Williams.Arumainathan at fmr.com (Arumainathan, Williams)
Date: Mon Jun  7 17:10:26 2004
Subject: XML Documents
Message-ID: 

Hi, 
I have just started learning XML. Can you suggest some good websites,
please?

Thank you,
Williams.

> -----Original Message-----
> From:	James Robertson [SMTP:jamesr@steptwo.com.au]
> Sent:	Wednesday, March 24, 1999 6:15 PM
> To:	xml-dev@ic.ac.uk
> Subject:	Re: XML convertor generator
> 
> At 03:08 25/03/1999 , JPA wrote:
>   | Hello,
>   | 
>   | I'm currently working on an xml convertor-generator. When finished, the
>   | tool will, if you take the bother to type the structure of your input
>   | format and mappings on entities and attributes, construct a convertor.
>   | There's no documentation as yet, and some stuff missing (escaping, for
>   | one thing), but if there's enough interest I'll put it on a website as
>   | is.
>   | 
>   | 
>   | Paul Janssens - paul.janssens@skynet.be
> 
> Paul,
> 
> Not wishing to rain on your parade, but aren't
> you re-inventing the wheel here?
> 
> Will your solution do anything that Perl or
> Omnimark can't already do?
> 
> (Just trying to save you a lot of time.)
> 
> Cheers,
> 
> James
> 
> 
> -------------------------
> James Robertson
> Step Two Designs Pty Ltd
> SGML, XML & HTML Consultancy
> http://www.steptwo.com.au/
> jamesr@steptwo.com.au
> 
> "Beyond the Idea"
>  ACN 081 019 623
> 



From mvidal at umiacs.umd.edu  Wed Mar 24 22:51:36 1999
From: mvidal at umiacs.umd.edu (Maria Esther Vidal)
Date: Mon Jun  7 17:10:26 2004
Subject: XML DTD to relational 
Message-ID: <199903242251.RAA12467@loomba.umiacs.umd.edu>

Hello,

I would like to know if there is a Java library that creates a
relational schema from an XML DTD or a Java library that parses
an XML DTD?

Many thanks,

Maria Esther Vidal



From paul at prescod.net  Wed Mar 24 22:53:19 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:26 2004
Subject: XML and (K)Office
References: <14073.23919.284210.386065@localhost.localdomain>
Message-ID: <36F9689A.CEDA6214@prescod.net>

David Megginson wrote:
> 
> For those of you who don't know, the current incarnation of the
> emacs/vi or sh/csh religious battle (or perhaps SGML/XML) in the Linux
> world is KDE vs. Gnome as the desktop manager.  I'm in the Gnome camp,
> so it is with mixed feelings that I draw attention to the fact that
> KOffice for KDE (see article [2]) uses XML-based save formats for
> *all* of its applications (word processor, spreadsheet, formula
> designer, presentation manager, etc. etc.).

Note that other standards in use in KOffice include CORBA, and
Linuxdoc/SGML (for KDE documentation). These guys obviously have a
standards focus.

Probably not coincidentally they use Python for scripting and formulas.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From macherius at darmstadt.gmd.de  Wed Mar 24 22:57:37 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun  7 17:10:26 2004
Subject: XML and (K)Office
In-Reply-To: <14073.23919.284210.386065@localhost.localdomain>
Message-ID: <199903242256.XAA04448@sonne.darmstadt.gmd.de>

David Megginson  wrote at 24 Mar 99, 17:02:

> There's also a hot rumour [3] that Microsoft has assigned 37
> programmers to work on a Linux port of MS Office.

I was very surprised by David Megginson's note that MS is porting
Office. In the hope of not infringing copyrights, here is a partial
translation of the "Heise newsticker" article. c't can be considered
one of (if not the) leading computer magazines in German.

	++im

--- snip --
Rumours have been around for a while, but for the first time there are
indications: Microsoft is porting their popular Office suite to
Linux. c't [1] was told on good authority that a
project has been formed in Redmond. According to the source, there are 37
developers working on the port of Office to Linux.

It's expected that Microsoft will announce the activity during
CeBIT [2] and will give a time schedule for completion. [...]

[1] http://www.heise.de/ct/
[2] http://www.cebit.de/
--- snip ---


--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)



From david at megginson.com  Wed Mar 24 22:59:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:26 2004
Subject: XML and (K)Office
In-Reply-To: <19990324172040.I21831@w3.org>
References: <14073.23919.284210.386065@localhost.localdomain>
	<19990324172040.I21831@w3.org>
Message-ID: <14073.28055.304975.377332@localhost.localdomain>

Daniel Veillard writes:

 >   Yep, Gnumeric uses XML (actually gzipped XML on disk) and uses
 > namespaces.  I'm pretty sure it's incompatible, since when I coded
 > the XML export I didn't know that KDE would do likewise. There is a
 > virtual cookie reward for the first person to send me a good example
 > of a KDE spreadsheet XML file :-) . A clean DTD would be even better.

Doesn't the recipe for virtual cookies come with the Gnu Emacs
distribution?

Anyway, let's get this right -- I think that it's healthy for
Gnumeric and the KOffice Spreadsheet program both to exist, but there
is no excuse for them to use entirely incompatible formats.  As a
matter of fact, if we could convince KDE and Gnome to use compatible
XML formats for lots of things (like interface construction), the
media's predictions of a Linux fracture will be proven to be hot air.

Do the Gnome and KDE people talk to each other much?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 24 23:00:52 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:26 2004
Subject: How to convert an XML file to an Access database ...
In-Reply-To: <4.1.19990325090956.00bda4c0@steptwo.com.au>
References: <01ba01be7602$7c3e6a70$1f9ccb84@grr.ulaval.ca>
	<4.1.19990325090956.00bda4c0@steptwo.com.au>
Message-ID: <14073.28257.44465.534584@localhost.localdomain>

James Robertson writes:

 > Omnimark and its ODBC libraries?

Or Perl, or Java, or (probably) Python.  There are lots of choices,
free and commercial.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From paul at prescod.net  Wed Mar 24 23:12:38 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:26 2004
Subject: XML conference
References: <36020bcc.240299@smtpgate1.ONE2ONE.CO.UK>
Message-ID: <36F96F3C.4A498FBB@prescod.net>

LUCIO PICOLLI wrote:
> 
> I am considering attending the 'XML One' conference planned for May 24 at
> Austin, TX. I am searching for details so I can convince my manager to
> pay for the conference fees. However the official conference web site is
> under construction. I guessed that most of the speakers would come from
> this interest group. So if anyone has info about the conference please
> let me know.

I will be speaking at XML One on Python/XML and also on object/XML
bridging.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From sharris at primus.com  Thu Mar 25 00:31:28 1999
From: sharris at primus.com (Steve Harris)
Date: Mon Jun  7 17:10:26 2004
Subject: Architectures capability question - attribute values to element GI?
Message-ID: <228F2F40E87CD211ABA20008C7B13C5A133D95@EXCHANGE1>

Is it possible to use Architectures to map an attribute value to an
element GI in the target architecture (that is, to 'dynamically specify'
the architectural form)? This desire is in reverse to the common example
usage of the 'renamer-att' architecture support attribute. I have seen
this idea kicked around in various discussions, but cannot find any
documentation or examples to back up the claim that it's possible.
  The desired transformation would be from


bar
gorp

  to


bar
gorp

Is this really a job for something like DSSSL or XSL? I'd like to find
the limits of what Architectures can do. Please advise.


Steven E. Harris
Software Engineer
PRIMUS



From jamesr at steptwo.com.au  Thu Mar 25 00:38:15 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:26 2004
Subject: XML Documents
In-Reply-To: 
Message-ID: <4.1.19990325113516.00bc69e0@steptwo.com.au>

At 08:46 25/03/1999 , Arumainathan, Williams wrote:

  | Hi, 
  | I have just started learning XML. Can you suggest good website names
  | please
  | 
  | Thank you,
  | Williams.
  | 
  | > -----Original Message-----
  | > From:	James Robertson [SMTP:jamesr@steptwo.com.au]
  | > Sent:	Wednesday, March 24, 1999 6:15 PM
  | > To:	xml-dev@ic.ac.uk
  | > Subject:	Re: XML convertor generator
  | > 
  | > At 03:08 25/03/1999 , JPA wrote:
  | >   | Hello,
  | >   | 
  | >   | I'm currently working on an xml convertor-generator. When finished, the
  | >   | tool will, if you take the bother to type the structure of your input
  | >   | format and mappings on entities and attributes, construct a convertor.
  | >   | There's no documentation as yet, and some stuff missing (escaping, for
  | >   | one thing), but if there's enough interest I'll put it on a website as
  | >   | is.
  | >   | 

Well, regarding websites for XML conversion tools:

Omnimark is easy: www.omnimark.com (have a look at OmnimarkLE in particular).

Anyone have some good sites for Perl and Python with respect
to XML conversion?

Hope this helps,

James

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From cowan at locke.ccil.org  Thu Mar 25 01:17:03 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:26 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de> from "Ronald Bourret" at Mar 24, 99 07:38:47 pm
Message-ID: <199903250221.VAA18169@locke.ccil.org>

Ronald Bourret scripsit:

> I'm having trouble imagining how a CDATA section can have semantic meaning 
> in all but the most abusive ways.  (Hmmm, there's a CDATA section.  Fire up 
> the pizza delivery DLL.)  Could you give an example?  Thanks.

For one thing, a CDATA section can contain only characters present in the
repertoire of the current encoding (no character references).  Some
people may depend on this property.

(I think this example is weak myself, but it *has* come up.)
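
The property is easy to observe with any conforming parser. A small Python
sketch using the standard library's ElementTree (the element name "a" is
arbitrary): a character reference in ordinary content is expanded, while
the same characters inside a CDATA section stay literal text.

```python
import xml.etree.ElementTree as ET

# In ordinary content the parser expands the character reference.
plain = ET.fromstring("<a>&#233;</a>").text

# Inside a CDATA section the same six characters are literal text, so a
# character can only appear there if it is typed directly in the
# document's encoding.
cdata = ET.fromstring("<a><![CDATA[&#233;]]></a>").text
```

Here `plain` is the single character U+00E9, and `cdata` is the six-character
string `&#233;`.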

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From smo at jst.com.au  Thu Mar 25 01:23:06 1999
From: smo at jst.com.au (Steve Oldmeadow)
Date: Mon Jun  7 17:10:26 2004
Subject: XML convertor generator
Message-ID: <004701be765d$c2007ac0$02c809c0@stimpy>


-----Original Message-----
From: James Robertson 
To: xml-dev@ic.ac.uk 
Date: 25/03/1999 06:38
Subject: Re: XML convertor generator


>At 03:08 25/03/1999 , JPA wrote:
>  | Hello,
>  |
>  | I'm currently working on an xml convertor-generator. When finished, the
>  | tool will, if you take the bother to type the structure of your input
>  | format and mappings on entities and attributes, construct a convertor.
>  | There's no documentation as yet, and some stuff missing (escaping, for
>  | one thing), but if there's enough interest I'll put it on a website as
>  | is.
>  |
>  |
>  | Paul Janssens - paul.janssens@skynet.be
>
>Paul,
>
>Not wishing to rain on your parade, but aren't
>you re-inventing the wheel here?
>
>Will your solution do anything that Perl or
>Omnimark can't already do?


With that sort of attitude XML would never have gotten off the ground.  Perl
and OmniMark???  You must be a masochist.

In reply to the original post: Paul, I would be interested if you are making
the source available and it is in Java.

Steve Oldmeadow
Justice Systems Technologies




From Ed at dega.com  Thu Mar 25 03:25:38 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun  7 17:10:27 2004
Subject: Whence XQL?
Message-ID: <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.dega.com>

Ok, so now it is April (or thereabouts) and still no XQL. I've read all the
hypeware about this and I understand that it's just a suggestion for a
proposal for a note for a draft for a recommendation. Whatever.
I want my XQL!

Seriously, all ranting aside, I haven't seen any talk here or in XSL
listserv land about it yet (recently). The proposal seems complete enough to
me for someone to have at least announced a beta implementation of it. I'd
be happy with mostly unfinished code if it were written in Java.

So does anybody have a clue about this? I know about XSL so please don't
send me down that path. I also know about the Datachannel attempt. 

If not, I'm tempted to write something myself. I'm not sure about a couple
of things because it's still just off the top of my head so far.

Ok, it's in Java. It uses some free XML parser, probably XML4J because it's
the one I'm most familiar with. The XQL syntax parser will be written in
ANTLR, since it outputs nice O-O Java classes. The result set of XQL is
well-formed XML. This is handled easily because any node in any XML4J
tree (or transformed sub-tree) can print itself as XML to any stream. XML4J
has a nice getNodesByName() method that can operate at any level of the tree
returning a NodeList of siblings with that tag name. Wrapping a result tree
in  and iterating the NodeList gets you the
simplest query.
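
A rough sketch of that simplest query, written in Python with the standard
library's minidom rather than XML4J, so none of the calls below are XML4J's
API. The wrapper element name "result" is a placeholder, and note that the
DOM's getElementsByTagName() returns all matching descendants, not only
siblings.

```python
from xml.dom.minidom import parseString

def query_by_name(xml_text, tag_name, wrapper="result"):
    """Return well-formed XML holding every element named tag_name.

    `wrapper` is a hypothetical element name chosen for this sketch.
    """
    source = parseString(xml_text)
    result = parseString(f"<{wrapper}/>")
    for node in source.getElementsByTagName(tag_name):
        # importNode(node, True) deep-copies the subtree into the
        # result document, so the result set is itself a DOM tree.
        result.documentElement.appendChild(result.importNode(node, True))
    return result.documentElement.toxml()

doc = "<lib><book>A</book><cd>B</cd><book>C</book></lib>"
out = query_by_name(doc, "book")
```

For the sample document, `out` is
`<result><book>A</book><book>C</book></result>`. If matching elements can
nest inside each other, this naive version copies the inner ones twice.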

Internally the result set is just another DOM tree so you should be able to
add the .jar file to your Java app and thus satisfy that type of XQL result.
The input can be done in a variety of ways. I assume that the Perl module
XML::XQL can be used in a CGI context to extract the XQL query, execute it
and return either XML or XSL transformed output to the calling app(browser.)
Likewise, a Java servlet could do the same thing.

Cons: XML4J doesn't yet handle PIs, so it's maybe not the overall best
solution. (I may be wrong about this, IBM uploaded a new major release that
may have fixed it.) It's just the one I'm comfortable with, at the moment. On
my hard drive are XML parsers from Sun, Microsoft, Oracle, James Clark and
one or two others I haven't had time to play with yet.

I don't care about efficiency or optimization. All partially created result
sets will live in memory till they are ready to be output. I also don't care
about searching multiple files, although that should be relatively easy to
add. (I'm still confused about XML repositories. Would XQL have to understand
directory paths? Does XQL need to be able to follow XLinks?)

I'm leaving out sequences but I may add them in (much) later. Return values
(analogous to SQL's SELECT) are important to my application, as are
conditionals.
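
To make "return values" and "conditionals" concrete, here is a small Python
sketch built on the limited XPath subset in the standard library's
ElementTree. It does not use XQL syntax; the function name, its parameters
and the sample vocabulary are all invented for illustration.

```python
import xml.etree.ElementTree as ET

def select(xml_text, element, where_attr, equals, return_child):
    """Roughly: SELECT return_child FROM element WHERE where_attr = equals."""
    root = ET.fromstring(xml_text)
    # The [@attr='value'] predicate plays the role of the conditional.
    matches = root.findall(f".//{element}[@{where_attr}='{equals}']")
    # Only the requested child's text is returned, not the whole element.
    return [m.findtext(return_child) for m in matches]

doc = """<lib>
  <book lang="en"><title>One</title></book>
  <book lang="fr"><title>Deux</title></book>
  <book lang="en"><title>Three</title></book>
</lib>"""
titles = select(doc, "book", "lang", "en", "title")
```

With the sample document, `titles` is `["One", "Three"]`.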

Unless someone warns me that I'm clueless (which is usually the case), I'll
post a cut of the ANTLR grammar as soon as I get a working one. I'll
probably put it on my web site.

Ed


Ed Howland
ed@dega.com
http://www.dega.com 
"As your attorney, I advise you to take some adrenalchrome"




From david at megginson.com  Thu Mar 25 03:39:22 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:27 2004
Subject: XML and (K)Office
In-Reply-To: <36F9689A.CEDA6214@prescod.net>
References: <14073.23919.284210.386065@localhost.localdomain>
	<36F9689A.CEDA6214@prescod.net>
Message-ID: <14073.44996.847735.659024@localhost.localdomain>

Paul Prescod writes:

 > Note that other standards in use in KOffice include CORBA, and
 > Linuxdoc/SGML (for KDE documentation). These guys obviously have a
 > standards focus.

Gnome uses DocBook (!!!) and CORBA.

 > Probably not coincidentally they use Python for scripting and formulas.

I'll try not to hold that against them.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jamesr at steptwo.com.au  Thu Mar 25 03:41:28 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:27 2004
Subject: XML convertor generator
In-Reply-To: <004701be765d$c2007ac0$02c809c0@stimpy>
Message-ID: <4.1.19990325143624.00ca5d60@steptwo.com.au>

At 11:21 25/03/1999 , Steve Oldmeadow wrote:

  | >At 03:08 25/03/1999 , JPA wrote:
  | >  | Hello,
  | >  |
  | >  | I'm currently working on an xml convertor-generator. When finished, the
  | >  | tool will, if you take the bother to type the structure of your input
  | >  | format and mappings on entities and attributes, construct a convertor.
  | >  | There's no documentation as yet, and some stuff missing (escaping, for
  | >  | one thing), but if there's enough interest I'll put it on a website as
  | >  | is.
  | >  |
  | >  |
  | >  | Paul Janssens - paul.janssens@skynet.be
  | >
  | >Paul,
  | >
  | >Not wishing to rain on your parade, but aren't
  | >you re-inventing the wheel here?
  | >
  | >Will your solution do anything that Perl or
  | >Omnimark can't already do?
  | 
  | With that sort of attitude XML would never have gotten off the ground.  Perl
  | and OmniMark???  You must be a masochist.

Why?

I can think of two situations:

1. You want to develop a new conversion tool, either for
   the kudos, or for the money. If so, go for it. 

   But be warned, conversion tools need to be powerful in order
   to be useful (I should know, I spend most of my life
   converting to and from SGML/XML).

2. You have a practical problem to solve that involves converting
   files to XML. 

   If so, why on earth wouldn't you use existing off-the-shelf tools
   to do the work? Especially if they are freely available.

Now, from the original e-mail, I assumed case 2, but I
could be wrong.

Cheers,

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From clark.evans at manhattanproject.com  Thu Mar 25 03:53:32 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun  7 17:10:27 2004
Subject: XML convertor generator
References: <4.1.19990325091311.00ba01c0@steptwo.com.au>
Message-ID: <36F9B224.CE017C45@manhattanproject.com>

James Robertson wrote:
| At 03:08 25/03/1999 , JPA wrote:
| | Hello,
| |
| | I'm currently working on an xml convertor-generator. When finished, the
| | tool will, if you take the bother to type the structure of your input
| | format and mappings on entities and attributes, construct a convertor.
| | There's no documentation as yet, and some stuff missing (escaping, for
| | one thing), but if there's enough interest I'll put it on a website as is.
| |
| | Paul Janssens - paul.janssens@skynet.be
| 
| Paul,
| 
| Not wishing to rain on your parade, but aren't
| you re-inventing the wheel here?

Actually, a program which created an efficient
program to convert XML conforming to a specific
DTD to another product would be a very cool 
invention, very different from using Perl 
and/or Omnimark.

I have Omnimark programs which take a great
deal of processing power (I'd hate to see the 
Perl equivalent).  Cutting it in half with a 
program that generated a program would be 
very cool indeed.   What kind of 'efficiencies'
do you get when you remove the interpreted layer?

I'm reading this that you are more or less
doing a YACC thing?  Is this a correct
interpretation?  Will it do SGML?
(I guess I can run it through nsgmls 
to make the XML equivalent first.)
Is it open source?  Hopefully it 
will generate C code (for speed).

Clark Evans



From jamesr at steptwo.com.au  Thu Mar 25 04:25:22 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:27 2004
Subject: XML convertor generator
In-Reply-To: <36F9B224.CE017C45@manhattanproject.com>
References: <4.1.19990325091311.00ba01c0@steptwo.com.au>
Message-ID: <4.1.19990325151909.00baac70@steptwo.com.au>

At 13:48 25/03/1999 , Clark Evans wrote:

  | James Robertson wrote:
  | | At 03:08 25/03/1999 , JPA wrote:
  | | | Hello,
  | | |
  | | | I'm currently working on an xml convertor-generator. When finished, the
  | | | tool will, if you take the bother to type the structure of your input
  | | | format and mappings on entities and attributes, construct a convertor.
  | | | There's no documentation as yet, and some stuff missing (escaping, for
  | | | one thing), but if there's enough interest I'll put it on a website as is.
  | | |
  | | | Paul Janssens - paul.janssens@skynet.be
  | | 
  | | Paul,
  | | 
  | | Not wishing to rain on your parade, but aren't
  | | you re-inventing the wheel here?
  | 
  | Actually, a program which created an efficient
  | program to convert XML conforming to a specific
  | DTD to another product would be a very cool 
  | invention, very different from using Perl 
  | and/or Omnimark.

This _would_ be useful.

However, to be useful, it would have to support:

* Regular expressions.
* Complex data types, especially things like hash
  tables.
* Some form of "reference"-like lookahead.
* Context-sensitive code based on the current
  SGML state.

These are the things that I use every day.
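
As a small illustration of the first two items, here is a Python sketch of a
legacy-to-XML conversion driven by a regular expression and a hash table.
The "XX: value" input format, the field codes and the element names are all
assumptions made up for this example.

```python
import re
from xml.sax.saxutils import escape

# Hash table mapping legacy field codes to target element names (made up).
FIELD_TO_ELEMENT = {"TI": "title", "AU": "author", "YR": "year"}

# Assumed legacy line format: a two-letter code, a colon, then the value.
LINE = re.compile(r"^(?P<field>[A-Z]{2}):\s*(?P<value>.*)$")

def convert(legacy_text):
    """Convert 'XX: value' lines into a flat <record> element."""
    parts = ["<record>"]
    for line in legacy_text.splitlines():
        match = LINE.match(line)
        if not match or match.group("field") not in FIELD_TO_ELEMENT:
            continue  # skip anything the mapping does not cover
        name = FIELD_TO_ELEMENT[match.group("field")]
        # escape() replaces &, < and > so the output stays well-formed.
        parts.append(f"<{name}>{escape(match.group('value'))}</{name}>")
    parts.append("</record>")
    return "".join(parts)

xml_out = convert("TI: Cats & Dogs\nAU: J. Smith\nZZ: ignored\n")
```

Here `xml_out` is
`<record><title>Cats &amp; Dogs</title><author>J. Smith</author></record>`.
Lookahead and state-dependent rules are where a sketch like this stops
being enough.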

Converting from legacy (or as I have recently
heard it called, "heritage") data to XML is
not simple. If the source is very consistent,
you're fine. 

Otherwise, it's always a struggle, in which
you use every tool in your toolbox.

  | I have Omnimark programs which take a great
  | deal of processing power (I'd hate to see the 
  | Perl equivalent).  Cutting it in half with a 
  | program that generated a program would be 
  | very cool indeed.   What kind of 'efficiencies'
  | do you get when you remove the interpreted layer?

Omnimark is actually pretty good. On the basis
of the speeds reported on this mailing list, I
would rate it quite fast, especially on large
data sets.

But of course, if you're doing a complex
conversion, then your code is going to be
slow. Fact of life.

  | I'm reading this that you are more or less
  | doing a YACC thing?  Is this a correct
  | interpretation?  Will it do SGML?
  | (I guess I can run it through nsgmls 
  | to make the XML equivalent first.)
  | Is it open source?  Hopefully it 
  | will generate C code (for speed).

A YACC-like tool would be way cool.

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From srn at techno.com  Thu Mar 25 05:07:43 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:27 2004
Subject: Architectures capability question - attribute values to element GI?
In-Reply-To: <228F2F40E87CD211ABA20008C7B13C5A133D95@EXCHANGE1> (message from
	Steve Harris on Wed, 24 Mar 1999 16:24:25 -0800)
References: <228F2F40E87CD211ABA20008C7B13C5A133D95@EXCHANGE1>
Message-ID: <199903250453.WAA00944@bruno.techno.com>

[Steve Harris:]

> Is it possible to use Architectures to map an attribute value to an
> element GI in the target architecture (that is, to 'dynamically
> specify' the architectural form)? This desire is in reverse to the
> common example usage of the 'renamer-att' architecture support
> attribute. I have seen this idea kicked around in various
> discussions, but cannot find any documentation or examples to back
> up the claim that it's possible.  The desired transformation would
> be from

> 
> bar
> gorp
> 
>   to
> 
> 
> bar
> gorp
> 
> Is this really a job for something like DSSSL or XSL? I'd like to find
> the limits of what Architectures can do. Please advise.

This example happens to be an especially natural case for the use of
an inheritable architecture.  No renaming attribute is required.  Let
me rename your "type" attribute to "orlando", and provide an "orlando
architecture meta-DTD" to make things clearer.

The orlando architecture (a meta-DTD):









bar
gorp




bar
gorp
 

The value of the orlando attribute is, in effect, "the element type
name for orlando purposes".  This is the most fundamental thing to
know about how inheritable architectures work.
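
The effect can be imitated in a few lines of Python. This is only a sketch
of the idea, not an architecture engine: it renames each element to the
value of its "orlando" attribute. The element names "doc" and "item" and the
value "baz" are invented for the example; "orlando", "foo", "bar" and "gorp"
come from the discussion above.

```python
import xml.etree.ElementTree as ET

def architectural_instance(xml_text, form_att="orlando"):
    """Rename each element to the value of its form_att attribute.

    Elements without the attribute keep their own name here, which is a
    simplification made for this sketch.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Remove the attribute and use its value as the element type name.
        form = elem.attrib.pop(form_att, None)
        if form is not None:
            elem.tag = form
    return ET.tostring(root, encoding="unicode")

doc = '<doc><item orlando="foo">bar</item><item orlando="baz">gorp</item></doc>'
out = architectural_instance(doc)
```

For the sample input, `out` is `<doc><foo>bar</foo><baz>gorp</baz></doc>`:
two elements with the same name in the document become different element
types in orlando terms.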

It occurs to me, on account of your use of the phrase "dynamically
specify", that maybe you're asking something more subtle, which I
would rephrase as follows:

  "Does the value of the architectural form name attribute have to be
  #FIXED in the DTD?"

The answer is "No."  There doesn't even have to be a DTD.  If you can
make it be #FIXED in the DTD, or at least default it in the DTD, that
can save a lot of markup from having to be specified in the instance,
because you won't have to say, e.g. "orlando=foo" in every 
tag.  Even if there is a DTD, there is no requirement that the DTD's
GIs ("generic identifiers" or "element type names") correspond
consistently with the GIs of any of the meta-DTDs that are being
inherited.  So, it's perfectly OK for a particular  element to
be, in orlando terms, a , and, even in the same document, for
another  to be a  in orlando terms.  (It may seem a bit
odd, but it does happen.  In fact, there's one place in the Topic Maps
architecture (which is about to be an ISO standard, BTW) where it
happens: a  architectural form in the Topic Maps architecture
is sometimes a  and other times a  in the HyTime
architecture.)
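
One way to picture the mapping described above is the short Python sketch
below. It is not an architecture engine, only a rough model of the name
mapping: each element is seen, for orlando purposes, under the name given by
its orlando attribute. The element names (list, item, note), the form names
(foo, baz) and the helper architectural_view are all assumptions made up for
illustration; only the attribute name orlando and the sample text (bar, gorp)
come from the message.

```python
import xml.etree.ElementTree as ET

# Hypothetical client document. Both elements share one GI ("item"), yet each
# names a different architectural form through its "orlando" attribute.
CLIENT_DOC = """
<list>
  <item orlando="foo">bar</item>
  <item orlando="baz">gorp</item>
  <note>not part of the orlando view</note>
</list>
""".strip()


def architectural_view(root, form_att="orlando"):
    """Return (architectural element type name, text) pairs.

    In this simplified model, elements without the form attribute are
    simply not seen by the architecture.
    """
    view = []
    for element in root.iter():
        form_name = element.get(form_att)
        if form_name is not None:
            view.append((form_name, (element.text or "").strip()))
    return view


view = architectural_view(ET.fromstring(CLIENT_DOC))
for form_name, text in view:
    # Print each element under its architectural name.
    print(f"<{form_name}>{text}</{form_name}>")
```

Because the name comes from an attribute value rather than from the GI, two
elements with the same GI can land on different architectural forms, which is
the situation described for the Topic Maps architecture.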

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From avirr at LanMinds.Com  Thu Mar 25 05:54:29 1999
From: avirr at LanMinds.Com (Avi Rappoport)
Date: Mon Jun  7 17:10:27 2004
Subject: Whence XQL?
In-Reply-To: 
 <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.dega.com>
References: 
 <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.dega.com>
Message-ID: 

At 7:24 PM -0800 3/24/1999, Ed Howland wrote:
>Ok, so now it is April (or thereabouts) and still no XQL. I've read all the
>hypeware about this and I understand that it's just a suggestion for a
>proposal for a note for a draft for a recommendation. Whatever.
>I want my XQL!

There's a great article by Lisa Rein on the W3C Query Workshop late 
last year -- I'm sure it's still at XML.com.  There are links from 
the article to the position papers of the participants, and I found 
them fascinating and enlightening.  There are a *lot* of issues to be 
solved!

Avi

________________________________________________________________
Avi Rappoport, Search Tools Maven: 
Guide to Site Indexing and Local Search Engines: 



From vivi at odi.com  Thu Mar 25 06:22:09 1999
From: vivi at odi.com (Vittorio Viarengo)
Date: Mon Jun  7 17:10:27 2004
Subject: ANN: Managing XML Data
Message-ID: <000101be7687$6516ad00$fb1f03c6@durango.datachannel.com>

All,

I hope you'll forgive me for this announcement but I saw eXcelon mentioned
in a couple of messages and given that eXcelon 1.0 is now shipping, I
thought I'd send you directions for downloading it so that you can try it
yourself.

eXcelon is a high performance, highly scalable XML data server. It's used
to build enterprise XML Web applications and can be used with existing data
sources as a middle-tier application cache or in standalone mode as a
back-end data source. Key features include:

- Support for XML (eXcelon efficiently stores well-formed XML down to the
  element level without requiring prior knowledge of the document schema)
- Support for the DOM
- Support for XQL and structural and content indexes
- In-memory distributed XML database
- XML Update grammar to declaratively modify XML documents
- Comprehensive tool suite (including a visual XQL query builder and
  a DCD editor with code generator)

Unlike with the relational approach, eXcelon fully leverages XML flexibility
and extensibility by storing XML in its native format.

You can find the eXcelon evaluation version on the Object Design Web site
(http://www.objectdesign.com/excelon).

Please feel free to contact me if you need additional technical information
regarding eXcelon.

I hope you find this useful

Sorry for the intrusion

Regards

Vittorio






From jtauber at jtauber.com  Thu Mar 25 06:27:31 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:27 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de>
Message-ID: <00de01be7688$622e8bc0$0300000a@cygnus.uwa.edu.au>

> I'm having trouble imagining how a CDATA section can have semantic meaning
> in all but the most abusive ways.  (Hmmm, there's a CDATA section.  Fire up
> the pizza delivery DLL.)  Could you give an example?  Thanks.

The different ways of expressing character data (literal, CDATA section,
character references), as well as other things like ignorable whitespace,
comments, and even physical (i.e. entity) structure, are irrelevant for most
applications, but there is the odd application that wants to know about such
things. The standard example is an XML editor.
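
A minimal sketch of the distinction, in Python rather than Java: the standard
library's xml.sax reader accepts a lexical handler through the property
http://xml.org/sax/properties/lexical-handler. The Recorder class, the parse
helper and the sample documents are assumptions for illustration. The three
documents spell the same character data differently (literal, CDATA section,
character reference); the text delivered to characters() is identical, and
only the lexical handler sees the difference.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class Recorder(ContentHandler):
    """Collects logical text and lexical events from one parse."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.lexical = []

    # Logical view: character data, however it was written in the source.
    def characters(self, content):
        self.text.append(content)

    # Lexical view: callbacks the reader looks up on the lexical handler.
    def comment(self, content):
        self.lexical.append(("comment", content))

    def startCDATA(self):
        self.lexical.append(("startCDATA",))

    def endCDATA(self):
        self.lexical.append(("endCDATA",))

    def startDTD(self, name, public_id, system_id):
        self.lexical.append(("startDTD", name))

    def endDTD(self):
        self.lexical.append(("endDTD",))


def parse(xml_text):
    """Return (joined character data, list of lexical events)."""
    recorder = Recorder()
    reader = xml.sax.make_parser()
    reader.setContentHandler(recorder)
    reader.setProperty(property_lexical_handler, recorder)
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return "".join(recorder.text), recorder.lexical


literal = parse("<p>A problematic case.</p>")
cdata = parse("<p>A <![CDATA[problematic]]> case.</p>")
charref = parse("<p>A &#112;roblematic case.</p>")
print(literal, cdata, charref, sep="\n")
```

An application that only wants the data can ignore the second element of each
result; an editor that wants to write the document back unchanged cannot.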

James







From jtauber at jtauber.com  Thu Mar 25 06:55:08 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:27 2004
Subject: XML Documents
References: 
Message-ID: <018201be768c$29c18540$0300000a@cygnus.uwa.edu.au>

> Hi,
> I have just started learning XML. Can you suggest good website names
> please

If you've just started, try http://www.xmlinfo.com/newcomers/ which has
links to the other sites I'd recommend.

James




From larsga at ifi.uio.no  Thu Mar 25 07:12:07 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:27 2004
Subject: XML Documents
In-Reply-To: 
References: 
Message-ID: 


* Williams Arumainathan
|
| I have just started learning XML. Can you suggest good website names
| please




--Lars M.




From Ed at dega.com  Thu Mar 25 07:15:26 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun  7 17:10:27 2004
Subject: Whence XQL?
Message-ID: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>

Sorry for the cross post.

I read most of those position papers as well. But the one by Jonathan Robie
(Texcel, Inc.), Joe Lapp (webMethods, Inc.) and David Schach (Microsoft
Corporation) seemed the most complete. It even has a BNF for a parser for
XQL.

It occurred to me that someone might have taken that BNF and made it into
something by now. One assumes that MS or one of the other co-authors'
companies might be doing that in their skunk works, about to release
something.

The papers described different syntactical forms. The XSLish one of XQL (MS)
seemed useful, especially in light of embedding it in CGI-like URLs.

I am just experimenting, but in the hope that this might become another 
xml-dev mini-project. 

So far I have the BNF translated to an ANTLR grammar. I had to fix one
infinitely recursive definition in the original file (filter). I also had to
decide which things were tokens and which were true productions. It is still
broken at this point because it generates many non-determinisms. Most of
these are due to the tendency to represent things like Text and NCName as
starting out with Letter and continuing through to Letters again via some
path. I'm going to have to research how to do this better. I'd like to
preserve the nomenclature of the article so everybody is operating with the
same documentation, which is at http://www.w3.org/TandS/QL/QL98/pp/xql.html
BTW.
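
A common way around this kind of overlap, sketched here in Python rather than
ANTLR, is to let the lexer recognise one name-like token and leave it to the
parser to decide from context what the name means. This is not the grammar
from the position paper: the token names, the character classes and the
sample query are simplified assumptions for illustration.

```python
import re

# One NAME rule covers every run that starts with a letter, so the lexer never
# has to choose between two rules that both begin with Letter.
TOKEN = re.compile(r"""
    (?P<NAME>   [A-Za-z_][A-Za-z0-9_.-]* )
  | (?P<STRING> '[^']*' | "[^"]*" )
  | (?P<OP>     // | [/\[\]@=*()] )
  | (?P<SPACE>  \s+ )
""", re.VERBOSE)


def tokenize(query):
    """Split an XQL-like query into (kind, text) pairs, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {query[pos]!r}")
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("book[author='Bob']/title")
for kind, text in tokens:
    print(kind, text)
```

With a single NAME token, a one-token lookahead is enough at this level;
whether a given name is an element name or something else becomes a question
for the parser, not the lexer.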

The parser it generates works on a few items but not entirely. If anybody
shows any interest and has experience with ANTLR, I'll post it and we can
collaborate.

Ed


-----Original Message-----
From: Avi Rappoport [mailto:avirr@LanMinds.Com]
Sent: Wednesday, March 24, 1999 9:53 PM
To: Ed Howland; 'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com'
Subject: Re: Whence XQL?


At 7:24 PM -0800 3/24/1999, Ed Howland wrote:
>Ok, so now it is April (or thereabouts) and still no XQL. I've read all the
>hypeware about this and I understand that it's just a suggestion for a
>proposal for a note for a draft for a recommendation. Whatever.
>I want my XQL!

There's a great article by Lisa Rein on the W3C Query Workshop late 
last year -- I'm sure it's still at XML.com.  There are links from 
the article to the position papers of the participants, and I found 
them fascinating and enlightening.  There are a *lot* of issues to be 
solved!

Avi

________________________________________________________________
Avi Rappoport, Search Tools Maven: 
Guide to Site Indexing and Local Search Engines:




From larsga at ifi.uio.no  Thu Mar 25 07:37:40 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:27 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <8725673C.007350D8.00@d53mta03h.boulder.ibm.com>
References: <8725673C.007350D8.00@d53mta03h.boulder.ibm.com>
Message-ID: 


* Lars Marius Garshol
|
| Should we perhaps make standalone a boolean instead?  It can only have
| two values anyway, and this will spare us a lot of
| standalone.equals(this or that).

* roddey@us.ibm.com
| 
| I did that at first with my internal event APIs, but it didn't work
| out.  There is then no way of knowing whether the document *really*
| said yes or no, or whether it was just not there at all and the
| default was used. This prevents the recreation of the original
| document.

Given that this is supposed to be the handler for lexical information,
where this sort of thing does matter, I agree. It should be a string.
Don't know how I managed to overlook that.
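
The point about losing information can be illustrated with a minimal sketch
in Python, using the standard library's expat binding rather than the SAX2
draft under discussion. Expat reports the standalone declaration as three
states, and the hypothetical helper declared_standalone maps them to "yes",
"no" or None; a plain boolean would have to fold the last two together.

```python
import xml.parsers.expat


def declared_standalone(xml_text):
    """Return "yes", "no", or None when the declaration says nothing."""
    seen = {}

    def on_xml_decl(version, encoding, standalone):
        # Expat passes 1 for standalone="yes", 0 for standalone="no",
        # and -1 when the standalone clause is omitted.
        seen["standalone"] = {1: "yes", 0: "no"}.get(standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(xml_text, True)
    # No XML declaration at all: the handler never ran, so report None.
    return seen.get("standalone")


for document in (
    '<?xml version="1.0" standalone="yes"?><doc/>',
    '<?xml version="1.0" standalone="no"?><doc/>',
    '<?xml version="1.0"?><doc/>',
):
    print(repr(declared_standalone(document)))
```

A tool that rewrites the document can use the third state to decide whether
to emit a standalone clause at all.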

--Lars M.




From rxcharul at cs.twsu.edu  Thu Mar 25 08:05:18 1999
From: rxcharul at cs.twsu.edu (Madan Mohan Ranganath)
Date: Mon Jun  7 17:10:27 2004
Subject: XML quick question
In-Reply-To: 
Message-ID: 


Hello,
 
I am new to XML and just wrote my first program as shown below. I used
Internet Explorer 5.0 as the browser to read the XML document on
Windows 95 (I came to know that the browser supports reading XML
documents). But the problem I am facing is that the browser is printing
the entire document with the tags, which it should not. Can anyone please
tell me where I am going wrong? I gave ".xml" as the extension for the
XML file name.
 

 Hello,World!

Regards,
Madan,                                 





From tbray at textuality.com  Thu Mar 25 08:28:33 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:27 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <3.0.32.19990324144457.00e6dd44@pop.intergate.bc.ca>

At 09:00 AM 3/24/99 -0500, David Megginson wrote:
>By the same argument,
>
> x="1">
>and
>
>are different...

David is right.  It's too late now, because DOM level 1 wrote CDATA sections
into the spec so we're stuck with 'em - it's a pity we didn't have the
infoset back then.  (I assume it won't include them, right David?) -T.


From tbray at textuality.com  Thu Mar 25 08:28:57 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:27 2004
Subject: DOM CDATA vs Normalization
Message-ID: <3.0.32.19990324144714.00e6dd44@pop.intergate.bc.ca>

At 11:55 AM 3/24/99 -0500, Bill la Forge wrote:
>Normalization of an element combines various text objects into a single
>text object. Does it then merge text and CDATA objects to a single object?
>And what about ignorable whitespace?

XML does not repeat NOT have anything such as "ignorable" whitespace. -Tim
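
For the DOM side of the question, DOM Level 1 describes normalize() as
merging adjacent Text nodes and notes that adjacent CDATASection nodes are
not merged by it. The Python sketch below, which assumes the standard
library's xml.dom.minidom and an invented sample element p, prints the node
types left after normalize() so the behaviour of a given implementation can
be checked, and shows that the logical text is the same either way.

```python
from xml.dom import Node, minidom

# Hypothetical sample: one element whose content mixes text and a CDATA section.
document = minidom.parseString("<p>A <![CDATA[problematic]]> case.</p>")
paragraph = document.documentElement
paragraph.normalize()

# Report what kind of node each child is after normalization.
for child in paragraph.childNodes:
    kind = "CDATA" if child.nodeType == Node.CDATA_SECTION_NODE else "text"
    print(kind, repr(child.data))

# The logical content does not depend on how the children are split up.
logical_text = "".join(child.data for child in paragraph.childNodes)
print(logical_text)
```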


From tbray at textuality.com  Thu Mar 25 08:29:45 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:28 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
Message-ID: <3.0.32.19990324145555.00e6dd44@pop.intergate.bc.ca>

At 10:13 PM 3/24/99 +0100, Ronald Bourret wrote:
>I wasn't even going to reply, but then I remembered that the real question
>here is whether SAX (not the DOM) should tell people about CDATA sections.
> I think the answer is yes.

The implication is that a parser that doesn't pass on word of CDATA
sections is a second-rate parser.  Hrummph.  Is this not a slippery
slope that puts us on the road to reporting whether single or double
quotes were used for attribute values? -Tim


From Gareth.sbradley at adv.sonybpe.com  Thu Mar 25 09:01:47 1999
From: Gareth.sbradley at adv.sonybpe.com (Gareth Sylvester-Bradley)
Date: Mon Jun  7 17:10:28 2004
Subject: XML convertor generator
In-Reply-To: 
Message-ID: <000001be711d$7e9f4530$a32fc22b@carrion>

> > | I'm currently working on an xml convertor-generator
> >
> > | Paul Janssens - paul.janssens@skynet.be
>
> In reply to the original post: Paul I would be interested if you are
> making the source available and it is in Java.
>
> Steve Oldmeadow
> Justice Systems Technologies

Ditto if either Java or C++.

Cheers
-- Gareth SB


From Matthew.Sergeant at eml.ericsson.se  Thu Mar 25 09:06:25 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:28 2004
Subject: Whence XQL?
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1712@EUKBANT101>

> -----Original Message-----
> From: Ed Howland [SMTP:Ed@dega.com]
>
> Ok, so now it is April (or thereabouts) and still no XQL. I've read all
> the hypeware about this and I understand that it's just a suggestion for
> a proposal for a note for a draft for a recommendation. Whatever.
> I want my XQL!
>

I haven't followed the Java implementations very closely, but are you saying
that the perl implementation of XQL is the only one? Chalk one up...

Matt.


From Matthew.Sergeant at eml.ericsson.se  Thu Mar 25 09:21:30 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:28 2004
Subject: XML and (K)Office
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101>

> -----Original Message-----
> From: David Megginson [SMTP:david@megginson.com]
>
> Daniel Veillard writes:
>
>  > Yep, Gnumeric uses XML (actually gzipped XML on disk) and uses
>  > namespaces. I'm pretty sure it's incompatible, since when I coded
>  > the XML export I didn't know that KDE would do alike. There is a
>  > virtual cookie reward to the first sending me a good example of KDE
>  > spreadsheet XML file :-) . A clean DTD would be even better.
>
> Doesn't the recipe for virtual cookies come with the Gnu Emacs
> distribution?
>
> Anyway, let's get this right -- I think that it's healthy for both
> Gnumeric and the KOffice Spreadsheet program both to exist, but there
> is no excuse for them to use entirely incompatible formats. As a
> matter of fact, if we could convince KDE and Gnome to use compatible
> XML formats for lots of things (like interface construction), the
> media's predictions of a Linux fracture will be proven to be hot air.
>

Although I agree to an extent, if they have different feature sets it's
pretty unlikely that you're going to get an entirely perfect agreement on a
spreadsheet DTD. However, that's the beauty of XML.
Writing a converter from one format to another is trivial in the extreme, so
it's not a huge issue in my (humble) opinion.

Matt.


From rbourret at ito.tu-darmstadt.de  Thu Mar 25 09:49:13 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:28 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandlerdraft v.1.1)
Message-ID: <01BE76AC.EFC59B30@grappa.ito.tu-darmstadt.de>

Tim Bray wrote:

> At 10:13 PM 3/24/99 +0100, Ronald Bourret wrote:
> >I wasn't even going to reply, but then I remembered that the real question
> >here is whether SAX (not the DOM) should tell people about CDATA sections.
> > I think the answer is yes.
>
> The implication is that a parser that doesn't pass on word of CDATA
> sections is a second-rate parser.  Hrummph.  Is this not a slippery
> slope that puts us on the road to reporting whether single or double
> quotes were used for attribute values? -Tim

Actually, the implication is that a parser that doesn't pass on word of
CDATA sections is one that doesn't support LexicalHandler.

Maybe we should ask what kind of applications are likely to use
LexicalHandler (mine certainly won't -- I just want the data). The obvious
groups are DOM builders and editors. Preserving CDATA sections in editors is
a nice thing to do -- I know that I would appreciate it as a user. If
LexicalHandler is aimed at a different audience, then somebody please say so.

As to single and double quotes, I'm quite happy to draw a line in the sand
before we get to the road that leads to the brink of the slippery slope.
-- Ron Bourret


From paul.janssens at skynet.be  Thu Mar 25 09:59:50 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:28 2004
Subject: XML convertor generator
Message-ID: <36FA087E.161A@skynet.be>

In answer to questions:

Conversion is TO xml. The intention of the tool is to ease the conversion
of non-markup data to markup, be it once-and-for-all or repeatedly
(source code analysis).

Coded in C and bison for maximum throughput.

Source will be open (after some more cleanup).

As I said earlier, there's a lot of missing functionality at the moment
(Unicode, escaping, DTD generation, a DTD for the input format), but it can
already do simple stuff, like converting simple mathematical expressions
to MathML.

It makes sense to have some semantic additions during the conversion (if
you're converting a programming language, it would be nice for variable or
procedure references to have an IDREF to the variable or procedure
declaration, because it adds validation), but a lot of this stuff can be
done as postprocessing, transforming xml to xml, which is more elegantly
done in Scheme-like languages.

Paul Janssens - paul.janssens@skynet.be
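
As a rough illustration of the kind of conversion described (an arithmetic
expression turned into markup), here is a small Python sketch. It has nothing
to do with the actual tool, which is written in C and bison; it borrows
Python's own expression parser via the ast module and emits Content MathML
element names (apply, plus, minus, times, divide, ci, cn). The function names
to_mathml and convert are made up for this example.

```python
import ast
import xml.etree.ElementTree as ET

# Content MathML operator elements for the Python operators handled here.
OPERATORS = {ast.Add: "plus", ast.Sub: "minus", ast.Mult: "times", ast.Div: "divide"}


def to_mathml(node):
    """Convert one parsed arithmetic expression node to a MathML element."""
    if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
        apply = ET.Element("apply")
        ET.SubElement(apply, OPERATORS[type(node.op)])
        apply.append(to_mathml(node.left))
        apply.append(to_mathml(node.right))
        return apply
    if isinstance(node, ast.Name):
        identifier = ET.Element("ci")
        identifier.text = node.id
        return identifier
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        number = ET.Element("cn")
        number.text = str(node.value)
        return number
    raise ValueError(f"unsupported expression: {ast.dump(node)}")


def convert(expression):
    """Parse a simple arithmetic expression and serialize it as markup."""
    tree = ast.parse(expression, mode="eval")
    return ET.tostring(to_mathml(tree.body), encoding="unicode")


markup = convert("a + b * 2")
print(markup)
```

The parser does the precedence work, so b * 2 ends up as a nested apply
inside the outer sum without any extra logic in the converter.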


From larsga at ifi.uio.no  Thu Mar 25 10:01:51 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:28 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <14068.24150.843634.988657@localhost.localdomain>
References: <14068.24150.843634.988657@localhost.localdomain>
Message-ID: 


* David Megginson
|
|   public abstract void startCDATA ()
|     throws SAXException;
|
|   public abstract void endCDATA ()
|     throws SAXException;

This implies that the parser reports the contents of CDATA sections as
separate DocumentHandler.characters events, which is of course the
most natural way to implement things anyway. However, the 1999-03-12
list of core features contains this:

  http://xml.org/sax/features/normalize-text
    Ensure that all consecutive text is returned in a single callback to
    DocumentHandler.characters or DocumentHandler.ignorableWhitespace
    (true) or explicitly do not require it (false).

This is potentially problematic, since it's unspecified what the
parser should do about CDATA sections in this case. (I suspect we will
see more problems of this kind when we start really using and
stacking filters.)

Should they be normalized, or should they be reported separately?
(I.e.: what is consecutive text, exactly?) The same problem appears with
entity boundaries and character references.

I assume most users of normalize-text will want consecutive text to be
interpreted in the logical view of the document, rather than the
lexical view. Otherwise the DocumentHandler will receive different
events in these two cases:

  A problematic case.

and

  A case.
which is rather fragile, and this behaviour should be avoided, IMHO.

So basically the problem is that normalize-text and LexicalHandler
don't go well together. You can have one, but not both at the same
time, unless the driver changes its behaviour. In other words, this
seems to require the driver to have explicit knowledge about
normalize-text.

Possible solutions:

 - reject normalize-text true if a LexicalHandler has been registered,
   and reject LexicalHandler registration if normalize-text has been
   set to true

 - make normalize-text have a logical interpretation by default, and
   switch to lexical if a LexicalHandler has been registered

 - make normalize-text always have a lexical interpretation

 - have separate normalize-text-logical and normalize-text-lexical
   events, with reject-behaviour for the first

Thoughts?

--Lars M.


From david.hitch at dial.pipex.com  Thu Mar 25 10:37:08 1999
From: david.hitch at dial.pipex.com (David Hitchcock)
Date: Mon Jun  7 17:10:28 2004
Subject: XML Documents
Message-ID: <01be76a1$e04800e0$0100007f@ketlux03>

Hi Williams

You could try El.pub - interactive publishing news and resources, at:
http://www.pira.co.uk/IE, particularly the standards section:
http://www.pira.co.uk/IE/top011.htm - follow the links - and the
products section at: http://www.pira.co.uk/IE/base09.htm#SGML

Also available is a free Weekly newsletter update service for the site;
sign-up (email only required) is on the welcome page:
http://www.pira.co.uk/IE

Incidentally, list members may be interested to know that the European
Commission launched a Call for Project Proposals
under the new 5th Framework R&D Programme (FP5) on 19 March, 1999. Accepted
R&D projects can qualify for 50% funding by the EU. The message I got
clearly from attending FP5 presentations was that "cross-pond" collaboration
is seen as positive. More information, including links to official sources,
is available at: http://www.pira.co.uk/IE - see FP5 heading on top right.

I do, of course, only speak for myself here.

Best ---> David

*********************************
David Hitchcock
Logical Events Ltd.
tel: +44/ (0)181 255 7084  +44/ (0)181 255 7085
email: david.hitch@dial.pipex.com
web: http://www.pira.co.uk/IE
*********************************

-----Original Message-----
From: Arumainathan, Williams
To: xml-dev@ic.ac.uk
Date: Wednesday, March 24, 1999 11:49 PM
Subject: XML Documents

>Hi,
>I have just started learning XML. Can you suggest good website names
>please
>
>Thank you,
>Williams.
>


From david at megginson.com  Thu Mar 25 11:27:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:28 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <00de01be7688$622e8bc0$0300000a@cygnus.uwa.edu.au>
References: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de>
 <00de01be7688$622e8bc0$0300000a@cygnus.uwa.edu.au>
Message-ID: <14074.6976.284974.722431@localhost.localdomain>

James Tauber writes:

 > The different ways of expressing character data (literal, CDATA
 > section, character references) as well as other things like
 > ignorable whitespace, comments, even physical (ie entity)
 > structure, etc are irrelevant for most applications, but there is
 > the
 > odd application that wants to know about such things. The
 > standard example is an XML editor.

Right, but the fact that *someone* wants something shouldn't
automatically lead to its inclusion in standards. Standards benefit
from the network effect -- their usefulness is proportional to the
square of the number of users -- so there must be a large potential
number of users to justify the extra cost of developing, publishing,
documenting, implementing, and maintaining a standard. If we're
talking about, say, five or ten potential users, the network effect
just isn't all that exciting.

Standards also grow easily but shrink with difficulty: if in v.1 you
leave out a feature that turns out to be necessary, it is usually not
difficult to include the feature in v.2 once the need for it has been
proven in real use; if in v.1 you include a feature that turns out not
to be necessary (i.e. notations and unparsed entities in XML), then it
sticks to future versions of the spec like gum in your hair.

In any case, please remember that I am not actually proposing removing
CDATA boundaries from LexicalHandler -- I do want to support the DOM.
I'm just whining and/or drawing lessons.

All the best,

David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/


From david at megginson.com  Thu Mar 25 11:33:54 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:28 2004
Subject: Whence XQL?
In-Reply-To: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>
References: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>
Message-ID: <14074.7638.522376.583407@localhost.localdomain>

I missed the start of this thread. Did the poster really want to know
where XQL came from (whence), or was the poster interested in where
it's going (whither)?

Since SHAKESPEARE IN LOVE swept the Oscars, I expect people to get
their 16th-century English usage right.

Pedantically yours,

David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/


From david at megginson.com  Thu Mar 25 11:38:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:28 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <3.0.32.19990324144457.00e6dd44@pop.intergate.bc.ca>
References: <3.0.32.19990324144457.00e6dd44@pop.intergate.bc.ca>
Message-ID: <14074.8015.743294.916312@localhost.localdomain>

Tim Bray writes:

 > David is right.  It's too late now, because DOM level 1 wrote
 > CDATA sections into the spec so we're stuck with 'em - it's a
 > pity we didn't have the infoset back then.  (I assume it won't
 > include them, right David?) -T.

DOM 1.0 is a REC, and the beast must be fed. As our published RD
mentions, we're aiming for DOM 1.0 compatibility, but the Infoset will
at least be able to distinguish what is required from what is optional
(the DOM has been deliberately silent on that point, keeping the faith
that some day there would be an Infoset).
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 11:40:20 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:28 2004 Subject: DOM CDATA vs Normalization In-Reply-To: <3.0.32.19990324144714.00e6dd44@pop.intergate.bc.ca> References: <3.0.32.19990324144714.00e6dd44@pop.intergate.bc.ca> Message-ID: <14074.8279.343988.805611@localhost.localdomain> Tim Bray writes: > At 11:55 AM 3/24/99 -0500, Bill la Forge wrote: > >Normalization of an element combines various text objects into a single > >text object. Does it then merge text and CDATA objects to a single object? > >And what about ignorable whitespace? > > XML does not repeat NOT have anything such as "ignorable" > whitespace. -Tim Tim's right -- SAX's terminology has thrown everything off. Something like "flagged whitespace" would have been better, but it would also have reminded people of the Seinfeld episode where George took the art book into the washroom... All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 11:55:35 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:28 2004 Subject: XML and (K)Office In-Reply-To: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101> References: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101> Message-ID: <14074.8435.653789.348824@localhost.localdomain> Matthew Sergeant (EML) writes: [David] > > Anyway, let's get this right -- I think that it's healthy for > > both Gnumeric and the KOffice Spreadsheet program both to exist, > > but there is no excuse for them to use entirely incompatible > > formats. As a matter of fact, if we could convince KDE and Gnome > > to use compatible XML formats for lots of things (like interface > > construction), the media's predictions of a Linux fracture will > > be proven to be hot air. [Matt] > Although I agree to an extent, if they have different feature sets > it's pretty unlikely that you're going to get an entirely perfect > agreement on a spreadsheet DTD. I disagree *very* strongly -- with Namespaces, we can design a common format for the 90% of functionality that the two spreadsheets actually have in common (text cells, data cells, basic formulas, general formatting information [font, alignment, colour, size], etc.) and then allow each to provide extended information unambiguously-delimited through the use of separate namespaces. The more material in the common spec, the better interoperability. Linux needs to set an example here. > However, that's the beauty of XML. 
Writing a converter from one
> format to another is trivial in the extreme, so it's not a huge
> issue in my (humble) opinion.

For n XML-based formats, we need (n * (n - 1)) converters. If there are only two different XML-based spreadsheet formats, then we need only two converters:

  a => b
  b => a

If there are three different XML-based formats, then we need six converters:

  a => b
  a => c
  b => a
  b => c
  c => a
  c => b

If there are four different XML-based formats, then we need twelve converters:

  a => b
  a => c
  a => d
  b => a
  b => c
  b => d
  c => a
  c => b
  c => d
  d => a
  d => b
  d => c

Add a couple more, and the problem definitely isn't easy by any definition. Ten different XML-based formats require 90 converters, and a change to only one of the formats will require changes to (2 * (n - 1)), or 18, converters!

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From costello at mitre.org Thu Mar 25 11:58:44 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun 7 17:10:28 2004
Subject: Whence XQL?
References: <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.dega.com>
Message-ID: <36FA24CE.C825D232@mitre.org>

Have you looked at XML-QL? I have been playing around with this XML query tool for a few weeks. It's quite nice. It allows you to specify the grammar of extracted data, query multiple XML documents, etc. See:

/Roger

Ed Howland wrote:
>
> Ok, so now it is April (or thereabouts) and still no XQL.
I've read all the > hypeware about this and I understand that its just a suggestion for a > proposal for a note for a draft for a recommendation. Whatever. > I want my XQL! > > Seriously, all ranting aside, I haven't seen any talk here or in XSL > listserv land about it yet (recently). The proposal seems complete enough to > me for someone to have at least announced a beta implementation of it. I'd > be happy with mostly unfinshed code if it were written in Java. > > So does anybody have a clue about this? I know about XSL so please don't > send me down that path. I also know about the Datachannel attempt. > > If not, I'm tempted to write something myself. I'm not sure about a couple > of things because its still just off the top of my head so far. > > Ok, its in Java. It uses some free XML parser, probably XML4J because its > the one I'm most familiar with. The XQL syntax parser will be written in > ANTLR, since it outputs nice O-O Java classes. The result set of XQL is well > formed XML. This can be handled easily by XML4J's ability of any node in any > tree (or transformed sub-tree) to print itself in XML to any stream. XML4J > has a nice getNodesByName() mathod that can operate at any level of the tree > returning a NodeList of siblings with that tag name. Wrapping a result tree > in and iterating the NodeList gets you the > simplist query. > > Internally the result set is just another DOM tree so you should be able to > add the .jar file to your Java app and thus satisfy that type of XQL result. > The input can be done in a variety of ways. I assume that the Perl module > XML::XQL can be used in a CGI context to extract the XQL query, execute it > and return either XML or XSL transformed output to the calling app(browser.) > Likewise, a Java servlet could do the same thing. > > Cons: Xml4J doesn't yet handle PI's so its maybe not the overall best > solution. (I may be wrong about this, IBM uploaded a new major release that > may have fixed it.) 
Its just the one I'm comfortable with, at the moment. On > my hard drive are XML parsers from Sun, Microsoft, Oracle, James Clark and > one or two others I haven't had time to play with yet. > > I don't care about efficiency or optimization. All partially created result > sets will live in memory till they are ready to be output. I also don't care > about searching multiple files, although that should be realtively easy to > add. (I'm still confused about XML repositorys. Would XQL have to understand > directory paths? Does XQL need to be able to follow XLinks?) > > I'm leaving out sequences but I may add them in (much) later. Return values > (analog to SQL's SELECT) are important to my application, as are > conditionals. > > Unless someone warns me that I'm clueless (which is usually the case,) I'll > post a cut of the ANTLR grammer as soon as I get a working one. I'll > probably put it on my web site. > > Ed > > Ed Howland > ed@dega.com > http://www.dega.com > "As your attorney, I advise you to take some adrenalchrome" > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 12:01:09 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:28 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 In-Reply-To: References: <14068.24150.843634.988657@localhost.localdomain> Message-ID: <14074.9415.985411.394383@localhost.localdomain> Lars Marius Garshol writes: > http://xml.org/sax/features/normalize-text > Ensure that all consecutive text is returned in a single callback to > DocumentHandler.characters or DocumentHandler.ignorableWhitespace > (true) or explicitly do not require it (false). > > > This is potentially problematic, since it's unspecified what the > parser should do about CDATA sections in this case. (I suspect we will > see more problems of this kind when we start using really using and > stacking filters.) Should they be normalized, or should they be > reported separately? (Ie: what is consecutive text, exactly?) The same > problem appears with entity boundaries and character references. Thanks, Lars -- this is an excellent point. I think that the specification belongs, not with the normalize-text feature, but with the LexicalHandler (since people may define other types of handlers that we cannot predict). 
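The question Lars raises (what counts as "consecutive text" once CDATA boundaries are reported) can be made concrete with a small sketch. This is not SAX code and not a proposal from this thread: the class `TextCoalescer`, its method names, and the string event log are all hypothetical, chosen only so the example compiles on its own. It buffers text chunks and flushes them either only at element boundaries (a "logical" reading) or also at CDATA boundaries (a "lexical" reading).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from SAX.
public class TextCoalescer {
    private final boolean lexical;                  // true: CDATA boundaries split text
    private final StringBuilder pending = new StringBuilder();
    private final List<String> events = new ArrayList<>();

    TextCoalescer(boolean lexical) { this.lexical = lexical; }

    // Text chunks are only buffered; nothing is reported yet.
    void characters(String chunk) { pending.append(chunk); }

    void startCDATA() { boundary("startCDATA"); }
    void endCDATA()   { boundary("endCDATA"); }

    // Element boundaries always end a run of consecutive text.
    void startElement(String name) { flush(); events.add("start:" + name); }
    void endElement(String name)   { flush(); events.add("end:" + name); }

    private void boundary(String name) {
        if (lexical) {          // lexical reading: flush and report the boundary
            flush();
            events.add(name);
        }                       // logical reading: ignore it, keep accumulating
    }

    private void flush() {
        if (pending.length() > 0) {
            events.add("text:" + pending);
            pending.setLength(0);
        }
    }

    // Feeds the same event stream through either interpretation.
    static List<String> run(boolean lexical) {
        TextCoalescer c = new TextCoalescer(lexical);
        c.startElement("p");
        c.characters("a ");
        c.characters("b ");
        c.startCDATA();
        c.characters("<c>");
        c.endCDATA();
        c.characters(" d");
        c.endElement("p");
        return c.events;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // one merged text event
        System.out.println(run(true));  // text split at the CDATA boundaries
    }
}
```

Under the logical reading the element content arrives as a single text event; under the lexical reading it arrives as three, separated by the two boundary events. Which of the two a parser should deliver when both normalize-text and a LexicalHandler are in play is exactly the open point.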
> Possible solutions: > > - reject normalize-text true if a LexicalHandler has been registered, > and reject LexicalHandler registration if normalize-text has been set > to true > - make normalize-text have a logical interpretation by default, and > switch to lexical if a LexicalHandler has been registered > - make normalize-text always have a lexical interpretation > - have separate normalize-text-logical and normalize-text-lexical > events, with reject-behaviour for the first The DOM's text-normalisation feature does *not* normalise CDATA sections, but I think that SAX's should. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Tim.Shaw at wdr.com Thu Mar 25 12:04:02 1999 From: Tim.Shaw at wdr.com (Tim.Shaw@wdr.com) Date: Mon Jun 7 17:10:29 2004 Subject: Whence XQL? In-Reply-To: <14074.7638.522376.583407@localhost.localdomain> Message-ID: Shouldn't that be 'pedantically' :^) tim ______________________________ Reply Separator _________________________________ Subject: RE: Whence XQL? Author: david (david@megginson.com) at unix,mime Date: 25/03/99 11:34 I missed the start of this thread. Did the poster really want to know where XQL came from (whence), or was the poster interested in where it's going (whither)? Since SHAKESPEARE IN LOVE swept the Oscars, I expect people to get their 16th-century English usage right. Pedanticly yours, David xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricker at xmls.com Thu Mar 25 12:57:04 1999 From: ricker at xmls.com (Jeffrey Ricker) Date: Mon Jun 7 17:10:29 2004 Subject: XML convertor generator In-Reply-To: <36F9B224.CE017C45@manhattanproject.com> References: <4.1.19990325091311.00ba01c0@steptwo.com.au> Message-ID: <199903251256.HAA27933@mail.his.com> Is this the sort of thing you are talking about? http://www.xmls.com/news/exeter.html At 03:48 AM 3/25/99 +0000, Clark Evans wrote: >James Robertson wrote: >| At 03:08 25/03/1999 , JPA wrote: >| | Hello, >| | >| | I'm currently working on an xml convertor-generator. When finished, the >| | tool will, if you take the bother to type the structure of your input >| | format and mappings on entities and attributes, construct a convertor. >| | There's no documentation as yet, and some stuff missing (escaping, for >| | one thing), but if there's enough interest I'll put it on a website as is. >| | >| | Paul Janssens - paul.janssens@skynet.be >| >| Paul, >| >| Not wishing to rain on your parade, but aren't >| you re-inventing the wheel here? > >Actually, a program which created an efficient >program to convert XML conforming to a specific >DTD to another product would be a very cool >invention, very different from using Perl >and/or Omnimark. > >I have Omnimark programs which take a great >deal of processing power (I'd hate to see the >Perl equivalent). Cutting it in half with a >program that generated a program would be >very cool indeed. What kind of 'efficiencies' >do you get when you remove the interpreted layer? > >I'm reading this that you are more or less >doing a YACC thing? 
Is this a correct >interpretation? Will it do SGML? >(I guess I can run it through nsgmls >to make the XML equivalent first.) >Is it open source? Hopefully it >will generate C code (for speed). > >Clark Evans > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Thu Mar 25 13:01:56 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:10:29 2004 Subject: Whence XQL? Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1719@EUKBANT101> My problem with XML-QL was their use of tag minimisation (their proprietary syntax) means you can't parse XML-QL with an XML parser. That's foolish IMHO - if you're practically using XML already, why not reap the benefits? Anyway, there's an implementation of XML-QL in my directory on CPAN for perl users, which needs fixing up a little bit, but it's quite usable (if a little slow). It facilitates the use of perl's regexp syntax for queries as well as the system used by XML-QL, which makes it nice and powerful... Matt. 
-- http://come.to/fastnet Perl on Win32, PerlScript, ASP, Database, XML GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ > -----Original Message----- > From: Roger L. Costello [SMTP:costello@mitre.org] > Sent: Thursday, March 25, 1999 11:58 AM > To: Ed Howland > Cc: 'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com' > Subject: Re: Whence XQL? > > Have you looked at XML-QL? I have been playing around with this XML > query tool for a few weeks. It's quite nice. It allows you to specify > the grammer of extracted data, query multiple XML documents, etc. See: > /Roger > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 13:20:33 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:29 2004 Subject: Whence XQL? In-Reply-To: References: <14074.7638.522376.583407@localhost.localdomain> Message-ID: <14074.14255.762607.581285@localhost.localdomain> [offline] Tim.Shaw@wdr.com writes: > Shouldn't that be 'pedantically' :^) Sixteenth-century spelling. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 13:31:00 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:29 2004 Subject: Megginson's Spelling Message-ID: <14074.14586.743793.79875@localhost.localdomain> To all those kind people who pointed out my alleged misspelling of 'pedanticly': 1. I've never been a competent speller (I was warned as an undergraduate that if I studied Medieval or Renaissance English with the original orthography I'd never be able to spell Modern English again). 2. I plan to claim that I was using a sixteenth-century spelling. I haven't actually found such a spelling used in the sixteenth century -- the earliest recorded usage of the word is from Brathwait in 1631, and he is already using the nouveau 'pedantically' spelling -- but I'll keep looking. 3. As Donne wrote (cited in the OED), "Busie old foole, unruly sunne, ... Sawcy pedantique wretch, goe chide Late schooleboyes" (see what I mean about spelling?). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Thu Mar 25 13:34:02 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:10:29 2004 Subject: IE5.0 does not conform to RFC2376 In-Reply-To: <199903231453.JAA00219@ruby.ora.com> Message-ID: <001601be76c3$e3d91e70$0100007f@eps.inso.com> > Describing files in encodings other than US-ASCII or ISO 8859-1 (or > maybe other ISO 8859s) as text/anything is not a very good idea. The > rules for text/* allow many unhealthy things; 8-bit data is not even a > safe assumption, and line-end normalization can be a killer. The > fallback rules for MIME's two-level hierarchy is only the final straw; > for non-European encodings, I would use application/xml. HTTP specifically ignores some things required by MIME, so the above is only an issue in mail. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Thu Mar 25 13:34:37 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:10:29 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 In-Reply-To: <001901be7616$8e4caba0$c8a8a8c0@thing1> Message-ID: <001401be76c3$e0fd27a0$0100007f@eps.inso.com> > From: Gavin Thomas Nicol > >CDATA sections *are* different from normal text, even if only > >because the author used them. 
> > Again, is anyone aware of why CDATA is preserved by the DOM? > What was the reasoning behind this decision? Other things, like > whitespace within an element tag or even attribute order, are > not preserved. Why then was CDATA? Because whitespace within elements is not significant markup, nor is attribute ordering (though we did have a number of debates over whether attribute ordering information should be available). Unlike these, CDATA is *explicit* markup. For many purposes, you don't need to know about it, but you cannot simply remove it, because you cannot know why an author put it there. Removing CDATA would fail the test of least surprise. Speaking of which, I am continually surprised by SAX's lack of comment interfaces.... xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Thu Mar 25 13:35:21 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:10:29 2004 Subject: IE5.0 does not conform to RFC2376 In-Reply-To: <36F755CB.C996CE2D@w3.org> Message-ID: <001501be76c3$e22d4330$0100007f@eps.inso.com> > So, in consequence: example file such as the Chinese XML examples at > http://xml.ascc.net/xml/test/index.html (where each example > is available in UTF-8, Big5 and GB2312, all correctly labelled in the XML encoding > declaration) are now sets of invalid XML files which are required to > produce a critical error because of the invalid byte sequences in what > is now described as a US-ASCII file? > > This is deeply counterproductive, and could have been avoided. No. Servers should be configured to label the document correctly. 
The HTTP 1.1 specification clearly states that any document using an encoding other than ISO 8859-1 should have a correct, corresponding charset parameter. As such, for the documents above, using HTTP, you would have a protocol error. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From costello at mitre.org Thu Mar 25 13:47:28 1999 From: costello at mitre.org (Roger L. Costello) Date: Mon Jun 7 17:10:29 2004 Subject: XML-QL (was Re: Whence XQL?) References: <5F052F2A01FBD11184F00008C7A4A800022A1719@EUKBANT101> Message-ID: <36FA3E97.38E67BC1@mitre.org> Matthew Sergeant (EML) wrote: > > My problem with XML-QL was their use of tag minimisation (their proprietary > syntax) means you can't parse XML-QL with an XML parser. That's foolish > IMHO - if you're practically using XML already, why not reap the benefits? Hi Matt, Not sure that you could do all the things that XML-QL allows you to do if you stick to the XML syntax. Example, query the following XML document for all part names: ]> Green Power Juicer Green Power Toyota Tercel Toyota Sony Stereo X11-3 Sony Note the recursive definition of the part element. Thus, the part name can be at any nesting level. Here's how to do it using XML-QL: function AllPartNamesQuery () { // Source: Parts.xml // Find the names of all the parts construct $name where $name IN "Parts.xml" } How would you do this using XML syntax? /Roger > > Anyway, there's an implementation of XML-QL in my directory on CPAN for perl > users, which needs fixing up a little bit, but it's quite usable (if a > little slow). 
It facilitates the use of perl's regexp syntax for queries as > well as the system used by XML-QL, which makes it nice and powerful... > > Matt. > -- > http://come.to/fastnet > Perl on Win32, PerlScript, ASP, Database, XML > GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V > !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ > > > -----Original Message----- > > From: Roger L. Costello [SMTP:costello@mitre.org] > > Sent: Thursday, March 25, 1999 11:58 AM > > To: Ed Howland > > Cc: 'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com' > > Subject: Re: Whence XQL? > > > > Have you looked at XML-QL? I have been playing around with this XML > > query tool for a few weeks. It's quite nice. It allows you to specify > > the grammer of extracted data, query multiple XML documents, etc. See: > > /Roger > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 14:17:07 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:29 2004 Subject: SAX2: AttributeList2 and EntityRefList Message-ID: <14074.16928.163619.681099@localhost.localdomain> While we're polishing the details of LexicalHandler (which may yet become DocumentHandler2 -- I'm still listening to arguments both ways), I'd like to propose two new SAX2 support interfaces. EntityRefList ------------- This first interface is designed to work around a *very* nasty problem with XML 1.0 conformance, and at the same time, to enable the tracking of entity references in attribute values for the few masochists who care. 
As John Cowan has pointed out, the XML 1.0 REC requires that processors report unexpanded entity references, and presumably that applies to references in attribute values as well as elsewhere; as a result, it is impossible to treat an XML attribute value simply as a string. On the other hand, almost nobody will ever need this, so it's not worth complicating the interface much for parser writers or for application writers.

So, after some thought, here's what I came up with. This is a special interface providing indices to zero or more entity references in a literal string (i.e. an attribute value). The indices are based on whatever array indices the programming language is using, exclusive of Unicode problems with combining characters, etc. (i.e. any normalisation must already have taken place).

====================8<====================8<====================
// EntityRefList.java - list entity references in an attribute value.

package org.xml.sax;

public interface EntityRefList {
  public int getLength ();
  public String getEntityName (int i);
  public int getEntityRefStart (int i);
  public int getEntityRefEnd (int i);
}

// end of EntityRefList.java
====================8<====================8<====================

The nice thing is that this lives outside of the String representing the attribute value, so almost everyone can ignore it, and there should be no performance hit. It also provides nice backwards-compatibility with SAX 1.0.
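As a rough illustration of how an application might consume such a list, here is a minimal sketch. Several things in it are assumptions rather than part of the proposal: the interface is redeclared locally so the example compiles on its own, `ArrayEntityRefList` and `segments` are hypothetical names, and the start index is taken to be inclusive and the end index exclusive (the proposal does not say which).

```java
import java.util.ArrayList;
import java.util.List;

public class EntityRefListSketch {

    // Local copy of the proposed interface, so the sketch is self-contained.
    interface EntityRefList {
        int getLength();
        String getEntityName(int i);
        int getEntityRefStart(int i);
        int getEntityRefEnd(int i);
    }

    // Hypothetical array-backed implementation a parser might hand out.
    static final class ArrayEntityRefList implements EntityRefList {
        private final String[] names;
        private final int[] starts;
        private final int[] ends;

        ArrayEntityRefList(String[] names, int[] starts, int[] ends) {
            this.names = names;
            this.starts = starts;
            this.ends = ends;
        }

        public int getLength() { return names.length; }
        public String getEntityName(int i) { return names[i]; }
        public int getEntityRefStart(int i) { return starts[i]; }
        public int getEntityRefEnd(int i) { return ends[i]; }
    }

    // Splits an attribute value into plain-text and entity-derived segments.
    // ASSUMPTION: references are listed in document order, start is
    // inclusive and end is exclusive.
    static List<String> segments(String value, EntityRefList refs) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        for (int i = 0; i < refs.getLength(); i++) {
            int start = refs.getEntityRefStart(i);
            int end = refs.getEntityRefEnd(i);
            if (start > pos) {
                out.add("text:" + value.substring(pos, start));
            }
            out.add("entity:" + refs.getEntityName(i) + "=" + value.substring(start, end));
            pos = end;
        }
        if (pos < value.length()) {
            out.add("text:" + value.substring(pos));
        }
        return out;
    }

    public static void main(String[] args) {
        // The value as an application would normally see it: a plain String.
        String value = "Copyright ACME Corp. 1999";
        // Characters 10..20 are assumed to have come from an entity "company".
        EntityRefList refs = new ArrayEntityRefList(
            new String[] { "company" }, new int[] { 10 }, new int[] { 20 });
        System.out.println(segments(value, refs));
    }
}
```

Code that does not care about entity boundaries never touches the list and keeps working with the plain String, which is the backwards-compatibility property described above.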
AttributeList2
--------------

Here's what I've come up with for lexical attribute information in SAX2:

====================8<====================8<====================
// AttributeList2.java - SAX2 extensions for an attribute list

package org.xml.sax;

public interface AttributeList2 extends AttributeList {
  public boolean isSpecified (int index);
  public boolean isSpecified (String name);
  public EntityRefList getEntityRefList (int index);
  public EntityRefList getEntityRefList (String name);
}

// end of AttributeList2.java
====================8<====================8<====================

This, together with the DTDDeclHandler interface I'll be describing in a separate posting, should provide enough information for full DOM level one core attribute support.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Thu Mar 25 14:21:39 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:29 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
Message-ID: <14074.17776.784121.47587@localhost.localdomain>

Here's the second of the three new core handler types I'm proposing for SAX2. This handler takes a minimalist position: it provides about enough information for DOM support, but not much more. In particular, I'm still shying away from reporting element-type declarations, at least until someone shows me an easy and concise way of doing it (in AElfred, I simply provided the content model as a fully-normalised string).
====================8<====================8<====================
// DTDDeclHandler.java -- receive extended DTD declarations

package org.xml.sax;

public interface DTDDeclHandler {

  public final static int ATTRIBUTE_DEFAULTED = 1;
  public final static int ATTRIBUTE_IMPLIED = 2;
  public final static int ATTRIBUTE_REQUIRED = 3;
  public final static int ATTRIBUTE_FIXED = 4;

  public abstract void attributeDecl (String element, String name,
                                      String type, String defaultValue,
                                      int defaultType,
                                      EntityRefList entityRefs)
    throws SAXException;

  public abstract void externalEntityDecl (String name,
                                           boolean isParameterEntity,
                                           String publicId,
                                           String systemId)
    throws SAXException;

  public abstract void internalEntityDecl (String name,
                                           boolean isParameterEntity,
                                           String value)
    throws SAXException;

}

// end of DTDDeclHandler.java
====================8<====================8<====================

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From indiketr at churchill.co.uk Thu Mar 25 14:35:09 1999
From: indiketr at churchill.co.uk (Rajeeva Indiketiya)
Date: Mon Jun 7 17:10:29 2004
Subject: unsubscribe xml-dev
Message-ID:

unsubscribe xml-dev indiketr@churchill.co.uk
From crism at oreilly.com Thu Mar 25 14:51:19 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun 7 17:10:30 2004
Subject: XML and (K)Office
In-Reply-To: <199903242256.XAA04448@sonne.darmstadt.gmd.de> (macherius@darmstadt.gmd.de)
Message-ID: <199903251448.JAA00778@ruby.ora.com>

[Ingo Macherius]
> David Megginson wrote at 24 Mar 99, 17:02:
>
> > There's also a hot rumour [3] that Microsoft has assigned 37
> > programmers to work on a Linux port of MS Office.

Some quick research on slashdot shows the rumor's evolution. The first sighting appears to be on ZDnet; they reported that Simson Garfinkel, a _Boston Globe_ columnist and technology writer, mentioned on a radio show that he was in correspondence with some of the developers.

But even if that's true, I can think of a number of reasons why Microsoft might be doing a port internally with no intentions whatsoever of releasing it. The ZDnet article notes that Office relies heavily on MS's undocumented Win32 API calls, and just porting the app to the standard API calls which could then be handled in emulation on Linux would be a major chore.

Some URLs:

Linkname: Slashdot:Search
URL: http://www.slashdot.org/search.pl?topic=microsoft

Linkname: Slashdot:MS Office for Linux
URL: http://www.slashdot.org/articles/99/03/11/2327241.shtml

Linkname: ZDNN: MS porting Office to Linux?
URL: http://www.zdnet.com/zdnn/stories/news/0,4586,2224863,00.html

-Chris
--
http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jonathan at texcel.no Thu Mar 25 15:03:48 1999 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 17:10:30 2004 Subject: Whence XQL? In-Reply-To: <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.deg a.com> Message-ID: <3.0.3.32.19990325100328.00c90e50@pop.mindspring.com> At 07:24 PM 3/24/99 -0800, Ed Howland wrote: >Ok, so now it is April (or thereabouts) and still no XQL. I've read all the >hypeware about this and I understand that it's just a suggestion for a >proposal for a note for a draft for a recommendation. Whatever. >I want my XQL! XQL has been implemented by several vendors. I know of five implementations that are commercially or publicly available:
o Microsoft's Internet Explorer 5.0 browser
o webMethods' B2B Integration Server
o DataChannel's Rio
o ObjectStore's eXcelon
o Perl library by Eduard Derksen (Enno) that can be found in the CPAN archive
Software AG has been showing a subset of XQL in its Tamino product at CeBIT. There are several other prototypes that I know of, but I don't know whether I can mention them. >If not, I'm tempted to write something myself. I'm not sure about a couple >of things because it's still just off the top of my head so far. > >Ok, it's in Java. It uses some free XML parser, probably XML4J because it's >the one I'm most familiar with. That would be great, and I'd be glad to answer questions you have as you go along. I would really like to see something like that. I have thought of setting up a mailing list for XQL, which might be a good place to help implementors communicate with each other.
>The XQL syntax parser will be written in >ANTLR, since it outputs nice O-O Java classes. The result set of XQL is well >formed XML. Careful...there's a very fine distinction here. The result set of XQL is actually a set of nodes in the tree. When this is returned as an ASCII result, it is wrapped in an tag to make it well formed. This distinction is important because nodes have identity, and XML text does not. >This can be handled easily by XML4J's ability of any node in any >tree (or transformed sub-tree) to print itself in XML to any stream. XML4J >has a nice getNodesByName() method that can operate at any level of the tree >returning a NodeList of siblings with that tag name. Wrapping a result tree >in and iterating the NodeList gets you the >simplest query. This is a good approach. >I don't care about efficiency or optimization. All partially created result >sets will live in memory till they are ready to be output. I also don't care >about searching multiple files, although that should be relatively easy to >add. (I'm still confused about XML repositories. Would XQL have to understand >directory paths? Does XQL need to be able to follow XLinks?) There are several ways of doing repositories, and the choice will depend somewhat on the repository vendor. Directory paths are useful - I think the easiest way to do this is to use a URL to specify the resource, and put the directory path in the URL. The XQL query can be appended to the end of the URL. >I'm leaving out sequences but I may add them in (much) later. Return values >(analog to SQL's SELECT) are important to my application, as are >conditionals. Cool! Jonathan jonathan@texcel.no Texcel Research http://www.texcel.no xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DuCharmR at moodys.com Thu Mar 25 15:09:47 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:10:30 2004 Subject: XML conference Message-ID: <49092BAEAC84D2119B0600805FD40F9F120EB9@MDYNYCMSX1> >Does anyone on this list intend on doing something interesting at the >conference? I'll be doing an overview of the four schema proposals submitted to the W3C. You gotta love the conference's clever domain name: www.xmlconference.com. Although, as you pointed out, the web page doesn't tell us much--in fact, it asks for far more information than it offers. Bob DuCharme www.snee.com/bob see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall. From jonathan at texcel.no Thu Mar 25 15:24:17 1999 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 17:10:30 2004 Subject: Whence XQL? In-Reply-To: <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.deg a.com> Message-ID: <3.0.3.32.19990325102457.03222100@pop.mindspring.com> At 07:24 PM 3/24/99 -0800, Ed Howland wrote: >Ok, so now it is April (or thereabouts) and still no XQL.
I just found another utility that is based on XQL: http://www.cs.york.ac.uk/fp/Xtract/ The author describes it thus: "Xtract is a command-line tool for searching XML documents. Just as `grep' returns lines which match your regular expression, so Xtract returns all those sub-trees from XML documents which match a query pattern. The query expression language is simple but powerful, and is based loosely on XQL, the recently proposed XML Query Language. An introduction to the Xtract query pattern language, together with the full Xtract grammar is in this tutorial." "The major difference from XQL is that a query must return a sequence of XML contents (either elements or text inside an element): it cannot for instance return just an attribute value." Looks useful. Jonathan jonathan@texcel.no Texcel Research http://www.texcel.no xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Thu Mar 25 15:30:24 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:10:30 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 Message-ID: <01BE76DC.A75E4B50@grappa.ito.tu-darmstadt.de> David Megginson wrote: > > Possible solutions: > > > > - reject normalize-text true if a LexicalHandler has been registered, > > and reject LexicalHandler registration if normalize-text has been set > > to true > > - make normalize-text have a logical interpretation by default, and > > switch to lexical if a LexicalHandler has been registered > > - make normalize-text always have a lexical interpretation > > - have separate normalize-text-logical and normalize-text-lexical > > events, with 
reject-behaviour for the first > > The DOM's text-normalisation feature does *not* normalise CDATA > sections, but I think that SAX's should Do you mean always normalize CDATA, or normalize CDATA in the absence of a LexicalHandler? I agree with the first case, but prefer Lars' second option (lexical interpretation of normalization) in the second case. By requesting normalization, the application has asked for a single call to characters() between other calls. By registering a LexicalHandler, the application has stated it is still interested in lexical events. -- Ron Bourret From DuCharmR at moodys.com Thu Mar 25 15:38:09 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:10:30 2004 Subject: XML DTD to relational Message-ID: <49092BAEAC84D2119B0600805FD40F9F120EBB@MDYNYCMSX1> >I would like to know if there is a Java library that creates a >relational schema from an XML DTD or a Java library that parses >an XML DTD? The latter: http://www.javareport.com/html/products/prod_rev.shtml has a review of the popular Java XML parsers. The article admitted to being out-of-date as soon as it was printed, so most of the parsers have been updated since being reviewed. The former: part of the point of XML and SGML is their ability to indicate structure in data that doesn't fit neatly into normalized rows and tables. The information is structured hierarchically.
I know that Information Builders, a former employer of mine (and Chet Ensign's), has products that map schemas back and forth between relational databases and their hierarchically-based database products, so the algorithms exist, but it's not trivial. I haven't heard of anything that does this with DTDs, but it would be cool. Bob DuCharme www.snee.com/bob see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall. From jabuss at cessna.textron.com Thu Mar 25 15:46:32 1999 From: jabuss at cessna.textron.com (Buss, Jason A) Date: Mon Jun 7 17:10:30 2004 Subject: Whence XQL? Message-ID: I thought the tag minimization syntax (</>) was a part of the XML recommendation... Or am I wrong? > -----Original Message----- > From: Matthew Sergeant (EML) [SMTP:Matthew.Sergeant@eml.ericsson.se] > > My problem with XML-QL was their use of tag minimisation (their > proprietary > syntax) means you can't parse XML-QL with an XML parser. That's > foolish > IMHO - if you're practically using XML already, why not reap the benefits? > > xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From martind at netfolder.com Thu Mar 25 16:19:42 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:10:30 2004 Subject: How about changing the rules? Message-ID: Hi, Last night I talked to good friends who work at Netscape (but not for long now) and I can tell you that it was not a celebration. We got to discussing the free software movement and so on, and then came an idea... Several people worked hard on the Linux project, then came Red Hat and big investments, and now Red Hat is doing what all the other guys are doing (that's business, no?): protecting its turf and making money. (They are even luckier than Sun or Microsoft: they have cheap labor to develop their software - just think about it. We all know that Microsoft probably has the lowest development cost in the industry. They let the stock market pay their employees :-) but now think about a company having $0 development costs. Wow, that's a VC's dream!) Fellow developers, is this how you pay your bills? Sun still owns the Java JDK but at least played fair, because the code is developed with its own money. Microsoft played hard with all ISVs with its huge appetite for growth but at least, like Sun, paid for its code production. Mozilla, again: people working for free, and AOL and its stockholders harvest the results. Just imagine that Sun and Adobe put in $60,000 to get better XML support for Mozilla. But in the end, who will get the millions in rewards? And how much is $60,000 compared to millions? Just a sustenance given to developers, as a lord in the Middle Ages would give to his serfs. Just think about it.
I am not saying that Sun or Adobe are doing something wrong, but that the rules of the game, or the odds, favor the bank, not the developers :-) (if you allow my casino analogy). Basically the current free software movement seems to follow this pattern: developers work for free (cheap labor), and when testing and proof of concept are done, someone comes in and reaps the rewards and the money. Result: developers had fun, but a modern version of a lord reaps the financial rewards. Do we really want to replicate Middle Ages patterns? Next year will be the next millennium; do you really want that kind of order in the future? What about a world where people could get a just reward for their efforts? All the efforts we are making with XML may end up the same way. I do not speak here of people already paid by the W3C or big corporations, but of individuals making all these efforts on their own time, and therefore with their own money. Here's the solution that my friends and I came up with. Create a company where all participating developers would have stock. It would work like an open software group, but each participant would have ownership. Customers would get a share too. In this case, we do as Red Hat is doing: package the code, make it easy to install, document it and _sell_ it. Each customer would have stock too, so when they buy the software, they also have ownership. So, the idea is: create a company where all participating developers would have stock and therefore ownership. Customers would also have stock and ownership, but would have to buy the software to get it. A free version could be downloaded for a free trial, but people using the free trial version would not have stock. Result: this time, developers would have a chance to get a return on their efforts. Just imagine the power of a company having 20 000 owners. As big as Microsoft! Years ago, a group of artists grew tired of seeing someone else get all the rewards of their work and founded United Artists.
Then now, today, what about a new company called "United Developers". If the idea seems interesting to you, we can start a list server to discuss about it and create a new kind of company. Again imagine what 20 000 ,50 000 or even millions of owners can do. Just stop for a moment and think about it. If you don't want to pollute this list with comments about this, just email me and we'll start a list just for this. I hope, we could lay grounds for the next century with a new kind of business created from the new economy fuel, not capital but knowledge and the capacity to produce something with it. A company having owners located in all parts of the world. Just think about it, we may have the power to build a better future and maybe a model for the next knowledge worker generation. Regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Thu Mar 25 16:28:03 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:30 2004 Subject: SAX2: DTDDeclHandler (minimalist position) In-Reply-To: <14074.17776.784121.47587@localhost.localdomain> Message-ID: <199903251626.LAA12547@hesketh.net> At 09:21 AM 3/25/99 -0500, David Megginson wrote: >Here's the second of the three new core handler types I'm proposing >for SAX2. This handler takes a minimalist position: it provides >about enough information for DOM support, but not much more. 
In >particular, I'm still shying away from reporting element-type >declarations, at least until someone shows me an easy and concise way >of doing it (in AElfred, I simply provided the content model as a >fully-normalised string). A fully-normalized string is fine with me - I'd rather get it as a string and parse it myself than have to deal with something freaky a parser developer really didn't want to have to code anyway. But this info is NECESSARY if anyone (me in particular) wants to build a validation engine that lives outside the core parser. How about: public abstract void elementDecl (String name, String contentModel) throws SAXException; I like it, anyway. Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From begeddov at jfinity.com Thu Mar 25 16:32:10 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:10:30 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 References: <14068.24150.843634.988657@localhost.localdomain> <14074.9415.985411.394383@localhost.localdomain> Message-ID: <36FA63C8.9C5440B7@jfinity.com> David Megginson wrote: > Lars Marius Garshol writes: > > The DOM's text-normalisation feature does *not* normalise CDATA > sections, but I think that SAX's should. > Are there other cases (other than text-normalization ) in SAX2 that require the parser to aggregate notifications and save state (other than that required for well-formedness checking)? 
To say it a different way, are there other examples of SAX2 providing a high(er) level service on behalf of the applications other than raw notification of lexical and structural events? My impression is that SAX(2) is intended to be minimalist. If a filter network can be composed on top of SAX2 that provides the desired capabilities, then SAX2 doesn't need to provide that capability. If there are multiple variations in how the desired capability can be provided (as in the normalization example), then this is an even better indicator that it should be left to a "policy" decision at a higher layer. Maybe normalization is a good candidate for an example filter network. The fact that it would need to be configurable (concerning CDATA handling) might make it a more useful pedagogical aid. Gabe Beged-Dov www.jfinity.com From keshlam at us.ibm.com Thu Mar 25 16:47:41 1999 From: keshlam at us.ibm.com (keshlam@us.ibm.com) Date: Mon Jun 7 17:10:31 2004 Subject: Round-trip issues Message-ID: <8525673F.005C0447.00@D51MTA03.pok.ibm.com> Gathering and replying to several comments: >By the same argument, >

x="1"> >and >

> are different, because the author used them. The concept of ignorable whitespace also permits individual applications to _not_ ignore it, I believe. >Again, is anyone aware of why CDATA is preserved by the DOM? CDATA exists in the first place because some folks are working with applications that are displeased by having to juggle character-entity references when representing textual data that conflicts with XML syntax. Consider an XML editor which is creating an XHTML page with embedded dynamic scripting. One can argue that outputting a BTW, to answer another question: The DOM does not apply normalization to CDATASections, either with adjacent text or between themselves. I believe that's the only behavioral difference inside the DOM between these and standard text nodes. ______________________________________ Joe Kesselman / IBM Research Unless stated otherwise, all opinions are solely those of the author. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dante at mstirling.gsfc.nasa.gov Thu Mar 25 16:52:28 1999 From: dante at mstirling.gsfc.nasa.gov (Dante Lee) Date: Mon Jun 7 17:10:31 2004 Subject: KOML Question Message-ID: Does anyone know where I can find an example of either Java KOML or XML Serialization code? Please reply asap. Dante M. Lee Code 588 NASA/GSFC Greenbelt MD 20771 Voice = 301-521-1077 Bldg = 23 Rm = W415 Email = dante@mstirling.gsfc.nasa.gov dante4@hotmail.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Thu Mar 25 17:25:35 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:31 2004 Subject: Whence XQL? In-Reply-To: Message-ID: <199903251723.MAA13808@hesketh.net> At 09:44 AM 3/25/99 -0600, Buss, Jason A wrote: >I thought the tag minimization syntax (</>) was a part of the XML >recommendation... Or am I wrong? Well, Microsoft seemed to think so for a while - but it's definitely _not_ part of the Rec. See http://www.lists.ic.ac.uk/hypermail/xml-dev/9711/index.html, the "</> as end tag" thread. Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com From RDaniel at DATAFUSION.net Thu Mar 25 17:27:45 1999 From: RDaniel at DATAFUSION.net (Ron Daniel) Date: Mon Jun 7 17:10:31 2004 Subject: SAX2: DTDDeclHandler (minimalist position) Message-ID: <0D611E39F997D0119F9100A0C931315C52F72D@datafusionnt1> I agree with Simon that getting the content model as a string is a reasonable choice. A separate interface could be put into the helpers package to help with parsing the content model, but let's keep that out of the critical path.
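As a rough illustration of the helper idea above, a content-model string could first be split into tokens before any real parsing. This is only a sketch under assumptions: the class name ContentModelTokens is hypothetical, it is not part of any SAX draft, and it assumes the parser hands over a normalised string such as (title,(para|list)*).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SAX: splits a normalised content-model
// string such as "(title,(para|list)*)" into its tokens.
final class ContentModelTokens {
    // Characters that are tokens on their own: grouping, separators,
    // and occurrence indicators.
    private static final String DELIMITERS = "(),|?*+";

    private ContentModelTokens() {}

    static List<String> tokenize(String contentModel) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < contentModel.length()) {
            char c = contentModel.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (DELIMITERS.indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                // A name or keyword (#PCDATA, EMPTY, ANY): read up to the
                // next delimiter or whitespace.
                int start = i;
                while (i < contentModel.length()
                        && DELIMITERS.indexOf(contentModel.charAt(i)) < 0
                        && !Character.isWhitespace(contentModel.charAt(i))) {
                    i++;
                }
                tokens.add(contentModel.substring(start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("(title,(para|list)*)"));
    }
}
```

A validation engine living outside the parser could build its own content-model tree from these tokens, which keeps the core handler down to the single string argument.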
Ron > -----Original Message----- > From: Simon St.Laurent [SMTP:simonstl@simonstl.com] > Sent: Thursday, March 25, 1999 8:29 AM > To: David Megginson; XML Developers' List > Subject: Re: SAX2: DTDDeclHandler (minimalist position) > > At 09:21 AM 3/25/99 -0500, David Megginson wrote: > >Here's the second of the three new core handler types I'm proposing > >for SAX2. This handler takes a minimalist position: it provides > >about enough information for DOM support, but not much more. In > >particular, I'm still shying away from reporting element-type > >declarations, at least until someone shows me an easy and concise way > >of doing it (in AElfred, I simply provided the content model as a > >fully-normalised string). > > A fully-normalized string is fine with me - I'd rather get it as a > string > and parse it myself than have to deal with something freaky a parser > developer really didn't want to have to code anyway. But this info is > NECESSARY if anyone (me in particular) wants to build a validation > engine > that lives outside the core parser. > > How about: > > public abstract void elementDecl (String name, > String contentModel) > throws SAXException; > > I like it, anyway. > > Simon St.Laurent > XML: A Primer > Sharing Bandwidth / Cookies > http://www.simonstl.com > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Thu Mar 25 18:06:58 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:31 2004 Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandlerdraft v.1.1) References: <3.0.32.19990324145555.00e6dd44@pop.intergate.bc.ca> Message-ID: <36FA759C.82E60A19@prescod.net> Tim Bray wrote: > > At 10:13 PM 3/24/99 +0100, Ronald Bourret wrote: > >I wasn't even going to reply, but then I remembered that the real question > >here is whether SAX (not the DOM) should tell people about CDATA sections. > > I think the answer is yes. > > The implication is that a parser that doesn't pass on word of CDATA > sections is a second-rate parser. Hrummph. It isn't second-rate; it is probably just optimized for speed instead of fidelity. > Is this not a slippery > slope that puts us on the road to reporting whether single or double > quotes were used for attribute values? -Tim The way to avoid the slippery slope is to define an information set. Had the information set been defined before the DOM (or, even better, before XML 1.0 went to REC) then the DOM creators would have known what the right answer is. In this case they were forced to guess and IMHO they guessed wrong. Lesson: Information sets should follow close on the heels of syntactic standards or should be incorporated into the syntactic standards. RDF gets this right. Will XLink? What about future versions of CSS? Also: Different types of applications need different amounts of information. Therefore an information set should support different levels of granularity. The groves model does this through "grove plans."
Some parsers provide grove plans that allow a character-for-character round-tripping. Others provide what we used to call "ESIS." -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Thu Mar 25 18:15:25 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:31 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 In-Reply-To: <36FA63C8.9C5440B7@jfinity.com> References: <14068.24150.843634.988657@localhost.localdomain> <14074.9415.985411.394383@localhost.localdomain> <36FA63C8.9C5440B7@jfinity.com> Message-ID: * Gabe Beged-Dov | | Are there other cases (other than text-normalization ) in SAX2 that | require the parser to aggregate notifications and save state (other | than that required for well-formedness checking)? Not right now, no. | My impression is that SAX(2) is intended to be minimalist. If a | filter network can be composed on top of SAX2 that provides the | desired capabilities, then SAX2 doesn't need to provide that | capability. This is true, and personally I was of the opinion that normalize-text was better left to external filters. | Maybe normalization is a good candidate for an example filter | network. 
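A normalisation filter of the kind suggested here might look roughly like the sketch below. It is written against the XMLFilterImpl helper from the SAX2 API as later bundled with Java, which may differ from the drafts under discussion in this thread; the class name CoalescingTextFilter is made up for the example. It takes the logical interpretation, merging text across CDATA boundaries, and does nothing to reconcile itself with a LexicalHandler.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Sketch of a text-normalising filter: buffers consecutive characters()
// events and forwards them as one call before the next content event.
class CoalescingTextFilter extends XMLFilterImpl {
    private final StringBuilder pending = new StringBuilder();

    // Forward any buffered text downstream as a single characters() call.
    private void flush() throws SAXException {
        if (pending.length() > 0) {
            char[] text = pending.toString().toCharArray();
            pending.setLength(0);
            super.characters(text, 0, text.length);
        }
    }

    @Override public void characters(char[] ch, int start, int length) {
        pending.append(ch, start, length); // hold back until another event arrives
    }
    @Override public void startPrefixMapping(String prefix, String uri) throws SAXException {
        flush(); super.startPrefixMapping(prefix, uri);
    }
    @Override public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        flush(); super.startElement(uri, local, qName, atts);
    }
    @Override public void endElement(String uri, String local, String qName) throws SAXException {
        flush(); super.endElement(uri, local, qName);
    }
    @Override public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        flush(); super.ignorableWhitespace(ch, start, length);
    }
    @Override public void processingInstruction(String target, String data) throws SAXException {
        flush(); super.processingInstruction(target, data);
    }
    @Override public void skippedEntity(String name) throws SAXException {
        flush(); super.skippedEntity(name);
    }
    @Override public void endDocument() throws SAXException {
        flush(); super.endDocument();
    }

    public static void main(String[] args) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        CoalescingTextFilter filter = new CoalescingTextFilter();
        filter.setParent(factory.newSAXParser().getXMLReader());
        filter.setContentHandler(new DefaultHandler() {
            @Override public void characters(char[] ch, int start, int length) {
                System.out.println("text: " + new String(ch, start, length));
            }
        });
        filter.parse(new InputSource(new StringReader("<a>one <![CDATA[two]]> three</a>")));
    }
}
```

Because the filter sits outside the parser, the parser never learns that normalisation is happening, which is exactly the encapsulation concern raised in the reply that follows.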
Maybe, but we still need to reject invalid combinations of LexicalHandler and normalize-text, so somehow the filter and/or parser will need to handle this. This whole issue just strengthens my conviction that we need to specify filter handling within the SAX2 core. This will need to deal very carefully with parser encapsulation for approaches like this one to be really feasible in implementations. It wouldn't be much fun if a filter did the normalization without telling the parser and thus caused trouble with the LexicalHandler, with no hint of this trouble ever reaching the application. | The fact that it would need to be configurable (concerning CDATA | handling) might make it a more useful pedagogical aid. As to how filters work, you mean? Well, we should let ourselves be affected by that. And, besides, whitespace normalization is a far better example, I think. --Lars M. From andrewl at microsoft.com Thu Mar 25 18:24:39 1999 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:10:31 2004 Subject: Whence XQL? Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF1D9@RED-MSG-08> End tag minimization ("</>") is not part of XML. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Thu Mar 25 18:30:01 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:31 2004 Subject: ModSAX: Proposed Core Properties In-Reply-To: <14064.6789.718797.734226@localhost.localdomain> References: <01BE705C.DB375010@grappa.ito.tu-darmstadt.de> <14064.6789.718797.734226@localhost.localdomain> Message-ID: * Ronald Bourret | | What is the possible benefit of making any property write-only? | That is, can any harm ever come from reading a property? * David Megginson | | There are three benefits: | | 1. Keep the API absolutely as small as possible. | 2. Avoid confusion. | 3. Allow properties to be unknown until set. These are all real benefits, but the disadvantage is rather large I'm afraid: it makes assembling a processing solution from reusable components much more difficult, since one component can't learn how the others have modified the parser settings. I think we should make all properties readable, which means we split them into read-write/read-only properties. This should maintain benefits 1 and 2 even better than the write-only/read-only split, since most people probably expect read-write/read-only properties like e.g CORBA attributes. I also think we should go even further and make all features readable, so that a filter can see if a feature has been enabled or not. Without knowing the exact set of features I think disabling reading is potentially very limiting. | Any attempt to access a property can generate a | SAXNotSupportedException (or the derived SAXNotRecognizedException), | but there is no guarantee that they will be symmetrical. 
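To make the case for readable settings concrete, here is a small sketch of a component checking a feature before relying on it. It assumes the getFeature method of the SAX2 XMLReader interface as later bundled with Java, not the Parser2 draft quoted in this thread; FeatureProbe and the example.org feature ID are hypothetical.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Sketch: a reusable component inspects a parser setting instead of
// assuming how earlier components configured the parser.
final class FeatureProbe {
    private FeatureProbe() {}

    // Returns "on", "off", "not recognized" or "not supported" for a feature ID.
    static String describe(XMLReader reader, String featureId) {
        try {
            return reader.getFeature(featureId) ? "on" : "off";
        } catch (SAXNotRecognizedException e) {
            return "not recognized"; // the reader has never heard of this feature
        } catch (SAXNotSupportedException e) {
            return "not supported";  // known feature, but its state cannot be read now
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println(describe(reader, "http://xml.org/sax/features/namespaces"));
        System.out.println(describe(reader, "http://example.org/no-such-feature"));
    }
}
```

A filter that can ask this question is able to adapt to, or refuse, a configuration set up by another component, which is the reuse argument made above.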
Maybe we should have a SAXInvalidValueException too, so that the parser/filters can reject invalid values without risking misinterpretation on the part of applications/filters? --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Thu Mar 25 18:35:20 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:31 2004 Subject: Is there anyone working on a binary version of XML? Message-ID: <36FA89B1.D338ED41@lig.net> I know, I know, this is anathema to what many of you feel is the essence of XML, and I agree to a point. I have come to feel however that there is room for a "works-as-if" binary analogue to text based XML. Something that is totally subservient to the standard and has exactly equivalent features, but that is highly efficient for processing at all levels and easily converted to and from text based XML. In using XML in real-world application work and designing future infrastructure that is highly scalable and efficient while making use of XML, I have come to the conclusion that I need a standard way to deal with an XML analogue that is binary. There are a multitude of performance problems that this solves, not only in parsing and exporting, but processing of related data inside applications. Before I make all the details and ideas public, I would like to know if there is any serious precedent directly dealing with XML. My design has highly efficient Java processing in mind, but is not specific to any particular language. Compression is a secondary, but associated issue. 
Thanks sdw -- OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st 43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Thu Mar 25 18:35:56 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:31 2004 Subject: Parser2 modification Message-ID: public abstract void setHandler (String handlerID, ModHandler handler) throws SAXNotSupportedException; I think we should allow this method to throw a SAXInvalidConfigurationException, to be used to solve the LexicalHandler/normalize-text problem and similar problems that may appear with other non-core handlers. I suggested a SAXInvalidValueException about 10 seconds ago, and I suppose these two could be merged, since they are roughly the same thing. public abstract void setFeature (String featureID, boolean state) throws SAXNotSupportedException; Also, I think this method should be allowed to throw the SAXInvalidConfigurationException so that it can complain about things like non-existent catalog files, catalog files with syntax errors, etc. Or maybe something even more generic would be the best. --Lars M. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Ed at dega.com Thu Mar 25 19:02:41 1999 From: Ed at dega.com (Ed Howland) Date: Mon Jun 7 17:10:31 2004 Subject: XML-QL (was Re: Whence XQL?) (Ok Whither XQL, Dave) Message-ID: <30649320C177D111ADEC00A024E9F297169FC0@exchange-server.dega.com> I took a deeper look at XML-QL and at first glance, it appears to be a stronger syntax. I'm happy that AT&T chose Java to implement a prototype. I'm hoping I can embed the engine in my Java code. The XQL syntax is rather cryptic, but familiar to XSL writers, and seems to have the benefit of being embeddable. The XQL authors chose to leave source and destination streams and formats up to implementers. These two things appealed to me initially. I could embed an XQL engine in a Java servlet and send it queries from ECMAScript. Internally, it makes a DOM tree which could be transformed via XSL before being returned to the browser. Since all of these pieces were in Java it seemed a powerful combination. (I know that some things can be done in XSL, but I needed some of the extensions that XQL provided.) BTW, how do you get at the XQL part of IE5? I never saw that in MS's writeup. Or is it just extensions to their XSL? In the meantime, while looking at XML-QL for our short term needs, I'll continue to work on XQL using Java and ANTLR. For now that is taking two concurrent directions: The full XQL ANTLR grammar will proceed as usual, eventually producing a parser that, while it recognizes valid XQL, does nothing more than generate an abstract syntax tree.
The other direction is to generate a valid subset that actually parses, works on an internal DOM tree generated via XML4J, and outputs XML. The first cut of this will only do path expressions of the form: element/*/sub-element//leaf-element Ed P.S. I found it funny that Roger's example data closely matches my own. One wonders... Ed Howland ed@dega.com http://www.dega.com "As your attorney, I advise you to take some adrenalchrome" -----Original Message----- From: Roger L. Costello [mailto:costello@mitre.org] Sent: Thursday, March 25, 1999 5:48 AM To: Matthew Sergeant (EML) Cc: 'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com'; mff@research.att.com Subject: XML-QL (was Re: Whence XQL?) Matthew Sergeant (EML) wrote: > > My problem with XML-QL was their use of tag minimisation (their proprietary > syntax) means you can't parse XML-QL with an XML parser. That's foolish > IMHO - if you're practically using XML already, why not reap the benefits? Hi Matt, Not sure that you could do all the things that XML-QL allows you to do if you stick to the XML syntax. Example, query the following XML document for all part names: ]> Green Power Juicer Green Power Toyota Tercel Toyota Sony Stereo X11-3 Sony Note the recursive definition of the part element. Thus, the part name can be at any nesting level. Here's how to do it using XML-QL: function AllPartNamesQuery () { // Source: Parts.xml // Find the names of all the parts construct $name where $name IN "Parts.xml" } How would you do this using XML syntax? /Roger > > Anyway, there's an implementation of XML-QL in my directory on CPAN for perl > users, which needs fixing up a little bit, but it's quite usable (if a > little slow). It facilitates the use of perl's regexp syntax for queries as > well as the system used by XML-QL, which makes it nice and powerful... > > Matt.
> -- > http://come.to/fastnet > Perl on Win32, PerlScript, ASP, Database, XML > GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V > !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ > > > -----Original Message----- > > From: Roger L. Costello [SMTP:costello@mitre.org] > > Sent: Thursday, March 25, 1999 11:58 AM > > To: Ed Howland > > Cc: 'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com' > > Subject: Re: Whence XQL? > > > > Have you looked at XML-QL? I have been playing around with this XML > > query tool for a few weeks. It's quite nice. It allows you to specify > > the grammar of extracted data, query multiple XML documents, etc. See: > > /Roger > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ramin at wizen.com Thu Mar 25 19:11:24 1999 From: ramin at wizen.com (Ramin Firoozye) Date: Mon Jun 7 17:10:31 2004 Subject: The horror... Message-ID: <000b01be76f2$033f5d00$a432c9cf@lust.wizen.com> Hi folks... Sorry to clutter the list with this ... I was trying to be a good netizen and reply privately back to the authors of some postings on XML-DEV. To my horror, the messages popped back into my XML-DEV mailbox.
I can't tell if this is a function of the new mail client I've been using or I am in-fact replying back to the whole list. If the latter, please accept my abject apologies. If you haven't gotten a barrage of mail from me then ignore this message (phew). Thanks, Ramin -- Ramin Firoozye - Wizen Software. San Francisco, California. -- xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From begeddov at jfinity.com Thu Mar 25 19:34:51 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:10:31 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 References: <14068.24150.843634.988657@localhost.localdomain> <14074.9415.985411.394383@localhost.localdomain> <36FA63C8.9C5440B7@jfinity.com> Message-ID: <36FA8EB0.69B4A18C@jfinity.com> Lars Marius Garshol wrote: > This whole issue just strengthens my conviction that we need to > specify filter handling within the SAX2 core. This will need to deal > very carefully with parser encapsulation for approaches like this one > to be really feasible in implementations. > > It wouldn't be much fun if a filter did the normalization without > telling the parser and thus caused trouble with the LexicalHandler, > and no hint of this trouble ever reaching the application. The SAX filter registration interfaces don't allow multiple filters to be registered for the same feature. I don't see how you can have more than one owner for all the registered handlers without getting into severe trouble. This is not a bad thing as you probably want something like MDSAX to handle the filter networks. It allows SAX(2) to be lean and mean. 
If the same logic is managing all the callbacks, it can gracefully handle the variations of CDATA notification (or lack thereof) and text-normalization on behalf of the client of the filter network. > | The fact that it would need to be configurable (concerning CDATA > | handling) might make it a more useful pedagogical aid. > > As to how filters work, you mean? Well, we should let ourselves be > affected by that. And, besides, whitespace normalization is a far > better example, I think. Does whitespace normalization require aggregating notifications or just scrubbing the contents of a particular notification? Gabe Beged-Dov www.jfinity.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 20:02:29 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:31 2004 Subject: Elements-Attributes-Data (was RE: SAX2 RFD: LexicalHandler draft v.1.1) In-Reply-To: <001401be76c3$e0fd27a0$0100007f@eps.inso.com> References: <001901be7616$8e4caba0$c8a8a8c0@thing1> <001401be76c3$e0fd27a0$0100007f@eps.inso.com> Message-ID: <14074.38225.755412.932105@localhost.localdomain> Gavin Thomas Nicol writes: > Speaking of which, I am continually surprised by SAX's lack of > comment interfaces.... SAX was originally designed specifically for production use, not for authoring (to meet the 80/20, or in this case, the 98/2 rule). That said, comments will be there in SAX2 in the (optional) LexicalHandler for people who want them, but the lack of comment and CDATA interfaces has certainly not hindered the SAX application base so far.
There are only three things that most XML applications need to know about: 1. Elements 2. Attributes 3. Character Data It makes a nice little litany: elements-attributes-data, elements-attributes-data, elements-attributes-data, elements-attributes-data. Yes, XML really is/should be that easy. Actually, apps need to know about error messages too, but that wrecks the litany. Everything else should be taken care of invisibly by the parser. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 20:05:31 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:32 2004 Subject: SAX2: DTDDeclHandler (minimalist position) In-Reply-To: <199903251626.LAA12547@hesketh.net> References: <14074.17776.784121.47587@localhost.localdomain> <199903251626.LAA12547@hesketh.net> Message-ID: <14074.38525.185285.772426@localhost.localdomain> Simon St.Laurent writes: > How about: > > public abstract void elementDecl (String name, > String contentModel) > throws SAXException; > > I like it, anyway. You know, it doesn't inhale (I was going to say 'suck', but I know that my American cousins are still a little sensitive after the recent impeachment trial). It's easy enough to parse the normalised content model if you really need to: it would be all one string, with no parameter entity references. Of course, people will rightly complain that the processor has already done the work of parsing it. It's hard to know what to do here. 
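As a rough sketch of what parsing the normalised content model could look like, the following tokenizer splits a content model string into names and operators. ContentModelTokenizer and tokenize() are hypothetical names, not part of any SAX2 proposal, and the sketch assumes the parser has already expanded all parameter entity references so that the model arrives as one plain string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of any SAX2 draft): splits a normalised
// content model such as "(title,(para|list)*,footer?)" into tokens.
final class ContentModelTokenizer {
    // Single-character tokens used in XML content models.
    private static final String OPERATORS = "(),|?*+";

    private ContentModelTokenizer() {}

    static List<String> tokenize(String model) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < model.length()) {
            char c = model.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                               // skip whitespace between tokens
            } else if (OPERATORS.indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));     // grouping or occurrence operator
                i++;
            } else {
                int start = i;                     // element name, #PCDATA, EMPTY, ANY
                while (i < model.length()
                        && !Character.isWhitespace(model.charAt(i))
                        && OPERATORS.indexOf(model.charAt(i)) < 0) {
                    i++;
                }
                tokens.add(model.substring(start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("(title,(para|list)*,footer?)"));
    }
}
```

An application that needs the structure rather than the tokens would still have to build a tree from this list, which is the duplicated work the complaint about re-parsing refers to.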
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DuCharmR at moodys.com Thu Mar 25 20:34:05 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:10:32 2004 Subject: Is there anyone working on a binary version of XML? Message-ID: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1> >I know, I know, this is anathema to what many of you feel is the >essence of XML, and I agree to a point. It's not so much about feelings, as about contradicting the XML spec. From 1. Introduction (http://www.w3.org/TR/REC-xml#sec-intro): "XML documents are made up of storage units called entities, which contain either parsed or unparsed data. Parsed data is made up of characters, some of which form character data, and some of which form markup." ("characters" there links to http://www.w3.org/TR/REC-xml#dt-character:) "A parsed entity contains text, a sequence of characters, which may represent markup or character data. A character is an atomic unit of text as specified by ISO/IEC 10646 [ISO/IEC 10646]. Legal characters are tab, carriage return, line feed, and the legal graphic characters of Unicode and ISO/IEC 10646." Applying XML concepts to a binary data format sounds interesting and potentially useful, but it wouldn't be XML. Bob DuCharme www.snee.com/bob see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Thu Mar 25 20:41:24 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:32 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 Message-ID: <013901be7700$96dfb060$c8a8a8c0@thing1> From: Gabe Beged-Dov >The SAX filter registration interfaces don't allow multiple filters to be registered for the >same feature. I don't see how you can have more than one owner for all the registered >handlers without getting into severe trouble. This is not a bad thing as you probably want >something like MDSAX to handle the filter networks. It allows SAX(2) to be lean and mean. Frankly, I would love to see the design process for MDSAX2 as open as SAX. Bill xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Thu Mar 25 20:54:30 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:10:32 2004 Subject: Is there anyone working on a binary version of XML? In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1> Message-ID: On Thu, 25 Mar 1999, DuCharme, Robert wrote: > >I know, I know, this is anathema to what many of you feel is the > >essence of XML, and I agree to a point. > > It's not so much about feelings, as about contradicting the XML spec. Quite so. 
But there are still initiatives such as http://www.wapforum.org/docs/technical.htm http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf which attempts to define a 'compact binary representation of XML'. (If going down that route, I'd rather have a compact binary representation of whatever it was that I'm representing in XML, rather than of the XML that I might've used as a textual representation of the data... But then that really wouldn't be XML.) Dan xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Thu Mar 25 20:59:32 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:10:32 2004 Subject: Whence XQL? In-Reply-To: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> Message-ID: <000b01be7702$102babd0$0100007f@eps.inso.com> > I read most of those position papers as well. But the one by > Jonathan Robie, Texcel, Inc. Joe Lapp, webMethods, Inc. and David Schach, Microsoft > Corporation seemed the most complete. It even has a BNF for a > parser for XQL. I wouldn't bet my farm on that proposal. Folk at QL'98, both database and IR, had serious issues with it. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Thu Mar 25 20:59:34 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:10:32 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 In-Reply-To: <3.0.32.19990324144457.00e6dd44@pop.intergate.bc.ca> Message-ID: <000c01be7702$1161e1e0$0100007f@eps.inso.com> > >By the same argument, > >

>x="1"> > >and > >

> >are different... > > David is right. It's too late now, because DOM level 1 wrote > CDATA sections into the spec so we're stuck with 'em - it's a > pity we didn't have the infoset back then. (I assume it won't > include them, right David?) -T. I think CDATA is a religious issue. Some people love them, some people hate them. Deal with it. They are currently in the InfoSet as a property of characters. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Thu Mar 25 21:02:50 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:32 2004 Subject: SAX2: DTDDeclHandler (minimalist position) In-Reply-To: <14074.38525.185285.772426@localhost.localdomain> References: <199903251626.LAA12547@hesketh.net> <14074.17776.784121.47587@localhost.localdomain> <199903251626.LAA12547@hesketh.net> Message-ID: <199903252102.QAA18653@hesketh.net> At 03:05 PM 3/25/99 -0500, you wrote: >Simon St.Laurent writes: > > > How about: > > > > public abstract void elementDecl (String name, > > String contentModel) > > throws SAXException; > > > > I like it, anyway. > >You know, it doesn't inhale (I was going to say 'suck', but I know >that my American cousins are still a little sensitive after the recent >impeachment trial). It's easy enough to parse the normalised content >model if you really need to: it would be all one string, with no >parameter entity references. Er... actually, I'd like it with PE's unprocessed. >Of course, people will rightly complain that the processor has already >done the work of parsing it. It's hard to know what to do here. 
Again, my plans for SAX involve keeping the data as uncooked as possible, partly for round trip reasons and partly because of the layered processing model I'd really like to demonstrate with standard parts. Maybe we need to add a 'cooked' or 'uncooked' option to tell the parser how we like our information. Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Thu Mar 25 21:05:33 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:32 2004 Subject: Is there anyone working on a binary version of XML? In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1> Message-ID: <199903252105.QAA18751@hesketh.net> At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote: >>I know, I know, this is anathema to what many of you feel is the >>essence of XML, and I agree to a point. > >It's not so much about feelings, as about contradicting the XML spec. > >[...] > >Applying XML concepts to a binary data format sounds interesting and >potentially useful, but it wouldn't be XML. One of these days I'd really love to stop talking about what is and isn't XML, though I know it's fun, and start talking about what we can do with XML and XML-like structures, whether they are SAX event flows, DOM trees, or binary formats that build on an XML foundation. We might even get some real work done - and it might even be fun. Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Thu Mar 25 21:08:13 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:32 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 In-Reply-To: <013901be7700$96dfb060$c8a8a8c0@thing1> References: <013901be7700$96dfb060$c8a8a8c0@thing1> Message-ID: * Bill la Forge | | Frankly, I would love to see the design process for MDSAX2 as open | as SAX. Then let's start it here once SAX2 is out the door. For me, that means when I've released the Python version of SAX2. If SAX2 doesn't provide all I want with regard to filters I'll be very interested in working on a design that does, for implementation in Python. --Lars M. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Thu Mar 25 21:10:07 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:10:32 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 In-Reply-To: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de> Message-ID: <000a01be7702$0e505c20$0100007f@eps.inso.com> > > Gavin summed it up quite well - the author used a CDATA Section and > > may have attached some semantic meaning to it (I know that several > > people disagree that CDATA sections can have semantic meaning; > > others think they can) so the DOM doesn't throw away that > > distinction, just in case. > > I'm having trouble imagining how a CDATA section can have > semantic meaning in all but the most abusive ways. (Hmmm, there's a CDATA > section. Fire up the pizza delivery DLL.) Could you give an example? Thanks. Not necessarily a semantic, but certainly an *intent*. The example given earlier was pretty good. You're writing a tutorial on HTML or some programming language, and as a convention, and as a convenience, you put all examples in CDATA sections. This makes it easy to edit *and* easy to extract your examples. Like Lauren, I am not saying that I think CDATA sections are necessary or not, simply that some people really do want them. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 21:35:18 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:32 2004 Subject: Is there anyone working on a binary version of XML? In-Reply-To: <199903252105.QAA18751@hesketh.net> References: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1> <199903252105.QAA18751@hesketh.net> Message-ID: <14074.43833.458137.700586@localhost.localdomain> Simon St.Laurent writes: > One of these days I'd really love to stop talking about what is and isn't > XML, though I know it's fun, and start talking about what we can do with > XML and XML-like structures, whether they are SAX event flows, DOM trees, > or binary formats that build on an XML foundation. > > We might even get some real work done - and it might even be fun. Nah, we're getting work done already -- we need to goof off once in a while. Here's my translation of the above paragraph: One of these days I'd really love to stop talking about what is and isn't XML, though I know it's fun, and start talking about what we can do with structured documents, whether they're in text format (as XML, HTML, SGML, etc.), in binary format, in databases, or available through abstract interfaces like SAX, the DOM, and Groves. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Thu Mar 25 21:36:08 1999 From: elharo at metalab.unc.edu (Elliotte Harold - java FAQ) Date: Mon Jun 7 17:10:32 2004 Subject: SAX2: DTDDeclHandler (minimalist position) In-Reply-To: <14074.17776.784121.47587@localhost.localdomain> Message-ID: I haven't been paying too much attention to SAX, but today I was sitting in my office waiting for students to drop by. They never do, except right before and after exams, so I was a little bored and started reading threads I'd normally filter, and I noticed something: > > public interface DTDDeclHandler > { > public final static int ATTRIBUTE_DEFAULTED = 1; > public final static int ATTRIBUTE_IMPLIED = 2; > public final static int ATTRIBUTE_REQUIRED = 3; > public final static int ATTRIBUTE_FIXED = 4; > How committed are you to using integer constants? I know this is common, but it tends to lend itself to bad code. Some people prefer a solution like this: public class AttributeStatus { public final static AttributeStatus ATTRIBUTE_DEFAULTED = new AttributeStatus(); public final static AttributeStatus ATTRIBUTE_IMPLIED = new AttributeStatus(); public final static AttributeStatus ATTRIBUTE_FIXED = new AttributeStatus(); public final static AttributeStatus ATTRIBUTE_REQUIRED = new AttributeStatus(); private AttributeStatus() {} } This creates the four mnemonic constants you want and gives them a checkable type. New constants can't be created because of the private constructor. And there's no chance that anybody's going to write code like if (getAttributeStatus() == 1) { doSomething(); } Programmers are more or less forced to use the constants.
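The typesafe-constant pattern described above can be sketched as a complete, compilable example. This is only an illustration: the string labels, the toString() override, and the AttributeStatusDemo.describe() helper are assumptions added here for demonstration and are not part of any SAX2 draft.

```java
// Typesafe-constant sketch. The labels and toString() are illustrative
// additions; only the four constant names come from the proposal.
final class AttributeStatus {
    static final AttributeStatus ATTRIBUTE_DEFAULTED = new AttributeStatus("DEFAULTED");
    static final AttributeStatus ATTRIBUTE_IMPLIED = new AttributeStatus("IMPLIED");
    static final AttributeStatus ATTRIBUTE_REQUIRED = new AttributeStatus("REQUIRED");
    static final AttributeStatus ATTRIBUTE_FIXED = new AttributeStatus("FIXED");

    private final String label;

    // Private constructor: no instances can exist beyond the four above.
    private AttributeStatus(String label) {
        this.label = label;
    }

    @Override
    public String toString() {
        return label;
    }
}

// Hypothetical client code showing that callers must use the constants.
final class AttributeStatusDemo {
    private AttributeStatusDemo() {}

    static String describe(AttributeStatus status) {
        // Reference comparison is safe because each constant is a singleton.
        if (status == AttributeStatus.ATTRIBUTE_REQUIRED) return "must be specified";
        if (status == AttributeStatus.ATTRIBUTE_IMPLIED) return "may be omitted";
        if (status == AttributeStatus.ATTRIBUTE_FIXED) return "fixed value";
        return "has a default value";
    }

    public static void main(String[] args) {
        System.out.println(describe(AttributeStatus.ATTRIBUTE_FIXED));
        // describe(1) would be rejected by the compiler: an int is not an AttributeStatus.
    }
}
```

The cost is one extra class in the API, which is the trade-off against keeping the interface as small as possible.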
What do you think? -- Elliotte Rusty Harold elharo@metalab.unc.edu xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 25 21:47:36 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:32 2004 Subject: SAX2: DTDDeclHandler (minimalist position) In-Reply-To: References: <14074.17776.784121.47587@localhost.localdomain> Message-ID: <14074.44709.409452.525331@localhost.localdomain> Elliotte Harold - java FAQ writes: > How committed are you to using integer constants? I know this is common, > but it tends to lend itself to bad code. Some people prefer a solution > like this: > > public class AttributeStatus { > > public final static AttributeStatus ATTRIBUTE_DEFAULTED = > new AttributeStatus(); > public final static AttributeStatus ATTRIBUTE_IMPLIED = > new AttributeStatus(); > public final static AttributeStatus ATTRIBUTE_FIXED = > new AttributeStatus(); > public final static AttributeStatus ATTRIBUTE_REQUIRED = > new AttributeStatus(); > > private AttributeStatus() {} > > } Yes, I do this all the time in my own Java code (the lack of enum in Java is a serious design flaw, as I've been arguing for a few years now), but I'm strongly committed to keeping SAX as small and simple as possible. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Thu Mar 25 21:49:13 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:32 2004 Subject: Is there anyone working on a binary version of XML? References: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1> Message-ID: <36FAB727.905F4E2C@lig.net> Let me clarify. I want to create a new standard, very closely related to XML and tracking and dependant on it, but using "binary" data structures. What I really mean by "binary" is that instead of a stream of characters that have structured meaning after parsing and transformation to an internal datastructure, I want a data format that encodes an equivalent data structure directly. I have designed something that is directly usable in memory, as loaded, with a DOM interface, or SAX, etc., only much more efficiently than starting from XML. The design I have in mind was controlled by an optimization process with constraints of standard Java capabilities. I think a reasonable name for this project would be: bXML or XMLb (probably the latter). Thanks sdw "DuCharme, Robert" wrote: > >I know, I know, this is anathema to what many of you feel is the > >essence of XML, and I agree to a point. > > It's not so much about feelings, as about contradicting the XML spec. > > >From 1. Introduction (http://www.w3.org/TR/REC-xml#sec-intro): > > "XML documents are made up of storage units called entities, which > contain either parsed or unparsed data. Parsed data is made up of > characters, some of which form character data, and some of which form > markup." 
> > ("characters" there links to http://www.w3.org/TR/REC-xml#dt-character:) > > A parsed entity contains text, a sequence of characters, which may > represent markup or character data. A character is an atomic unit of > text as specified by ISO/IEC 10646 [ISO/IEC 10646]. Legal characters are > tab, carriage return, line feed, and the legal graphic characters of > Unicode and ISO/IEC 10646." > > Applying XML concepts to a binary data format sounds interesting and > potentially useful, but it wouldn't be XML. > > Bob DuCharme www.snee.com/bob snee.com> see www.snee.com/bob/xmlann for "XML: > The Annotated Specification" from Prentice Hall. > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) -- OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st 43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jonathan at texcel.no Thu Mar 25 21:51:23 1999 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 17:10:32 2004 Subject: Whence XQL? 
In-Reply-To: <000b01be7702$102babd0$0100007f@eps.inso.com> References: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> Message-ID: <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> At 02:19 PM 3/25/99 -0500, Gavin Thomas Nicol wrote: >I wouldn't bet my farm on that proposal. Folk at QL'98, both database >and IR, had serious issues with it. Frankly, I don't know of anything that has been proposed to the XML or web communities that hasn't found its critics. XQL has found both avid fans and strong critics. Since you make no specific technical claims here, it is hard to dismiss what you say with information, but perhaps I can make some broad statements that address what you are implying here. Both database and IR people made contact with me at QL'98, showing interest and appreciation, and we have been in active and enthusiastic correspondence ever since. XQL has been more widely implemented than any other XML query language (I just posted information on six implementations today), and it is closely related to XSL Patterns. The main criticism from database folks was that they wanted to see joins and transformations in XQL. Peter Fankhauser has proposed extensions to XQL for joins. Declarative transformations are, of course, very useful, but XSL can also be used for transformations. One of the big reasons for leaving joins and transformations out of the first version was to make implementation simple - which is why there are quite a few implementations of XQL. I suspect that there will be later versions of XQL that include at least joins; I'm less certain about declarative transformations, since XSL already exists and can do transformations, but I do really like declarative transformations. At least one IR person criticized XQL for doing too much, eg for having the parent/child relationship in addition to the ancestor/descendant relationship. 
This does, in fact, increase the complexity of implementation, but offers a distinction that I find important. The number of implementations of XQL shows that there's a fair amount of interest in it. People who have demonstrated it at trade shows send me email telling me how impressed people are - for instance, I have been getting email from Software AG, which is showing XQL at CeBIT this week and getting very enthusiastic responses. When I discuss XQL at trade shows, I get enthusiastic responses. So the fact that there are also critics doesn't bother me. If you want to implement a query language today, for reasonable effort, and you want to use a language that has been implemented in other software systems, I think XQL is a very good choice. There will be a W3C XML Query Language Activity, and it will develop its own query language, and nobody can say how similar or different it will be to any existing query language for XML. I'm sure there will be a lot of interesting and creative work done by the bright people who will be involved in that group - if you can afford to wait a year to implement a query language, then by all means wait for that language to be developed. Jonathan jonathan@texcel.no Texcel Research http://www.texcel.no xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stark at uplanet.com Thu Mar 25 21:58:07 1999 From: stark at uplanet.com (Peter Stark) Date: Mon Jun 7 17:10:33 2004 Subject: Is there anyone working on a binary version of XML? In-Reply-To: Message-ID: <000c01be770a$87141090$76c3c6c3@sluk.uplanet.com> Not only "attempts to define". 
The "binary XML" defined by the WAP Forum is a format for tokenized XML. It's supported by cellular phones with WAP browsers, e.g. http://www.nokia.com/phones/7110/index.html. Element and attribute names are replaced by binary values to make parsing cheaper in the client. It does, however, not support all XML features. For example, XML namespaces are not supported. You can read more about WAP at: http://www.uplanet.com/pub/111398_WAP_V1whitepaper.pdf Peter Stark > -----Original Message----- > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > Dan Brickley > Sent: Thursday, March 25, 1999 12:54 PM > To: 'xml-dev@ic.ac.uk' > Subject: RE: Is there anyone working on a binary version of XML? > > > > On Thu, 25 Mar 1999, DuCharme, Robert wrote: > > > >I know, I know, this is anathema to what many of you feel is the > > >essence of XML, and I agree to a point. > > > > It's not so much about feelings, as about contradicting the XML spec. > > Quite so. But there are still initiatives such as > http://www.wapforum.org/docs/technical.htm http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf which attempts to define a 'compact binary representation of XML'. (If going down that route, I'd rather have a compact binary representation of whatever it was that I'm representing in XML, rather than of the XML that I might've used as a textual representation of the data... But then that really wouldn't be XML.) Dan xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Thu Mar 25 22:02:12 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:33 2004 Subject: Is there anyone working on a binary version of XML? In-Reply-To: <14074.43833.458137.700586@localhost.localdomain> References: <199903252105.QAA18751@hesketh.net> <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1> <199903252105.QAA18751@hesketh.net> Message-ID: <199903252201.RAA19712@hesketh.net> At 04:35 PM 3/25/99 -0500, David Megginson wrote: >Simon St.Laurent writes: > > > One of these days I'd really love to stop talking about what is and isn't > > XML, though I know it's fun, and start talking about what we can do with > > XML and XML-like structures, whether they are SAX event flows, DOM trees, > > or binary formats that build on an XML foundation. > > > > We might even get some real work done - and it might even be fun. > >Nah, we're getting work done already -- we need to goof off once in a >while. > >Here's my translation of the above paragraph: > > One of these days I'd really love to stop talking about what is and > isn't XML, though I know it's fun, and start talking about what we > can do with structured documents, whether they're in text format (as > XML, HTML, SGML, etc.), in binary format, in databases, or available > through abstract interfaces like SAX, the DOM, and Groves. That translation's fine by me! I just don't want people getting shut down because their project "isn't XML". Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Thu Mar 25 22:03:03 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:33 2004 Subject: SAX2 RFD: LexicalHandler draft v.1.1 Message-ID: <017c01be770b$f7f189e0$c8a8a8c0@thing1> From: Lars Marius Garshol >| Frankly, I would love to see the design process for MDSAX2 as open >| as SAX. > >Then let's start it here once SAX2 is out the door. For me, that >means when I've released the Python version of SAX2. If SAX2 doesn't >provide all I want with regard to filters I'll be very interested in >working on a design that does, for implementation in Python. After April would be best for me. Till then I'm pretty tied up. I see almost all of the MDSAX interfaces being replaced by SAX2. And I assure you, I plan to drop the current requirement of having a setParser method from MDSAX2--a bad design decision on my part is what caused it, and that created quite a few problems in turn. Another problem was in the incompleteness of the AttributeList api. Hard to make extensions to. And no way to add/update attributes in a filter because of the incompleteness of the api. Typically you check for the use of a known implementation and if it isn't being used, replace the whole attribute list. And that really gets messy if you are also trying to include extensions on the attributes! One new problem for MDSAX being introduced by SAX2 is when parser events are being routed between subfilters. These subfilters may all need to be aware of application events, in contrast to a filter stack where application events are handled by successive filters. 
Bill

From sdw at lig.net Thu Mar 25 22:04:36 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
References: <199903252105.QAA18751@hesketh.net>
Message-ID: <36FABAAB.845C90DA@lig.net>

"Simon St.Laurent" wrote:

> At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote:
> >>I know, I know, this is anathema to what many of you feel is the
> >>essence of XML, and I agree to a point.
> >
> >It's not so much about feelings, as about contradicting the XML spec.
> >
> >[...]
> >
> >Applying XML concepts to a binary data format sounds interesting and
> >potentially useful, but it wouldn't be XML.
>
> One of these days I'd really love to stop talking about what is and isn't
> XML, though I know it's fun, and start talking about what we can do with
> XML and XML-like structures, whether they are SAX event flows, DOM trees,
> or binary formats that build on an XML foundation.
>
> We might even get some real work done - and it might even be fun.

I agree with the sentiment, Simon. I'm required (or am requiring myself) to get a lot of real work done very quickly in the next 6 months, hence my focus...

Semantically, I am talking about using XML. After parsing and creating a DOM tree or SAX events, you no longer have XML but a data structure semantically equivalent to an XML document.

Another way to think about what I'm proposing is that it is a cache of the data structures produced from processing an XML document, cast in an openly documented data structure that is already flattened and ready for IO.
In fact, this is how I arrived at this design after following a few other design constraints and observations. Of course from there it is a short stop to say that you can throw away the 'external' XML representation if you can recreate it from XMLb. My scheme makes parsing of XML a non-issue. If I only have that advantage within my closed system, so be it, converting to and from XML for external purposes is in fact what I intend to do. In my case, I'm architecting a high speed clustering system, primarily targeted at Linux/Unix and Java. In this kind of system of course you are splitting applications into many servers. Of course the communication between those nodes is really internal application communication, the equivalent of that DOM tree, so it makes sense to optimize it. Think of it this way, you'd seldom design a large app where every method needs to parse the XML text block passed to it to get a DOM tree (or SAX events) if the calling method has a DOM tree that it could just pass. sdw > Simon St.Laurent > XML: A Primer > Sharing Bandwidth / Cookies > http://www.simonstl.com > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) -- OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st 43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Thu Mar 25 22:05:18 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:33 2004 Subject: SAX2: DTDDeclHandler (minimalist position) In-Reply-To: <14074.17776.784121.47587@localhost.localdomain> References: <14074.17776.784121.47587@localhost.localdomain> Message-ID: * David Megginson | | I'm still shying away from reporting element-type declarations, at | least until someone shows me an easy and concise way of doing it (in | AElfred, I simply provided the content model as a fully-normalised | string). This is difficult in Java, mainly because of a gross deficiency in the language: the difficulties of representing general nested list structures in memory. Over-emphasis on objects has some ugly side-effects. I think this would be easier in C, even. (Arrays and unions should do it.) xmlproc uses Python lists and tuples to do this: and similarly easy solutions are easily imaginable in other scripting languages, as well as industrial-strength ones like Common Lisp. For Java I suppose the string solution is the most natural one. I don't think that approach will be chosen in the Python version, though. Also, if element declarations are included, I suppose notations should be, too. Shouldn't be very hard, and I think the benefits are great enough that both should be included. This should be enough to present a SAX 1.0-like view of DTDs, more or less without lexical information, and still be simple enough to warrant the name SAX. 
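As a rough illustration of the trade-off described above, here is one way a nested content model could be held in Java and flattened back into a normalised string of the kind David mentions. This is only a sketch: the `ContentParticle` class, its factory methods, and the example model are hypothetical, and nothing like them appears in SAX or AElfred.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical tree node for an element content model such as
// (title,(para|list)*,note?). A node is either a named element or a group.
final class ContentParticle {
    final String name;                    // element name, or null for a group
    final char separator;                 // ',' (sequence) or '|' (choice); unused for elements
    final String occurrence;              // "", "?", "*" or "+"
    final List<ContentParticle> children; // empty for elements

    private ContentParticle(String name, char separator, String occurrence,
                            List<ContentParticle> children) {
        this.name = name;
        this.separator = separator;
        this.occurrence = occurrence;
        this.children = children;
    }

    static ContentParticle element(String name, String occurrence) {
        return new ContentParticle(name, ' ', occurrence,
                Collections.<ContentParticle>emptyList());
    }

    static ContentParticle group(char separator, String occurrence,
                                 ContentParticle... children) {
        return new ContentParticle(null, separator, occurrence, Arrays.asList(children));
    }

    // Flattens the tree back into a whitespace-free content model string.
    public String toString() {
        if (name != null) {
            return name + occurrence;
        }
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(children.get(i));
        }
        return sb.append(')').append(occurrence).toString();
    }

    public static void main(String[] args) {
        ContentParticle model = group(',', "",
                element("title", ""),
                group('|', "*", element("para", ""), element("list", "")),
                element("note", "?"));
        System.out.println(model); // prints (title,(para|list)*,note?)
    }
}
```

The tree form needs an extra class and a few factory calls where a scripting language would use a nested list literal, which is roughly the overhead being complained about; the string form pushes that parsing work onto every application instead.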
| public interface DTDDeclHandler | { | public final static int ATTRIBUTE_DEFAULTED = 1; | public final static int ATTRIBUTE_IMPLIED = 2; | public final static int ATTRIBUTE_REQUIRED = 3; | public final static int ATTRIBUTE_FIXED = 4; | | public abstract void attributeDecl (String element, | String name, | String type, Here we need some convention for representing enumerations. "ENUMERATION" will probably do. :) | public abstract void externalEntityDecl (String name, | boolean isParameterEntity, | String publicId, | String systemId) | throws SAXException; I think it would be more natural to have separate callbacks for parameter entities. It makes the interface grow, but I think it is more intuitive to learn (the first look at the javadoc shows how it works, you don't have to study the parameters in detail to figure it out) and also more natural to use. | public abstract void internalEntityDecl (String name, | boolean isParameterEntity, | String value) | throws SAXException; Should value be named 'replacementText', just to make it clearer what it is? --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Thu Mar 25 22:10:08 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:33 2004 Subject: Whence XQL? References: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> Message-ID: <36FABBFE.9CB29BFA@lig.net> Could you please recommend an Open Source project that I could use as a base and contribute to? Java preferably, something with indexing of multiple documents would be great. 
Full-tag, full-text would be very interesting. I may be able to have one or more people work on this if it could have basic functionality shortly. I have some of my own ideas, of course.... sdw Jonathan Robie wrote: > At 02:19 PM 3/25/99 -0500, Gavin Thomas Nicol wrote: > > >I wouldn't bet my farm on that proposal. Folk at QL'98, both database > >and IR, had serious issues with it. > > Frankly, I don't know of anything that has been proposed to the XML or web > communities that hasn't found its critics. XQL has found both avid fans and > strong critics. Since you make no specific technical claims here, it is > hard to dismiss what you say with information, but perhaps I can make some > broad statements that address what you are implying here. > > Both database and IR people made contact with me at QL'98, showing interest > and appreciation, and we have been in active and enthusiastic > correspondence ever since. > > XQL has been more widely implemented than any other XML query language (I > just posted information on six implementations today), and it is closely > related to XSL Patterns. > > The main criticism from database folks was that they wanted to see joins > and transformations in XQL. Peter Fankhauser has proposed extensions to XQL > for joins. Declarative transformations are, of course, very useful, but XSL > can also be used for transformations. One of the big reasons for leaving > joins and transformations out of the first version was to make > implementation simple - which is why there are quite a few implementations > of XQL. I suspect that there will be later versions of XQL that include at > least joins; I'm less certain about declarative transformations, since XSL > already exists and can do transformations, but I do really like declarative > transformations. > > At least one IR person criticized XQL for doing too much, eg for having the > parent/child relationship in addition to the ancestor/descendant > relationship. 
This does, in fact, increase the complexity of > implementation, but offers a distinction that I find important. > > The number of implementations of XQL shows that there's a fair amount of > interest in it. People who have demonstrated it at trade shows send me > email telling me how impressed people are - for instance, I have been > getting email from Software AG, which is showing XQL at CeBIT this week and > getting very enthusiastic responses. When I discuss XQL at trade shows, I > get enthusiastic responses. So the fact that there are also critics doesn't > bother me. > > If you want to implement a query language today, for reasonable effort, and > you want to use a language that has been implemented in other software > systems, I think XQL is a very good choice. There will be a W3C XML Query > Language Activity, and it will develop its own query language, and nobody > can say how similar or different it will be to any existing query language > for XML. I'm sure there will be a lot of interesting and creative work done > by the bright people who will be involved in that group - if you can afford > to wait a year to implement a query language, then by all means wait for > that language to be developed. > > Jonathan > > jonathan@texcel.no > Texcel Research > http://www.texcel.no > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) -- OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st 43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jamesr at steptwo.com.au Thu Mar 25 22:22:32 1999 From: jamesr at steptwo.com.au (James Robertson) Date: Mon Jun 7 17:10:33 2004 Subject: XML and (K)Office In-Reply-To: <14074.8435.653789.348824@localhost.localdomain> References: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101> <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101> Message-ID: <4.1.19990326091444.00bac620@steptwo.com.au> At 21:55 25/03/1999 , David Megginson wrote: | [David] | | > > Anyway, let's get this right -- I think that it's healthy for | > > both Gnumeric and the KOffice Spreadsheet program both to exist, | > > but there is no excuse for them to use entirely incompatible | > > formats. As a matter of fact, if we could convince KDE and Gnome | > > to use compatible XML formats for lots of things (like interface | > > construction), the media's predictions of a Linux fracture will | > > be proven to be hot air. | | [Matt] | | > Although I agree to an extent, if they have different feature sets | > it's pretty unlikely that you're going to get an entirely perfect | > agreement on a spreadsheet DTD. | | I disagree *very* strongly -- with Namespaces, we can design a common | format for the 90% of functionality that the two spreadsheets actually | have in common (text cells, data cells, basic formulas, general | formatting information [font, alignment, colour, size], etc.) and | then allow each to provide extended information | unambiguously-delimited through the use of separate namespaces. | | The more material in the common spec, the better interoperability. | Linux needs to set an example here. 
Why do namespaces help us here? Relying on them:

* Breaks validation. We are no longer able to ensure that the files we are reading/creating are correct and useful.

* Still leaves the variations between applications, so that a reader of a given format still needs to know 100% of what that format is.

Without the rigour of a DTD, we've got nothing, particularly since this data may well live long and is not some transient "sent over the web" data. How will future users make sense of the format without a DTD?

| > However, that's the beauty of XML. Writing a converter from one
| > format to another is trivial in the extreme, so it's not a huge
| > issue in my (humble) opinion.
|
| For n XML-based formats, we need (n * (n - 1)) converters. If there
| are only two different XML-based spreadsheet formats, then we need
| only two converters:
|
|   a => b
|   b => a
|
| If there are three different XML-based formats, then we need six
| converters:
|
|   a => b
|   a => c
|   b => a
|   b => c
|   c => a
|   c => b

Again, having namespaces doesn't solve this problem. Regardless of what you call it, if the formats are different, they're different.

But anyway, this reasoning isn't necessarily true. What about:

  a => x
  b => x
  c => x
  x => a
  x => b
  x => c

That is, an intermediate DTD that captures all the usefully sharable data. For a successful example of this, see the Rainbow DTDs for word documents.

This greatly reduces the number of conversions as the number of formats increases.

Cheers,
James

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au
"Beyond the Idea"
ACN 081 019 623
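The two counting arguments in this message can be compared directly. The sketch below is only arithmetic on the figures quoted above: n * (n - 1) converters when every format converts to every other, against 2 * n when everything passes through one intermediate format. The `ConverterCount` class and its method names are made up for illustration.

```java
// Compares direct pairwise conversion with conversion through one hub format.
final class ConverterCount {
    // Every format needs a converter to each of the other n - 1 formats.
    static long pairwise(int n) {
        return (long) n * (n - 1);
    }

    // Every format needs one converter to the hub and one back from it.
    static long viaHub(int n) {
        return 2L * n;
    }

    public static void main(String[] args) {
        int[] formatCounts = {2, 3, 4, 10};
        for (int n : formatCounts) {
            System.out.println("n=" + n + " pairwise=" + pairwise(n) + " viaHub=" + viaHub(n));
        }
    }
}
```

With two formats the hub costs more (four converters against two), with three the counts are equal at six, and from four formats on the hub needs fewer. The hub approach also assumes the intermediate format can carry everything worth sharing, which is the condition James attaches to it.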
From jamesr at steptwo.com.au Thu Mar 25 22:29:22 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun 7 17:10:33 2004
Subject: XML and (K)Office
In-Reply-To: <199903251448.JAA00778@ruby.ora.com>
References: <199903242256.XAA04448@sonne.darmstadt.gmd.de>
Message-ID: <4.1.19990326092458.00babc10@steptwo.com.au>

At 00:48 26/03/1999 , Chris Maden wrote:

| [Ingo Macherius]
| > David Megginson wrote at 24 Mar 99, 17:02:
| >
| > > There's also a hot rumour [3] that Microsoft has assigned 37
| > > programmers to work on a Linux port of MS Office.
|
| Some quick research on Slashdot shows the rumor's evolution. The
| first sighting appears to be on ZDnet; they reported that Simson
| Garfinkel, a _Boston Globe_ columnist and technology writer, mentioned
| on a radio show that he was in correspondence with some of the
| developers. But even if that's true, I can think of a number of
| reasons why Microsoft might be doing a port internally with no
| intentions whatsoever of releasing it. The ZDnet article notes that
| Office relies heavily on MS's undocumented Win32 API calls, and just
| porting the app to the standard API calls which could then be handled
| in emulation on Linux would be a major chore. Some URLs:

Isn't this the standard strategy of MS at work?

That is, they leak "rumours" that they are about to release some wonderful new technology.

Everyone, on the basis of this, holds back on purchasing or obtaining competing (currently available) technology, on the understanding that MS will be releasing something soon, which will become the de facto standard.

Also known as "vapourware".
Just look at their attempts to derail NDS (Novell Directory Services) using the same strategies. Just some paranoia in the morning, James ------------------------- James Robertson Step Two Designs Pty Ltd SGML, XML & HTML Consultancy http://www.steptwo.com.au/ jamesr@steptwo.com.au "Beyond the Idea" ACN 081 019 623 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Thu Mar 25 22:30:56 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:10:33 2004 Subject: Is there anyone working on a binary version of XML? Message-ID: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> >Let me clarify. I want to create a new standard, very closely related to XML and tracking and >dependant on it, but using "binary" data structures. Do you mean like MIME? MIME itself isn't related to XML but does deal with binary data in a standard fashion, and has the advantage of very widespread implementation. It is possible to make MIME work nicely with XML e.g. XMTP (see http://jabr.ne.mediaone.net/documents/xmtp.htm ) What I really mean by "binary" is that >instead of a stream of characters that have structured meaning after parsing and >transformation to an internal datastructure, I want a data format that encodes an equivalent >data structure directly. I have designed something that is directly usable in memory, as >loaded, with a DOM interface, or SAX, etc., only much more efficiently than starting from >XML. The design I have in mind was controlled by an optimization process with constraints of >standard Java capabilities. One option would be to define standard interfaces for MIME data. 
Using property sets and groves it might be possible to define a generic DOM for MIME. Specific property sets can be developed for arbitrary binary notation types, and such groves would be accessible via interfaces a la the DOM.

Jonathan Borden
http://jabr.ne.mediaone.net

From sblackbu at erols.com Thu Mar 25 22:48:18 1999
From: sblackbu at erols.com (Samuel R. Blackburn)
Date: Mon Jun 7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <002801be7711$6f8f3c40$0100a8c0@sammy>

XML in its current form cannot handle "binary" data at all. At best, you would have to convert non-text data to text. This is usually done via base64.

You could create your own version of XML that could easily handle non-text data. All you need do is add one attribute to any XML element that provides the length (in bytes) of the non-text data. For example:

GIF89a[4090 non-text bytes]

The "bin:length" attribute could tell your parser to stop parsing and store the 4096 bytes following the closing > of the element. After the 4096 bytes have been stored, start parsing again.

The down side of this approach is:

1) bin:length would have to be agreed on by all binary parsers out there
2) binary XML files cannot be parsed by non-binary aware parsers (in other words, every parser in the world today)

HTH,

Sam

-----Original Message-----
From: Stephen D. Williams
To: xml-dev@ic.ac.uk
Date: Thursday, March 25, 1999 2:04 PM
Subject: Is there anyone working on a binary version of XML?
>I know, I know, this is anathema to what many of you feel is the essence of
>XML, and I agree to a point.
>I have come to feel however that there is room for a "works-as-if" binary
>analogue to text based XML. Something that is totally subservient to the
>standard and has exactly equivalent features, but that is highly efficient
>for processing at all levels and easily converted to and from text based
>XML.
>
>In using XML in real-world application work and designing future
>infrastructure that is highly scalable and efficient while making use of
>XML, I have come to the conclusion that I need a standard way to deal with
>an XML analogue that is binary. There are a multitude of performance
>problems that this solves, not only in parsing and exporting, but processing
>of related data inside applications.
>
>Before I make all the details and ideas public, I would like to know if
>there is any serious precedent directly dealing with XML.
>
>My design has highly efficient Java processing in mind, but is not specific
>to any particular language.
>Compression is a secondary, but associated issue.
>
>Thanks
>sdw
>--
>OptimaLogic - Finding Optimal Solutions
>Web/Crypto/OO/Unix/Comm/Video/DBMS
>sdw@lig.net Stephen D. Williams Senior Consultant/Architect
>http://sdw.st
>43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
>5Jan1999

From sdw at lig.net Thu Mar 25 23:00:11 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <36FAC7C1.63406FAD@lig.net>

"Simon St.Laurent" wrote:

> At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote:
> >>I know, I know, this is anathema to what many of you feel is the
> >>essence of XML, and I agree to a point.
> >
> >It's not so much about feelings, as about contradicting the XML spec.
> >
> >[...]
> >
> >Applying XML concepts to a binary data format sounds interesting and
> >potentially useful, but it wouldn't be XML.
>
> One of these days I'd really love to stop talking about what is and isn't
> XML, though I know it's fun, and start talking about what we can do with
> XML and XML-like structures, whether they are SAX event flows, DOM trees,
> or binary formats that build on an XML foundation.
>
> We might even get some real work done - and it might even be fun.

I agree with the sentiment, Simon. I'm required (or am requiring myself) to get a lot of real work done very quickly in the next 6 months, hence my focus...

Semantically, I am talking about using XML. After parsing and creating a DOM tree or SAX events, you no longer have XML but a data structure semantically equivalent to an XML document.

Another way to think about what I'm proposing is that it is a cache of the data structures produced from processing an XML document, cast in an openly documented data structure that is already flattened and ready for IO.
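
As a rough illustration of the "flattened and ready for IO" idea (a sketch only, not the XMLb design, whose details are not given in this thread): the Python below parses a small document once, stores each element as a fixed-size record of integer indexes into a string table, and then reads elements back by offset with no XML parsing. The record layout, the names flatten and node_at, and the sample document are all assumptions made for the example.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record layout (an assumption, not the XMLb design):
# tag index, parent record index (-1 for the root), text index (-1 if none).
RECORD = struct.Struct("<iii")


def flatten(xml_text):
    """Parse XML once; return (records_bytes, string_table)."""
    strings = []          # string table: tags and text, deduplicated
    index = {}            # string -> position in the table
    records = bytearray()

    def intern(s):
        if s not in index:
            index[s] = len(strings)
            strings.append(s)
        return index[s]

    def walk(elem, parent):
        me = len(records) // RECORD.size   # this element's record number
        text = (elem.text or "").strip()
        records.extend(RECORD.pack(intern(elem.tag), parent,
                                   intern(text) if text else -1))
        for child in elem:
            walk(child, me)

    walk(ET.fromstring(xml_text), -1)
    return bytes(records), strings


def node_at(records, strings, i):
    """Read record i directly by offset; no XML parsing involved."""
    tag, parent, text = RECORD.unpack_from(records, i * RECORD.size)
    return strings[tag], parent, (strings[text] if text >= 0 else None)


doc = "<order><item>pen</item><item>ink</item></order>"
records, strings = flatten(doc)
print(node_at(records, strings, 2))
```

The records buffer and the string table could be written to a file or socket as they are; a receiver with the same layout would call node_at without ever seeing angle brackets. Attributes, mixed content and namespaces are left out to keep the sketch short.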
In fact, this is how I arrived at this design after following a few other design constraints and observations. Of course, from there it is a short step to say that you can throw away the 'external' XML representation if you can recreate it from XMLb.

My scheme makes parsing of XML a non-issue. If I only have that advantage within my closed system, so be it; converting to and from XML for external purposes is in fact what I intend to do.

In my case, I'm architecting a high speed clustering system, primarily targeted at Linux/Unix and Java. In this kind of system, of course, you are splitting applications into many servers. Of course the communication between those nodes is really internal application communication, the equivalent of that DOM tree, so it makes sense to optimize it. Think of it this way: you'd seldom design a large app where every method needs to parse the XML text block passed to it to get a DOM tree (or SAX events) if the calling method has a DOM tree that it could just pass.

sdw

> Simon St.Laurent
> XML: A Primer
> Sharing Bandwidth / Cookies
> http://www.simonstl.com

--
OptimaLogic - Finding Optimal Solutions
Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net Stephen D. Williams Senior Consultant/Architect
http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
5Jan1999

From sdw at lig.net Thu Mar 25 23:06:57 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse>
Message-ID: <36FAC962.D9F47DB6@lig.net>

No, not really a solution to the problem set I'm trying to solve. Keep reading...

I'm not trying to create a new way to recognize XML, but a more efficient way to do all kinds of computer processing and communication with it. Inordinate amounts of time, money, effort, CPU, and bandwidth are spent at the interfaces between programs and other programs, databases, file systems, networks, servers, etc. XML is a good general solution, but some situations require optimization, which is what I'm working on.

sdw

Jonathan Borden wrote:

> >Let me clarify. I want to create a new standard, very closely related to XML and tracking and
> >dependent on it, but using "binary" data structures.
>
> Do you mean like MIME? MIME itself isn't related to XML but does deal
> with binary data in a standard fashion, and has the advantage of very
> widespread implementation. It is possible to make MIME work nicely with XML
> e.g. XMTP (see http://jabr.ne.mediaone.net/documents/xmtp.htm )
>
> >What I really mean by "binary" is that
> >instead of a stream of characters that have structured meaning after parsing and
> >transformation to an internal data structure, I want a data format that encodes an equivalent
> >data structure directly. I have designed something that is directly usable in memory, as
> >loaded, with a DOM interface, or SAX, etc., only much more efficiently than starting from
> >XML. The design I have in mind was controlled by an optimization process with constraints of
> >standard Java capabilities.
>
> One option would be to define standard interfaces for MIME data. Using
> property sets and groves it might be possible to define a generic DOM for
> MIME. Specific property sets can be developed for arbitrary binary notation
> types and such groves would be accessible via interfaces a la the DOM.
>
> Jonathan Borden
> http://jabr.ne.mediaone.net

--
OptimaLogic - Finding Optimal Solutions
Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net Stephen D. Williams Senior Consultant/Architect
http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
5Jan1999

From sdw at lig.net Thu Mar 25 23:08:03 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <000c01be770a$87141090$76c3c6c3@sluk.uplanet.com>
Message-ID: <36FAC986.7E2A7AF8@lig.net>

I'm already taking a look at it, but it doesn't completely address what I'm getting at. I may be able to branch from it.

sdw

Peter Stark wrote:

> Not only "attempts to define".
>
> The "binary XML" defined by the WAP Forum is a format for tokenized XML.
> It's supported by cellular phones with WAP browsers, e.g.
> http://www.nokia.com/phones/7110/index.html. Element and attribute names are
> replaced by binary values to make parsing cheaper in the client. It does,
> however, not support all XML features. For example, XML namespaces are not
> supported.
>
> You can read more about WAP at:
> http://www.uplanet.com/pub/111398_WAP_V1whitepaper.pdf
>
> Peter Stark
>
> > -----Original Message-----
> > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> > Dan Brickley
> > Sent: Thursday, March 25, 1999 12:54 PM
> > To: 'xml-dev@ic.ac.uk'
> > Subject: RE: Is there anyone working on a binary version of XML?
> >
> > On Thu, 25 Mar 1999, DuCharme, Robert wrote:
> >
> > > >I know, I know, this is anathema to what many of you feel is the
> > > >essence of XML, and I agree to a point.
> > >
> > > It's not so much about feelings, as about contradicting the XML spec.
> >
> > Quite so. But there are still initiatives such as
> >
> http://www.wapforum.org/docs/technical.htm
> http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf
>
> which attempts to define a 'compact binary representation of XML'.
> (If going down that route, I'd rather have a compact binary
> representation of whatever it was that I'm representing in XML, rather
> than of the XML that I might've used as a textual representation of the
> data... But then that really wouldn't be XML.)
>
> Dan

--
OptimaLogic - Finding Optimal Solutions
Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net Stephen D. Williams Senior Consultant/Architect
http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
5Jan1999

From sdw at lig.net Thu Mar 25 23:12:54 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <002801be7711$6f8f3c40$0100a8c0@sammy>
Message-ID: <36FACA9B.4D7CD9B2@lig.net>

This is not what I meant. XML has mechanisms to store binary data as characters using all the standard methods.

What I'm talking about is using data that is structured in a directly addressable way (think pointers, arrays, indexes, offsets) to represent the structure and content of an XML tree. My actual proposal is a bit more complicated than that because I want other types of optimizations for in-memory processing, but that is one of the roots of the idea.

In other words, after loading, the tree would be directly addressable (SAX or DOM) without any parsing (or with very limited steps). A typical server might in fact support both XML and XMLb queries and responses.

sdw

"Samuel R. Blackburn" wrote:

> XML in its current form cannot handle "binary" data at all.
> At best, you would have to convert non-text data to text.
> This is usually done via base64.
>
> You could create your own version of XML that could easily
> handle non-text data. All you need do is add one attribute to
> any XML element that provides the length (in bytes) of the
> non-text data. For example:
>
> GIF89a[4090 non-text bytes]
>
> The "bin:length" attribute could tell your parser to stop parsing
> and store the 4096 bytes following the closing > of the element.
> After the 4096 bytes have been stored, start parsing again.
>
> The down side of this approach is:
>
> 1) bin:length would have to be agreed on by all binary parsers out there
> 2) binary XML files cannot be parsed by non-binary aware parsers
> (in other words, every parser in the world today)
>
> HTH,
>
> Sam
>
> -----Original Message-----
> From: Stephen D. Williams
> To: xml-dev@ic.ac.uk
> Date: Thursday, March 25, 1999 2:04 PM
> Subject: Is there anyone working on a binary version of XML?
>
> >I know, I know, this is anathema to what many of you feel is the essence of
> >XML, and I agree to a point.
> >I have come to feel however that there is room for a "works-as-if" binary
> >analogue to text based XML. Something that is totally subservient to the
> >standard and has exactly equivalent features, but that is highly efficient
> >for processing at all levels and easily converted to and from text based
> >XML.
> >
> >In using XML in real-world application work and designing future
> >infrastructure that is highly scalable and efficient while making use of
> >XML, I have come to the conclusion that I need a standard way to deal with
> >an XML analogue that is binary. There are a multitude of performance
> >problems that this solves, not only in parsing and exporting, but processing
> >of related data inside applications.
> >
> >Before I make all the details and ideas public, I would like to know if
> >there is any serious precedent directly dealing with XML.
> >
> >My design has highly efficient Java processing in mind, but is not specific
> >to any particular language.
> >Compression is a secondary, but associated issue.
> >
> >Thanks
> >sdw
> >--
> >OptimaLogic - Finding Optimal Solutions
> >Web/Crypto/OO/Unix/Comm/Video/DBMS
> >sdw@lig.net Stephen D. Williams Senior Consultant/Architect
> >http://sdw.st
> >43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
> >5Jan1999

--
OptimaLogic - Finding Optimal Solutions
Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net Stephen D. Williams Senior Consultant/Architect
http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
5Jan1999

From paul at prescod.net Thu Mar 25 23:41:36 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net>
Message-ID: <36FAC72C.2B911970@prescod.net>

"Stephen D. Williams" wrote:
> I'm not trying to create a new way to recognize XML, but a more efficient way to
> do all kinds of computer processing and communication with it. Inordinate
> amounts of time, money, effort, CPU, and bandwidth are spent at the interfaces
> between programs and other programs, databases, file systems, networks, servers,
> etc. XML is a good general solution, but some situations require optimization
> which is what I'm working on.

I can see many ways that a typical XML document could be optimized for size if XML compatibility was not a concern. Call it "compressed ML."
I am not clear, however, why CompressedML would need to be binary. There are many languages where working with binary data is more expensive than working with text.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From jborden at mediaone.net Fri Mar 26 00:05:45 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <002101be771b$583233e0$0b2e249b@fileroom.Synapse>

Stephen D. Williams wrote:
>
>I'm not trying to create a new way to recognize XML, but a more efficient way to
>do all kinds of computer processing and communication with it. Inordinate
>amounts of time, money, effort, CPU, and bandwidth are spent at the interfaces
>between programs and other programs, databases, file systems, networks, servers,
>etc. XML is a good general solution, but some situations require optimization
>which is what I'm working on.
>

Are you talking about using XML, which is text, or binary data, which isn't? XML itself isn't an interface; rather, it can be used as a data format to develop interfaces. The DOM is an interface onto XML documents which is modelled after ... hmmm .... what the XML property set *would* generate if it existed. This is the XML 'grove'.
If you are talking about generating an API or interface onto binary data which is similar to the DOM, I suggest that the grove representation would be the most reasonable. The binary data format's property set (if such exists) would be used to generate a DOM-like interface onto the binary data. This deals with the interface issues on binary data formats. MIME deals with standardized serialization issues on binary (and other) data formats.

Perhaps you are proposing a binary data format which is in some fashion similar to XML? Such a data format would have its own property set, grove and set of interfaces. It would not be XML. I am suggesting that the use of property sets, the grove formalism and generated interfaces would be the most logical mechanism to develop a system designed to be similar to XML yet deal with binary data formats.

Jonathan Borden
http://jabr.ne.mediaone.net

From Curt.Arnold at hyprotech.com Fri Mar 26 00:06:37 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun 7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <61DAD58E8F4ED211AC8400A0C9B468731AAC87@THOR>

I guess the key thing is what you are trying to communicate. If you are primarily dealing with textual information, then the only transform that would seem to make sense is compression or encryption (depending on whether you were trying to reduce required bandwidth/diskspace at the expense of processing or trying to hide information). The event based parsers (such as expat) can chew through large files at blinding speed.
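
For a concrete picture of the event-based style mentioned above, here is a minimal sketch using the expat bindings in Python's standard library (xml.parsers.expat). It counts elements and character data as the parser streams through the input and builds no tree at all. The sample document and the count_events name are made up for the illustration, and nothing here measures speed.

```python
import xml.parsers.expat


def count_events(xml_bytes):
    """Stream through XML with expat, counting events; no tree is built."""
    counts = {"elements": 0, "chars": 0}

    def start(name, attrs):
        # Called once per start tag.
        counts["elements"] += 1

    def chars(data):
        # May be called several times per text run; only the total is kept.
        counts["chars"] += len(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.Parse(xml_bytes, True)   # True: this is the final chunk of input
    return counts


sample = b"<log>" + b"<entry>ok</entry>" * 1000 + b"</log>"
print(count_events(sample))
```

The handlers keep only two integers, so memory use does not grow with document size; a DOM-style parser would instead allocate a node and strings for every one of the entries.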
The relative slowness of the DOM based parsers is primarily due to the expense of string allocation, and that would not be eliminated if you simply changed the media. Neither of those requires anything new from the XML world.

If you were trying to communicate something for which a textual representation is not comprehensible (say a JPEG image), then trying to use XML at all is just a poor decision.

The one domain where a binary XML seems useful is when the bulk of the content is numeric (especially floating point). In those cases, you would like to be able (in some circumstances) to transmit floating point numbers without the loss of precision that comes with a conversion to and from text. For this to work, you would need a persistence framework that took typed information and, depending on the archive object that you passed, would create either a textual XML file or a binary analogue.

My approach to storage was to expand the Microsoft Property Storage mechanism by generating CRCs for the tag and attribute names to generate the Property Identifier (32-bit int) and then representing the content in the appropriate variant (numerics as IEEE format, text in Unicode).

From skshirsa at nortelnetworks.com Fri Mar 26 00:11:45 1999
From: skshirsa at nortelnetworks.com (Shekhar Kshirsager)
Date: Mon Jun 7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net> <36FAC72C.2B911970@prescod.net>
Message-ID: <005501be771c$c8afb2e0$a6ab20c0@engeast.baynetworks.com>

There are two places where use of XML can be optimized: one when it is transferred on the wire, and the second when the program tries to interpret the XML data using SAX, DOM etc.
My interpretation is that Stephen is talking about optimizing the process of interpreting the XML document at the client.
But I'm still not sure what the in-memory representation of bXML will buy us over DOM.

Thanks,
Shekhar Kshirsagar

----- Original Message -----
From: Paul Prescod
To:
Sent: Thursday, March 25, 1999 6:30 PM
Subject: Re: Is there anyone working on a binary version of XML?

> "Stephen D. Williams" wrote:
> > I'm not trying to create a new way to recognize XML, but a more efficient way to
> > do all kinds of computer processing and communication with it. Inordinate
> > amounts of time, money, effort, CPU, and bandwidth are spent at the interfaces
> > between programs and other programs, databases, file systems, networks, servers,
> > etc. XML is a good general solution, but some situations require optimization
> > which is what I'm working on.
>
> I can see many ways that a typical XML document could be optimized for
> size if XML compatibility was not a concern. Call it "compressed ML." I am
> not clear, however, why CompressedML would need to be binary. There are
> many languages where working with binary data is more expensive than
> working with text.
>
> --
> Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
> http://itrc.uwaterloo.ca/~papresco
>
> "Perpetually obsolescing and thus losing all data and programs every 10
> years (the current pattern) is no way to run an information economy or
> a civilization." - Stewart Brand, founder of the Whole Earth Catalog
> http://www.wired.com/news/news/culture/story/10124.html

From jtauber at jtauber.com Fri Mar 26 01:33:41 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1> <36FAB727.905F4E2C@lig.net>
Message-ID: <006f01be7717$4c809180$0300000a@cygnus.uwa.edu.au>

Have a look at http://www.wapforum.org/ which includes a draft document specifying a compact binary representation of XML documents.

James

From andrewl at microsoft.com Fri Mar 26 01:57:36 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:10:34 2004
Subject: XML-QL (was Re: Whence XQL?)
 (Ok Whither XQL, Dave)
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF1DE@RED-MSG-08>

Information on Microsoft's support of various XML technologies is available at http://www.microsoft.com/xml

From sdw at lig.net Fri Mar 26 02:16:37 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net> <36FAC72C.2B911970@prescod.net> <005501be771c$c8afb2e0$a6ab20c0@engeast.baynetworks.com>
Message-ID: <36FAEE03.E435EB6D@lig.net>

Imagine that you have all the features of XML: structure, flexibility, common format for interchange, but that you perform zero processing steps to import or export the 'document' from a program. (Actually, I'm thinking this would be done in chunks, but essentially very few reads and writes.)

Also imagine that when taking an XML 'document' into a program you could search, modify, or copy the object without generating thousands of object creates and deletes/garbage collection hits. I call this last problem a 'malloc storm' and it appears to be one of the worst problems with a lot of large Java systems. (I've experienced this problem in C++ programs for years and Java for the last year.) Among other things, I'm directly addressing this issue.

Then imagine you can write or communicate the object to other systems simply with IO operations with no processing involved.
Then imagine that the IO is async and very cheap and that you are processing thousands of transactions per second, most of which generate fundamentally few processing steps.

I am and will be, necessarily, revealing some very hard won lessons in optimizing very large systems as part of my design for this. I just feel strongly that this step is inevitable at some point and I want to have the most useful form of it become standard. As I mentioned, it should work particularly well with Java's available capabilities, but should be easily usable by C/C++, etc.

There are several things that could be optimized here: CPU in application processing, CPU in overhead, CPU in preparation for IO, size of data in memory, size of data in storage, size of data in transit, etc. This method would primarily allow a drastic decrease in CPU for most situations and a slight decrease in storage, with an easy path to more comprehensive compression levels.

I'm going to be studying the existing binary effort and then releasing a few notes on details of what I'm thinking. I'll try to get a Java prototype working soon. It appears that the best path is to use SAX to generate bXML that will have either a SAX or DOM interface.

Note that the payload data in bXML would still be the same character data that would be in character areas of a normal XML document (possibly without canonicalizing translations). When mentioning 'binary', I simply meant that the structure would be represented by 'binary' data structures of where to find elements, etc. In fact it's possible to do this all in ASCII/Unicode if one desired. The point is that bXML is not designed to be editable by a text editor since it has more of a 'structured' layout, sort of like a filesystem.

One other subject that I haven't mentioned, but need for another architecture that I designed a while ago, is a mechanism for 'parallel inheritance' overlay tree processing. Has anyone else worked on this?
The idea is to have one or more base trees and work with a delta tree which represents changes from the underlying trees. This last part is a basic data structure for a rule engine and metadata application environment I designed last year. I don't mean to be distracting from external XML issues and standards, however XML is close to being perfect for using for protocols, API's, message systems, RPC, etc. vs. DCOM, Corba (hopefully this can be resolved), etc. Web-XML was a good example of this. It turns out that for message passing systems in a cluster, you really need to externalize the kinds of optimizations I'm talking about, vs. something normally internal to a particular SAX/DOM parser. sdw Shekhar Kshirsager wrote: > There are two places where use of XML can be optimized - one when it is > transfered on the wire and second when the program tries to interpret the > XML data using SAX,DOM etc. > My interpretation is that Stephen is talking about optimizing the process of > interpreting the XML document at the client. > But I'm still not sure, what will the in-memory presentation of bXML buy us > above DOM. > > Thanks, > Shekhar Kshirsagar > > ----- Original Message ----- > From: Paul Prescod > To: > Sent: Thursday, March 25, 1999 6:30 PM > Subject: Re: Is there anyone working on a binary version of XML? > > > "Stephen D. Williams" wrote: > > > I'm not trying to create a new way to recognize XML, but a more > efficient way to > > > do all kinds of computer processing and communication with it. > Innordinate > > > amounts of time, money, effort, CPU, and bandwidth are spent at the > interfaces > > > between programs and other programs, databases, file systems, networks, > servers, > > > etc. XML is a good general solution, but some situations require > optimization > > > which is what I'm working on. > > > > I can see many ways that a typical XML document could be optimized for > > size if XML compatibility was not a concern. Call it "compressed ML." 
I am
> > not clear, however, why CompressedML would need to be binary. There are
> > many languages where working with binary data is more expensive than
> > working with text.
> >
> > --
> > Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
> > http://itrc.uwaterloo.ca/~papresco
> >
> > "Perpetually obsolescing and thus losing all data and programs every 10
> > years (the current pattern) is no way to run an information economy or
> > a civilization." - Stewart Brand, founder of the Whole Earth Catalog
> > http://www.wired.com/news/news/culture/story/10124.html

From jonathan at texcel.no Fri Mar 26 02:20:39 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun 7 17:10:34 2004
Subject: XML-QL (was Re: Whence XQL?) (Ok Whither XQL, Dave)
In-Reply-To: <30649320C177D111ADEC00A024E9F297169FC0@exchange-server.dega.com>
Message-ID: <3.0.3.32.19990325212134.00c48100@pop.mindspring.com>

At 11:01 AM 3/25/99 -0800, Ed Howland wrote:
>BTW, how do you get at the XQL part of IE5? I never saw that in MS's
>writeup. Or is it just extensions to their XSL?

See the following URL for details:

http://www.microsoft.com/workshop/xml/xmldom/scriptref/XMLDOMNode_selectNodes.asp

Their documentation calls it "XSL Patterns", but it supports the XQL from the paper I wrote jointly with webMethods and Microsoft. Here's their documentation of the patterns they support, with a reference to the XQL paper:

http://www.microsoft.com/workshop/xml/xsl/reference/XSLPatternSyntax.asp

Hope this helps!

Jonathan

jonathan@texcel.no
Texcel Research
http://www.texcel.no

From sdw at lig.net Fri Mar 26 02:29:13 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:10:34 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
References: <001501be532c$ffeed060$d3228018@jabr.ne.mediaone.net> <370CF210.759B26AB@prescod.net>
Message-ID: <36FAF0FA.123C1F40@lig.net>

I really believe, at the moment, that my ultimate database would have XML-based tables and allow relational (possibly even SQL) queries along with XQL/XML-QL structured queries. Normalization could take several forms and allow non-normalized or threshold normalized forms.

Of course the SQL-XML view on things would not have full XML structuring capabilities, but it would provide a great bridge and in fact solve quite a few annoying problems that RDBMS's and OODBMS's haven't really solved satisfactorily: schema migration and flexible schemas.

I have some ideas, but I'm going through some references to see what I'm missing right now.

sdw

Paul Prescod wrote:

> "Borden, Jonathan" wrote:
> >
> > > 3. Therefore we should pretend that relational databases are really DOM
> > > trees.
> >
> > no. if the data is tabular then use a recordset. in the specific cases when
> > 1) we are storing data which is naturally hierarchical. 2) when the data
> > needs to interface with systems which for other reasons employ DOM
> > interfaces
>
> Okay. We can probably all agree with this. If you have software that is
> expecting a DOM and you need to connect it to data that is not XML, you
> need to build a DOM interface. This is a different point of view from
> those who say: "let's build new client software using only the DOM served
> by data with only a DOM interface. The fact that the DOM is standardized
> will just make all of my interoperability problems go away." No way. If
> your client software and your server software had an impedance mismatch,
> slapping a DOM interface on both sides makes it *worse* not better.
>
> > e.g.
my XSL processor is built on a DOM interface and I wish to
> > query the database using XQL (which happens to be built into my XSL
> > processor in this example), it is more convenient to interface to the data
> > using DOM interfaces than it is using recordsets (i.e. tabular data).
>
> It's more convenient but it's probably going to run as slow as hell.
> Nobody implements SQL or OQL on top of an industry-standard interface.
> They put it right in the core engine of their database.
>
> > Arguably, when using an ODBMS this example would be more straightforward
> > (but you picked RDBMS). The problem is that there is no standard, language
> > independent interface onto ODBMS's.
>
> ********** Yes there is! *************
>
> It isn't as widely hyped as XML/DOM. I haven't written a book about it
> (and hardly has anyone else). But the standards *do* exist. Check
> http://www.odmg.org. There are well defined APIs, bindings in a few
> languages, a solid object model and a query language. It's all in there.
>
> My fear is that these technologies will get lost in the XML hype.
>
> > The DOM, while not the perfect interface
> > *is* standard, and this is the big utility.
>
> The DOM is a standard for accessing XML, HTML and CSS information. It
> isn't for modelling arbitrary business objects. It wasn't designed for
> that and it isn't good at that.
>
> > For example, I get to say (using 'extended DOM'):
> >
> > NodeList anotherSet = airplanes.selectNodes("airplane[@color='red' and
> > .//screw/thread/@pitch = 64]");
> >
> > to select all red airplanes with screws having a pitch=64...
>
> The DOM is doing essentially nothing here. This imaginary XML query
> language is doing all of the work. But even the XML query language is
> going to make solving your problem harder than OQL would. For instance OQL
> can be statically type checked. XQL cannot, in general, for many subtle
> reasons. OQL can handle mathematical range constraints.
OQL has a concept
> of a "stored query" that allows some level of abstraction. OQL has "local
> variables" also for abstraction.
>
> I don't completely follow your examples:
>
> > XMOP for example (http://jabr.ne.mediaone.net/documents/xmop.htm) is a way
> > to serialize arbitrary COM objects using their typeinfo metadata. XMOP is a
> > layer that can persist objects into either a) a stream (serialization) b)
> > direct-to-DOM. When I attempted to design a direct-to-Recordset persistence
> > interface on XMOP I found that I had to essentially develop a
> > DOM<->Relational mapping. This is because arbitrary objects can be modelled
> > in a hierarchical fashion (e.g. serialized to XML).
>
> This seems like a serialization problem. We all agree that XML is great
> for serialization. If your only goal was to get the data into a "database
> of some kind" then an OO database would have been easier than an XML
> database.
>
> > In another example, using the medical imaging DICOM protocol (a complex
> > property based protocol) I have developed a mapping to the Microsoft
> > PropertySet format (used with Index Server). This mapping is not clean (at
> > all given the inability to represent certain DICOM structures as
> > PROPVARIANTs). This causes similar problems in mapping the protocol to a
> > relational database (the workaround is to use binary data). Using XML and
> > the DOM was a piece of cake to solve this difficult problem.
>
> I'm not at all clear on how the DOM solved this impedance mismatch.
>
> --
> Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
> http://itrc.uwaterloo.ca/~papresco
>
> "Remember, Ginger Rogers did everything that Fred Astaire did,
> but she did it backwards and in high heels."
> --Faith Whittlesey

From marcelo at mds.rmit.edu.au Fri Mar 26 02:32:45 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun 7 17:10:34 2004
Subject: Whence XQL?
In-Reply-To: <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>; from Jonathan Robie on Thu, Mar 25, 1999 at 04:52:17PM -0500
References: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> <000b01be7702$102babd0$0100007f@eps.inso.com> <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>
Message-ID: <19990326133124.B7318@io.mds.rmit.edu.au>

On Thu, Mar 25, 1999 at 04:52:17PM -0500, Jonathan Robie wrote:
> At least one IR person criticized XQL for doing too much, eg for
> having the parent/child relationship in addition to the
> ancestor/descendant relationship. This does, in fact, increase the
> complexity of implementation, but offers a distinction that I find
> important.

This is particularly so given the broadening of XML's focus from documents to documents and data.

> The number of implementations of XQL shows that there's a fair
> amount of interest in it.
People who have demonstrated it at trade
> shows send me email telling me how impressed people are - for
> instance, I have been getting email from Software AG, which is
> showing XQL at CeBIT this week and getting very enthusiastic
> responses. When I discuss XQL at trade shows, I get enthusiastic
> responses. So the fact that there are also critics doesn't bother
> me.

I could be disingenuous ( :-) ) and suggest that the attachment to Microsoft has more than a little to do with its success to date, but I certainly don't want to disparage the effort in its own right. It offers a good compromise between expressivity and simplicity, which is a far more practicable goal than completeness.

I am concerned (am I right on this?) at the lack of proximity operators. But that's just an implementor's perspective, looking at doing things we already support.

Cheers,
Marcelo

--
http://www.simdb.com/~marcelo/

From jonathan at texcel.no Fri Mar 26 03:08:01 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun 7 17:10:35 2004
Subject: Whence XQL?
In-Reply-To: <19990326133124.B7318@io.mds.rmit.edu.au> References: <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> <000b01be7702$102babd0$0100007f@eps.inso.com> <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> Message-ID: <3.0.3.32.19990325220915.032a1480@pop.mindspring.com> At 01:31 PM 3/26/99 +1100, Marcelo Cantos wrote: >I could be disingenuous ( :-) ) and suggest that the attachment to >Microsoft has more than a little to do with its success to date, but I >certainly don't want to disparage the effort in its own right. It >offers a good compromise between expressivity and simplicity, which is >a far more practicable goal than completeness. Well, Microsoft was one of the first companies I got interested in XQL ;-> >I am concerned (am I right on this?) at the lack of proximity >operators. But that's just an implementor's perspective, looking at >doing things we already support. Cool, you work on SIM? (Does that make you a SIMian?) I really enjoyed talking to Timothy Arnold-Moore at Markup Technologies '98 - Makoto Murata-san and I managed to snag him after his presentation and grill him with questions for a while. I've gone back and forth on proximity operators. Several people who have implemented full-text search systems have told me that users don't really use proximity operators, that they are useful in the implementation, but need not be exposed to the user. Others vehemently disagree. I took the pragmatic approach of leaving it out to see who would complain. Frankly, you are the first to do so. I have discussed proximity searching as a possibility in the following paper: http://www.w3.org/TandS/QL/QL98/pp/murata-san.html Here's an excerpt: In addition, functions for proximity searching might be useful. 
The following returns elements in which "rose*" and "sweet*" occur within 10 words of each other: LINE[near("rose*", "sweet", 10)] This would match lines like these: A rose by any other name would smell as sweet. Sweet roses grew along the south side of the fence. She rose and smiled sweetly at the purple dwarf under the bucket. Say, has anybody seen my Sweet Gypsy Rose? Proximity searching requires some way to indicate how close the strings must be in order to match. This causes a difficulty when choosing the units in which proximity is measured. In existing full-text systems, distance is frequently measured in terms of words, which raises a number of significant questions regarding internationalization, but is probably an intuitive way to measure distance for most users. I'm not sure whether this is the best approach or not. Do you like this approach? If not, what approach would you prefer? Jonathan jonathan@texcel.no Texcel Research http://www.texcel.no xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 26 03:52:11 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:35 2004 Subject: Is there anyone working on a binary version of XML? In-Reply-To: <000c01be770a$87141090$76c3c6c3@sluk.uplanet.com> References: <000c01be770a$87141090$76c3c6c3@sluk.uplanet.com> Message-ID: <14075.1092.778836.678620@localhost.localdomain> Peter Stark writes: > The "binary XML" defined by the WAP Forum is a format for tokenized > XML. It's supported by cellular phones with WAP browsers, e.g. > http://www.nokia.com/phones/7110/index.html. 
Element and attribute
> names are replaced by binary values to make parsing cheaper in the
> client. It does, however, not support all XML features. For
> example, XML namespaces are not supported.

If named elements and attributes are supported, then so are namespaces; the client just has to do a little more work to find them.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Fri Mar 26 03:54:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:35 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To:
References: <14074.17776.784121.47587@localhost.localdomain>
Message-ID: <14075.1169.878710.98496@localhost.localdomain>

Lars Marius Garshol writes:

> Also, if element declarations are included, I suppose notations
> should be, too.

They're there already in the SAX 1.0 DTDHandler, since XML 1.0 requires processors to report notations.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Fri Mar 26 04:07:05 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:35 2004
Subject: XML and (K)Office
In-Reply-To: <4.1.19990326091444.00bac620@steptwo.com.au>
References: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101> <14074.8435.653789.348824@localhost.localdomain> <4.1.19990326091444.00bac620@steptwo.com.au>
Message-ID: <14075.1332.510399.82540@localhost.localdomain>

James Robertson writes:

[on using Namespaces in spreadsheet formats]

> It:
>
> * Breaks validation. We are no longer able to ensure that the
> files we are reading/creating are correct and useful.

DTD validation cannot guarantee that a file is correct or useful; it can only guarantee that it matches a few BNF-like productions (that's helpful in itself because it allows some code simplification, but not as much as some people let on). DTDs are great for guided authoring, but that's a different area.

Furthermore, Namespaces itself doesn't break DTD validation -- it's a different layer. The possibility of receiving unexpected information does break validation, but it does so with or without namespaces; with namespaces, at least, you can clearly distinguish what's been added.

> * Still has the variations between applications, so that a reader
> of a given format still needs to know 100% about what is that
> format.

Not at all -- it can use what it understands and apply simple rules to the rest (ignore it as in RDF, skip to the top level and process the children, etc.).

> Without the rigour of a DTD, we've got nothing.

DTDs may be rigorous or lax, depending on the designer.
Here's a DTD for spreadsheets: Just dump in the comma-delimited file, and escape any XML delimiters. Now you have a DTD, and you still have nothing. > Particularly since this data may well live long, and is not > some transient "sent over the web" data. That means that the format should be well-documented and validatable; DTDs can help (and it's nice that they work with off-the-shelf tools), but they're not worth much by themselves. > How will future users make sense of the format without > a DTD? I've written dozens (hundreds?) of DTDs and a book on them, so I'm quite comfortable saying that a DTD does not guarantee that users can make sense of a format. It is helpful in many ways, but good documentation, examples, sample code, etc. are at least as important. Would you like to code in C++ based only on the BNF for the language? Of course. Is it possible to code in C++ without ever having seen the BNF (or whatever they use) in the ANSI spec? Thousands do, some well and some poorly. That said, I think that DTDs are wonderfully useful and will be around for a long time -- I doubt that any other schema standards that come out will be nearly so light-weight. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Fri Mar 26 04:26:51 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:10:35 2004 Subject: SAX2: AttributeList2 and EntityRefList References: <14074.16928.163619.681099@localhost.localdomain> Message-ID: <36FB08E5.DA7CEDC2@jclark.com> David Megginson wrote: > As John Cowan has pointed out, the XML 1.0 REC requires that > processors report unexpanded entity references, and presumably that > applies to references in attribute values as well as elsewhere; as a > result, it is impossible to treat an XML attribute value simply as a > string. I'm not seeing this. All I can find is: > 4.4.3 Included If Validating > > When an XML processor recognizes a reference to a parsed entity, in order to validate the document, the > processor must include its replacement text. If the entity is external, and the processor is not attempting to > validate the XML document, the processor may, but need not, include the entity's replacement text. If a > non-validating parser does not include the replacement text, it must inform the application that it recognized, > but did not read, the entity. 4.4.3 applies only to external parsed general entities. External parsed entities are not allowed in attribute values. James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Mar 26 04:54:09 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:35 2004 Subject: Is there anyone working on a binary version of XML? References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net> <36FAC72C.2B911970@prescod.net> <005501be771c$c8afb2e0$a6ab20c0@engeast.baynetworks.com> <36FAEE03.E435EB6D@lig.net> Message-ID: <36FB0F47.FE758A01@prescod.net> "Stephen D. Williams" wrote: > > Also imagine that when taking an XML 'document' into a program you > could search, modify, or copy the object without generating thousands > of object creates and deletes/garbage collection hits. I guess this is the part I don't understand. I can see how in C++ I could just load a chunk of binary gunk and use casts to convince the computer that it is really objects but I don't see how to do that in Java, Python, Perl or other high level languages. And even if you get it working really fast in Java will those same binary objects load quickly in any other language? Are you going to lazily build objects as the application walks the tree? > The point is that bXML is not designed to be editable by a text > editor since it has more of a 'structured' layout, sort of like a > filesystem. But note that a filesystem is not meant to be interpreted by more than one program, especially not by programs written in multiple languages. You call into the kernel (probably written in C) and it interprets the bits for you. Anyhow, if your "bXML" can be ASCII or Unicode then please make it so. 
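
[A minimal sketch of the "lazily build objects as the application walks the tree" option asked about above, in Python. The record layout, the names NODE, NONE, build and Cursor, and the sample data are all assumptions made up for illustration; none of this is the bXML design discussed in the thread, which had not been published. The idea shown is only that a tree can live in one flat byte buffer of fixed-size records plus a string heap, and that a high-level language can read it with small cursor objects created on demand instead of one object per node at load time.]

```python
import struct

# Hypothetical record layout (an assumption, not taken from the thread):
# name_offset, name_len, text_offset, text_len, first_child, next_sibling,
# stored as six unsigned 32-bit little-endian integers per node.
NODE = struct.Struct("<6I")
NONE = 0xFFFFFFFF  # marker for "no child" / "no sibling"


def build(tree):
    """Flatten nested (name, text, children) tuples into one bytes object."""
    records = []  # one [name, text, first_child, next_sibling] list per node

    def add(node):
        name, text, children = node
        index = len(records)
        records.append([name.encode("utf-8"), text.encode("utf-8"), NONE, NONE])
        prev = None
        for child in children:
            child_index = add(child)
            if prev is None:
                records[index][2] = child_index   # link parent to first child
            else:
                records[prev][3] = child_index    # link previous sibling
            prev = child_index
        return index

    add(tree)
    heap_start = NODE.size * len(records)  # string heap follows the table
    table, heap = bytearray(), bytearray()
    for name, text, first_child, next_sibling in records:
        name_off = heap_start + len(heap)
        heap += name
        text_off = heap_start + len(heap)
        heap += text
        table += NODE.pack(name_off, len(name), text_off, len(text),
                           first_child, next_sibling)
    return bytes(table + heap)


class Cursor:
    """Thin view over one node record; fields are decoded only when read."""
    __slots__ = ("buf", "index")

    def __init__(self, buf, index=0):
        self.buf, self.index = buf, index

    def _record(self):
        return NODE.unpack_from(self.buf, self.index * NODE.size)

    @property
    def name(self):
        off, length = self._record()[0:2]
        return self.buf[off:off + length].decode("utf-8")

    @property
    def text(self):
        off, length = self._record()[2:4]
        return self.buf[off:off + length].decode("utf-8")

    def children(self):
        """Yield a cursor per child by following the sibling links."""
        child = self._record()[4]
        while child != NONE:
            yield Cursor(self.buf, child)
            child = NODE.unpack_from(self.buf, child * NODE.size)[5]
```

[Usage would be along the lines of buf = build(("order", "", [("item", "rose", [])])) followed by Cursor(buf).children(). The buffer can be written to or read from a file or socket as-is, which is the "no processing on import" property being debated; the cost is that every field access decodes from the buffer, and that inserts or deletes need the offsets rewritten.]
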
--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From cowan at locke.ccil.org Fri Mar 26 05:35:03 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:10:35 2004
Subject: SAX2: AttributeList2 and EntityRefList
In-Reply-To: <14074.16928.163619.681099@localhost.localdomain> from "David Megginson" at Mar 25, 99 09:17:19 am
Message-ID: <199903260640.BAA13403@locke.ccil.org>

David Megginson scripsit:

> So, after some thought, here's what I came up with. This is a special
> interface providing indexes to zero or more entity references in a
> literal string (i.e. an attribute value). The indices are based on
> whatever array indices the programming language is using, exclusive of
> Unicode problems with combining characters, etc. (i.e. any
> normalisation must already have taken place).

What about references to unknown entities, though? They don't contribute any characters at all, and so don't fit your model.

--
John Cowan cowan@ccil.org
e'osai ko sarji la lojban.

From cowan at locke.ccil.org Fri Mar 26 05:39:31 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:10:35 2004
Subject: Proposed new kind of SAX2 thing, with example
Message-ID: <199903260644.BAA13616@locke.ccil.org>

I believe there should be some way within SAX2 to ask for parser properties (in the JavaBeans sense). One example is the architectural DTD public ID, which XAF provides access to but can't report because it doesn't fit the SAX event model.

Another case is the current element stack. Every parser (or almost every parser) has to keep one of these around, and it would be useful to have "currentStackDepth" and "stackedElementType[n]" properties.

What's needed is to have some means of discovery. Perhaps it's just enough to use the JavaBeans mechanism.

--
John Cowan cowan@ccil.org
e'osai ko sarji la lojban.

From shyutz at ms1.hinet.net Fri Mar 26 07:38:34 1999
From: shyutz at ms1.hinet.net (Kevin Hsu)
Date: Mon Jun 7 17:10:35 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <002401be7757$f99675c0$15cd4acb@flag.com.tw>

Can anyone tell me how to print the XML document as I see on the screen in IE 5.0, thanks in advance.
Kevin

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990326/9fee3b9d/attachment.htm

From paul.janssens at skynet.be Fri Mar 26 08:45:20 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun 7 17:10:35 2004
Subject: Is there anyone working on a binary version of XML?
References: <002801be7711$6f8f3c40$0100a8c0@sammy> <36FACA9B.4D7CD9B2@lig.net>
Message-ID: <36FB488F.33B7@skynet.be>

A simple solution would be to serialize using a PN-like format for your file and have the arity of each node in the file. It's slower than having an offset table per node, but faster when inserting or deleting, as the data is almost utterly context free.

If this is too slow, you might add offset tables per entity-node, but you'll have to update these when inserting or deleting, working up the parent chain.

The first is better for authoring, and the second for querying, I'd say.

Paul Janssens - paul.janssens@skynet.be

Stephen D. Williams wrote:
>
> This is not what I meant.
>
> XML has mechanisms to store binary data as characters using all the standard
> methods.
>
> What I'm talking about is using data that is structured in a directly
> addressable way (think pointers, arrays, indexes, offsets) to represent the
> structure and content of an XML tree. My actual proposal is a bit more
> complicated than that because I want other types of optimizations for in-memory
> processing, but that is one of the roots of the idea. In other words, after
> loading the tree would be directly addressable (SAX or DOM) without any parsing
> (or very limited steps). A typical server might in fact support both XML and
> XMLb queries and responses.
>
> sdw

From l-arcini at uniandes.edu.co Fri Mar 26 09:00:17 1999
From: l-arcini at uniandes.edu.co (Fabio Arciniegas A.)
Date: Mon Jun 7 17:10:35 2004
Subject: Expat using something other than Visual C++
Message-ID: <00b401be773d$eb6afbc0$0100000a@phoebe>

Hi everyone,

Has anyone tried to use expat in an environment other than Visual C++? Any successful attempts using C++ Builder?

Thanks for your help,

FAA

From Matthew.Sergeant at eml.ericsson.se Fri Mar 26 09:23:53 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:10:35 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1722@EUKBANT101>

It appears that IE5 converts internally to HTML (with the XSL style sheet), so the answer is that you can't. Even a save to disk saves the HTML AFAIK.

Try using Mozilla - it does things right, and displays XML+XSL remarkably well considering it's at least 6 months away from release.

Matt.
-- http://come.to/fastnet Perl on Win32, PerlScript, ASP, Database, XML GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ > -----Original Message----- > From: Kevin Hsu [SMTP:shyutz@ms1.hinet.net] > Sent: Friday, March 26, 1999 6:55 AM > To: XML Developers' List > Subject: how to print the XML document in IE 5.0 > > Can anyone tell me how to print the XML document as I see on the screen in > IE 5.0, thanks in advance. > ? > Kevin xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Fri Mar 26 09:24:22 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:10:35 2004 Subject: DTDDeclHandler and DTDLexicalHandler Message-ID: <01BE7772.9F3329A0@grappa.ito.tu-darmstadt.de> This may have already been answered, but how do DTDDeclHandler and DTDLexicalHandler work together? That is, if I have the following: what is the sequence of callbacks? And even if this is well-defined, what good is the lexical information in this case anyway, since I can't determine what characters in the DTD came before and after the entity usage. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at qub.com Fri Mar 26 09:40:55 1999 From: paul at qub.com (Paul at Sunnyvale) Date: Mon Jun 7 17:10:35 2004 Subject: how to print the XML document in IE 5.0 Message-ID: <002e01be776d$2114bb60$c0d4d6cf@g0f2n0> >It appears that IE5 converts internally to HTML (with the XSL style sheet), >so the answer is that you can't. Even a save to disk saves the HTML AFAIK. >Try using Mozilla - it does things right, and displays XML+XSL remarkably >well considering it's at least 6 months away from release. Could you please provide the url that will show Mozilla's capability to display XML + _XSL_ ? Or do you mean XML + CSS ? Rgds.Paul. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ldodds at ingenta.com Fri Mar 26 09:49:39 1999 From: ldodds at ingenta.com (Leigh Dodds) Date: Mon Jun 7 17:10:35 2004 Subject: Is there anyone working on a binary version of XML? In-Reply-To: <36FAEE03.E435EB6D@lig.net> Message-ID: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk> > Then imagine you can write or communicate the object to other > systems simply with IO > operations with no processing involved. 
Then imagine that the IO > is async and very cheap and > that you are processing thousands of transactions per second, > most of which generate > fundamentally little processing steps. I just want to clarify my understanding of this thread: you're discussing a binary format which is analagous to the internal representation of an XML document (a DOM tree), and which can be stored, used and manipulated without revisiting the original XML text? Wouldn't a (undoubtedly naive) implementation of this be simply serialising the object graph to disk, or through an I/O stream? This is obviously easy in Java, and again is only obviously beneficial if the serialised object graph is more 'compact' (which I believe is at least partly behind your desire) than the original textual version? Just a brain check on my part ;) L. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Fri Mar 26 10:13:50 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:10:36 2004 Subject: how to print the XML document in IE 5.0 Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1723@EUKBANT101> > -----Original Message----- > From: Paul at Sunnyvale [SMTP:paul@qub.com] > > >It appears that IE5 converts internally to HTML (with the XSL style > sheet), > >so the answer is that you can't. Even a save to disk saves the HTML > AFAIK. > >Try using Mozilla - it does things right, and displays XML+XSL remarkably > >well considering it's at least 6 months away from release. > > > Could you please provide the url that will show Mozilla's capability to > display > XML + _XSL_ ? 
Or do you mean XML + CSS ? > Oops. Mozilla on its own only displays XML+CSS. I think they hope to have full XSL support on release. Matt. From alberto.reggiori at jrc.it Fri Mar 26 10:45:25 1999 From: alberto.reggiori at jrc.it (Alberto Reggiori) Date: Mon Jun 7 17:10:36 2004 Subject: Is there anyone working on a binary version of XML? References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk> Message-ID: <36FB6534.FDF0192D@jrc.it> Leigh Dodds wrote: > Wouldn't a (undoubtedly naive) implementation of this be simply serialising > the object graph to disk, or through an I/O stream? This is obviously easy > in Java, and again is only obviously beneficial if the serialised object > graph is more 'compact' (which I believe is at least partly behind your > desire) than the original textual version? > I am writing a Web application that provides an Open Web space for secondary schools in Europe, where users can interact with an OODBMS through a treeview-like cut/paste/rename/edit paradigm using normal 16MB Pentium PCs and ISDN connections. One of the big issues of this application is to provide quick generation and rendering of those treeviews inside normal browsers using JavaScript. The initial idea was to use a bare-bones JavaScript XML parser on the client (jeremie.com like) to parse and create the in-memory data structure (DOMish) of those views, but that solution does not scale when the user requests some 200/300 folders.
The actual solution to those problems is to use a little hack on the server that directly generates HTML docs with the parsed JS structure embedded as nested arrays and hashes that do _not_ need parsing anymore. The files with the "serialised" trees are a bit larger, but the rendering performance is a _lot_ better. The code is still able to display textual XML treeviews. I think it would be really useful to have a standard and more compact way to serialise (dump binary groves/structures) to some specific format (Java, JavaScript, C, C++) or into a stream of "events" instead of pure text _only_. I am not saying that XML should be binary, but that the parsing business sometimes is an issue. Just another brain dump. Alberto From costello at mitre.org Fri Mar 26 12:02:12 1999 From: costello at mitre.org (Roger L. Costello) Date: Mon Jun 7 17:10:36 2004 Subject: Why doesn't XML have Bag? References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk> <36FB6534.FDF0192D@jrc.it> Message-ID: <36FB76BA.EBE5172D@mitre.org> Why doesn't XML support the notion of an unordered list of elements, i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, DTDs do not support Bags, but XML has no such inherent limitation? Does DCD support Bags? /Roger xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ldodds at ingenta.com Fri Mar 26 12:40:50 1999 From: ldodds at ingenta.com (Leigh Dodds) Date: Mon Jun 7 17:10:36 2004 Subject: Why doesn't XML have Bag? In-Reply-To: <36FB76BA.EBE5172D@mitre.org> Message-ID: <002f01be7785$e507c540$ab20268a@pc-lrd.bath.ac.uk> > Why doesn't XML support the notion of an unordered list of elements, > i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, > DTDs do not support Bags, but XML has no such inherent limitation? Does > DCD support Bags? /Roger An XML newbie writes.... Isn't this an unordered list of elements? Which appears to have order, but as elements are optional and the group is repeatable the ordering isn't enforced. Although the bag can't be empty. So, you could have... Which allows an empty BAG Or am I completely wrong? Cheers, L. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Fri Mar 26 13:03:33 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:10:36 2004 Subject: Why doesn't XML have Bag? Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> > -----Original Message----- > From: Roger L. 
Costello [SMTP:costello@mitre.org] > > Why doesn't XML support the notion of an unordered list of elements, > i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, > DTDs do not support Bags, but XML has no such inherent limitation? Does > DCD support Bags? /Roger > Unordered list in XML:

<ul>
<li>An Item</li>
<li>Another Item</li>
</ul>
Ordered list in XML:
<ol>
<li>Item 1</li>
<li>Item 2</li>
</ol>
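A minimal sketch of handling order at the application level, assuming Python's standard `xml.etree.ElementTree` parser. The helper names `child_bag` and `same_children_any_order` are hypothetical and do not come from this thread; the sketch only compares the direct children of the root element.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def child_bag(xml_text):
    """Return a multiset of (tag, text) pairs for the root's direct children."""
    root = ET.fromstring(xml_text)
    return Counter((child.tag, (child.text or "").strip()) for child in root)


def same_children_any_order(xml_a, xml_b):
    """True when both roots hold the same children, ignoring their order."""
    return child_bag(xml_a) == child_bag(xml_b)


first = "<ul><li>An Item</li><li>Another Item</li></ul>"
second = "<ul><li>Another Item</li><li>An Item</li></ul>"
print(same_children_any_order(first, second))
```

The parser still reports the children in document order; it is the application that chooses to discard that order by counting them instead of indexing them.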
The point is an unordered list is an application level issue, not an XML level issue - it's easy to implement one at your application level. Nay - I would go as far as to say it's trivial. Matt. -- http://come.to/fastnet Perl on Win32, PerlScript, ASP, Database, XML GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 26 13:27:49 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:36 2004 Subject: SAX2: AttributeList2 and EntityRefList In-Reply-To: <199903260640.BAA13403@locke.ccil.org> References: <14074.16928.163619.681099@localhost.localdomain> <199903260640.BAA13403@locke.ccil.org> Message-ID: <14075.35593.947500.623901@localhost.localdomain> John Cowan writes: > David Megginson scripsit: > > > So, after some thought, here's what I came up with. This is a special > > interface providing indexes to zero or more entity references in a > > literal string (i.e. an attribute value). The indices are based on > > whatever array indices the programming language is using, exclusive of > > Unicode problems with combining characters, etc. (i.e. any > > normalisation must already have taken place). > > What about references to unknown entities, though? They don't contribute > any characters at all, and so don't fit your model. Actually, they fit in fine -- the start and end positions will simply be the same. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From harvey at eccnet.eccnet.com Fri Mar 26 13:28:10 1999 From: harvey at eccnet.eccnet.com (Betty L. Harvey) Date: Mon Jun 7 17:10:36 2004 Subject: Why doesn't XML have Bag? In-Reply-To: <36FB76BA.EBE5172D@mitre.org> Message-ID: Roger: I am not sure what you mean by "Bags" but XML supports any type of list. It also supports content tables which are pretty cool: As an example: Item1 Item2 Example Content Tagged Table 1 My Part $1.00 10 2 My Part 2 $2.00 20 Depending on your application you can do some pretty interesting things with the parts list. You can do the same thing with a list if required. I am not sure if this is what you were looking for. Betty /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/ Betty Harvey | Phone: 301-540-8251 FAX: 4268 Electronic Commerce Connection, Inc. | 13017 Wisteria Drive, P.O. Box 333 | Germantown, Md. 20874 | harvey@eccnet.com | Washington,DC SGML/XML Users Grp URL: http://www.eccnet.com | http://www.eccnet.com/sgmlug/ /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/ On Fri, 26 Mar 1999, Roger L. Costello wrote: > Why doesn't XML support the notion of an unordered list of elements, > i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, > DTDs do not support Bags, but XML has no such inherent limitation? Does > DCD support Bags? /Roger > > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From skshirsa at nortelnetworks.com Fri Mar 26 13:29:15 1999 From: skshirsa at nortelnetworks.com (Shekhar Kshirsagar) Date: Mon Jun 7 17:10:36 2004 Subject: Why doesn't XML have Bag? Message-ID: <3.0.32.19990326082427.009082c0@bl-mail2.corpeast.baynetworks.com> Well, what about . That gives a bag of anything. Or am I misinterpreting the spec? Thanks, Shekhar Kshirsagar At 12:40 PM 3/26/99 -0000, Leigh Dodds wrote: >> Why doesn't XML support the notion of an unordered list of elements, >> i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, >> DTDs do not support Bags, but XML has no such inherent limitation? Does >> DCD support Bags? /Roger > >An XML newbie writes.... > >Isn't this an unordered list of elements? > > > >Which appears to have order, but as elements are optional >and the group is repeatable the ordering isn't enforced. Although >the bag can't be empty. So, you could have... > > > > >Which allows an empty BAG > >Or am I completely wrong? > >Cheers, > >L. > >xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > From david at megginson.com Fri Mar 26 13:29:20 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:36 2004 Subject: Proposed new kind of SAX2 thing, with example In-Reply-To: <199903260644.BAA13616@locke.ccil.org> References: <199903260644.BAA13616@locke.ccil.org> Message-ID: <14075.35687.586960.200728@localhost.localdomain> John Cowan writes: > I believe there should be some way within SAX2 to ask for parser > properties (in the JavaBeans sense). One example is the > architectural DTD public ID, which XAF provides access to but can't > report because it doesn't fit the SAX event model. Use the following from Parser2 (née ModParser): public abstract Object get (String prop) throws SAXNotSupportedException; All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 26 13:34:28 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:36 2004 Subject: DTDDeclHandler and DTDLexicalHandler In-Reply-To: <01BE7772.9F3329A0@grappa.ito.tu-darmstadt.de> References: <01BE7772.9F3329A0@grappa.ito.tu-darmstadt.de> Message-ID: <14075.35813.308202.771903@localhost.localdomain> Ronald Bourret writes: > This may have already been answered, but how do DTDDeclHandler and > DTDLexicalHandler work together? That is, if I have the following: > > > > > what is the sequence of callbacks? You'll lose the entity-boundary information in this case. What you'd get back is internalEntityDecl("foo", true, "foo CDATA #REQUIRED"); attributeDecl("bar", "foo", "CDATA", null, ATTRIBUTE_REQUIRED, refs); > And even if this is well-defined, what good is the lexical > information in this case anyway, since I can't determine what > characters in the DTD came before and after the entity usage. You're right, but I think that we're taking this too far for the SAX core. SAX2 is specifically set up so that people can define new handler types, so it is possible to come up with something that provides this level of reporting, but it will have to be outside of the SAX core. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From costello at mitre.org Fri Mar 26 13:36:13 1999 From: costello at mitre.org (Roger L. Costello) Date: Mon Jun 7 17:10:36 2004 Subject: Why doesn't XML have Bag? References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> Message-ID: <36FB8C85.BEFC8103@mitre.org> Matt, Then let me ask another question - why do DTDs not allow me to specify an unordered list of elements? For example, With this notation I am trying to indicate that an XML document that conforms to this DTD must have a element which has three child elements - , , and , and these child elements can be in any order. Isn't this a useful thing? I have had a number of times where I wish that I could do this. I gather from your message that you are saying that it is not a limitation of XML, but rather a limitation of DTDs? How about DCDs? Thanks. /Roger Matthew Sergeant (EML) wrote: > > > -----Original Message----- > > From: Roger L. Costello [SMTP:costello@mitre.org] > > > > Why doesn't XML support the notion of an unordered list of elements, > > i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, > > DTDs do not support Bags, but XML has no such inherent limitation? Does > > DCD support Bags? /Roger > > > Unordered list in XML: > >
> <ul>
> <li>An Item</li>
> <li>Another Item</li>
> </ul>
>
> Ordered list in XML:
>
> <ol>
> <li>Item 1</li>
> <li>Item 2</li>
> </ol>
> > The point is an unordered list is an application level issue, not an > XML level issue - it's easy to implement one at your application level. Nay > - I would go as far as to say it's trivial. > > Matt. > -- > http://come.to/fastnet > Perl on Win32, PerlScript, ASP, Database, XML > GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V > !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 26 13:42:19 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:36 2004 Subject: Why doesn't XML have Bag? In-Reply-To: <36FB76BA.EBE5172D@mitre.org> References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk> <36FB6534.FDF0192D@jrc.it> <36FB76BA.EBE5172D@mitre.org> Message-ID: <14075.36327.509783.485757@localhost.localdomain> Roger L. Costello writes: > Why doesn't XML support the notion of an unordered list of elements, > i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, > DTDs do not support Bags, but XML has no such inherent limitation? Does > DCD support Bags? /Roger XML DTDs can constrain the content of a bag just fine: (a|b|c|d|e|f)* XML DTDs cannot constrain the content of a set (where each element may appear exactly once, in any order). This is not an SGML DTD limitation, since in SGML you can use (a&b&c&d&e&f) You can simulate this in XML DTDs, but the content models become absurdly large. This is not to say that you cannot have a set in XML even *with* DTD validation; it's just that DTD validation will not catch the errors. 
For example, either (a|b|c|d|e|f)* or even ANY will allow a set, but they will not catch the error where the same element appears twice. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Fri Mar 26 13:43:04 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:36 2004 Subject: A Simple Thought References: <005a01be7787$02189b90$0100a8c0@sammy> Message-ID: <36FB8EE7.115C77B1@lig.net> This is in fact exactly the kind of thing that I am thinking, with at least a couple other optimizations thrown in to make processing in-place in Java fast. sdw "Samuel R. Blackburn" wrote: > You know, if you parse the XML into a carefully designed data structure, > you could write that structure to a file. To re-read the data, you would > simply memory map the file (or put the structure into a shared memory > segment). If the structure is designed so offsets are used instead of > pointers, you could navigate is quickly and not have to worry about > memory addresses involved. The OS will only page in those portions > of the file that are really used. > > Just a thought, > > Sam > > -----Original Message----- > From: Stephen D. Williams > To: xml-dev@ic.ac.uk > Date: Thursday, March 25, 1999 10:08 PM > Subject: Re: Is there anyone working on a binary version of XML? > > >"Simon St.Laurent" wrote: > > > >> At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote: > >> >>I know, I know, this is anathema to what many of you feel is the > >> >>essence of XML, and I agree to a point. 
> >> > > >> >It's not so much about feelings, as about contradicting the XML spec. > >> > > >> >[...] > >> > > >> >Applying XML concepts to a binary data format sounds interesting and > >> >potentially useful, but it wouldn't be XML. > >> > >> One of these days I'd really love to stop talking about what is and isn't > >> XML, though I know it's fun, and start talking about what we can do with > >> XML and XML-like structures, whether they are SAX event flows, DOM trees, > >> or binary formats that build on an XML foundation. > >> > >> We might even get some real work done - and it might even be fun. > > > >I agree with the sentiment Simon. > > > >I'm required (or am requiring myself) to get a lot of real work done very > >quickly in the next > >6 months hence my focus... > > > >Semantically, I am talking about using XML. After parsing and creating a > >DOM tree or SAX > >events, you no longer have XML but a data structure semantically equivalent > >to an XML > >document. Another way to think about what I'm proposing is that it is a > >cache of the data > >structures produced from processing an XML document, cast in a openly > >documented data > >structure that is already flattened and ready for IO. > > > >In fact, this is how I arrived at this design after following a few other > >design constraints > >and observations. Of course from there it is a short stop to say that you > >can throw away the > >'external' XML representation if you can recreate it from XMLb. > > > >My scheme makes parsing of XML a non-issue. If I only have that advantage > >within my closed > >system, so be it, converting to and from XML for external purposes is in > >fact what I intend to > >do. > > > >In my case, I'm architecting a high speed clustering system, primarily > >targeted at Linux/Unix > >and Java. In this kind of system of course you are splitting applications > >into many servers. 
> >Of course the communication between those nodes is really internal > >application communication, > >the equivalent of that DOM tree, so it makes sense to optimize it. Think > of > >it this way, > >you'd seldom design a large app where every method needs to parse the XML > >text block passed to > >it to get a DOM tree (or SAX events) if the calling method has a DOM tree > >that it could just > >pass. > > > >sdw > > > >> Simon St.Laurent > >> XML: A Primer > >> Sharing Bandwidth / Cookies > >> http://www.simonstl.com > > > > > >-- > >OptimaLogic - Finding Optimal Solutions > >Web/Crypto/OO/Unix/Comm/Video/DBMS > >sdw@lig.net Stephen D. Williams Senior Consultant/Architect > >http://sdw.st > >43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax > >5Jan1999 > > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > >(un)subscribe xml-dev > >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > >subscribe xml-dev-digest > >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Fri Mar 26 13:47:12 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:37 2004 Subject: Proposed new kind of SAX2 thing, with example Message-ID: <008101be778f$d4ddffe0$c8a8a8c0@thing1> From: John Cowan >I believe there should be some way within SAX2 to ask for >parser properties (in the JavaBeans sense). 
One example is the >architectural DTD public ID, which XAF provides access to >but can't report because it doesn't fit the SAX event model. Why not use the get(featureID)? >Another case is the current element stack. Every parser (or almost >every parser) has to keep one of these around, and it would be >useful to have "currentStackDepth" and "stackedElementType[n]" >properties. > >What's needed is to have some means of discovery. Perhaps it's just >enough to use the JavaBeans mechanism. One of the ideas behind MDSAX was to have a shared element stack. But if SAX2 developed such a concept, then: 1. A parser has the option of sharing its element stack and 2. When a parser doesn't share its element stack, a filter could be used. Bill xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Fri Mar 26 13:52:28 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:37 2004 Subject: Is there anyone working on a binary version of XML? References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk> Message-ID: <36FB9100.A163ABE9@lig.net> Leigh Dodds wrote: > > Then imagine you can write or communicate the object to other > > systems simply with IO > > operations with no processing involved. Then imagine that the IO > > is async and very cheap and > > that you are processing thousands of transactions per second, > > most of which generate > > fundamentally little processing steps. 
> > I just want to clarify my understanding of this thread: you're discussing > a binary format which is analagous to the internal representation of an > XML document (a DOM tree), and which can be stored, used and manipulated > without revisiting the original XML text? > > Wouldn't a (undoubtedly naive) implementation of this be simply serialising > the object graph to disk, or through an I/O stream? This is obviously easy > in Java, and again is only obviously beneficial if the serialised object > graph is more 'compact' (which I believe is at least partly behind your > desire) than the original textual version? Yes, that would acheive part of what I'm getting at, but not nearly enough. You see I am addressing several different performance problems with processing in Java at the same time so the solution is a bit more holistic. In concept, what I'm getting at is close to using a serialization of a DOM tree, however the point is to avoid any transformations (even deserialization/serialization) when possible but still have a DOM/SAX or even JGL like access to the tree. sdw > > > Just a brain check on my part ;) > > L. > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Fri Mar 26 13:53:43 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:37 2004 Subject: Fast filter support in SAX2 Message-ID: <009201be7790$c0a6b7a0$c8a8a8c0@thing1> I'd like to suggest another method in Parser2: public String unique(String); as well as a featureID for requesting unique element and attribute names. The thought is to bring the speed of filters closer to the speed of doing things within a parser. If a parser supports both the unique feature and provides access to its element stack, then we are well on the way to being able to implement Simpon's layered parser. Bill xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From costello at mitre.org Fri Mar 26 14:04:01 1999 From: costello at mitre.org (Roger L. Costello) Date: Mon Jun 7 17:10:37 2004 Subject: Why doesn't XML have Bag? Uh, "set" References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk> <36FB6534.FDF0192D@jrc.it> <36FB76BA.EBE5172D@mitre.org> <14075.36327.509783.485757@localhost.localdomain> Message-ID: <36FB93E5.F662B56A@mitre.org> Thanks Dave for clarifying terminology. It is "set" that I meant, not "bag". 
Just to make certain that I understand, an XML DTD cannot express the following: "A <kitchen> element contains exactly three child elements: one instance of <sink>, one instance of <stove>, and one instance of <refrigerator>, and these child elements can appear in any order." Correct? /Roger P.S. Attributes can be listed in any order in an XML document, regardless of the order that they are listed in the DTD. Right? David Megginson wrote: > > Roger L. Costello writes: > > > Why doesn't XML support the notion of an unordered list of elements, > > i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, > > DTDs do not support Bags, but XML has no such inherent limitation? Does > > DCD support Bags? /Roger > > XML DTDs can constrain the content of a bag just fine: > > (a|b|c|d|e|f)* > > XML DTDs cannot constrain the content of a set (where each element may > appear exactly once, in any order). This is not an SGML DTD > limitation, since in SGML you can use > > (a&b&c&d&e&f) > > You can simulate this in XML DTDs, but the content models become > absurdly large. > > This is not to say that you cannot have a set in XML even *with* DTD > validation; it's just that DTD validation will not catch the errors. > For example, either > > (a|b|c|d|e|f)* > > or even > > ANY > > will allow a set, but they will not catch the error where the same > element appears twice. > > All the best, > > David > > -- > David Megginson david@megginson.com > http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Fri Mar 26 14:10:42 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:37 2004 Subject: Why doesn't XML have Bag? Message-ID: <005c01be7792$776f8380$4df96d8c@NT.JELLIFFE.COM.AU> From: Roger L. Costello >Why doesn't XML support the notion of an unordered list of elements, >i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, >DTDs do not support Bags, but XML has no such inherent limitation? Does >DCD support Bags? /Roger Answer A: XML does have a way to support Bags: its called RDF. Answer B: SGML DTDs could support bags, because they had an operator "&" to mean required, but in any order. XML DTDs do not have this because everyone said it was so difficult to implement. But then many people said oops, because it would have been nice for database data. Answer C: XML elements have order as a property. However, whether that property is significant in the context of a document type depends on the document type, and sometimes just on the kind of processing being performed at that stage in the document's life. So you could just as easily say that XML has bags but no sets. Answer D: XML is not a data modeling language. It is a data-model modeling language. So you decide what semantics you are to put. This takes place entirely outside the area of what DTD's attempt to do, which is just to provide a simple grammar for the data-model modeling. Answer E: XML does have a way to support Bags: its called architectures. On any element you attach an attribute that ties it to some other element with known properties. 
For example, you tie your parent element to html:ul for a bag and html:ol for a set. Answer F: There are whole areas of fundamental semantic ways to slice things: you want sets and bags, I want rhetorical relationships (I would love if I could point at any element and know what the appropriate heading for it was; I would love it if that heading was carted around during cutting and pasting.) If you think bags and sets are really important, then encourage the schema working group to include that information. Take your pick! Rick Jelliffe Author: The XML & SGML Cookbook: Recipes for Structured Information Prentice Hall, ISBN 0-13-614223-0 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrew at squiz.co.nz Fri Mar 26 14:12:35 1999 From: andrew at squiz.co.nz (Andrew McNaughton) Date: Mon Jun 7 17:10:37 2004 Subject: Why doesn't XML have Bag? In-Reply-To: Your message of "Fri, 26 Mar 1999 08:32:53 EST." <36FB8C85.BEFC8103@mitre.org> Message-ID: <199903261411.CAA11212@aniwa.sky> > Matt, > > Then let me ask another question - why do DTDs not allow me to specify > an unordered list of elements? For example, > > > > With this notation I am trying to indicate that an XML document that > conforms to this DTD must have a element which has three child > elements - , , and , and these child elements > can be in any order. Isn't this a useful thing? I have had a number of > times where I wish that I could do this. Is it a useful thing? 
It might be nice for humans entering the data to be unconstrained in the order in which they can enter data, but if you know exactly what elements must exist within kitchen, then does it limit you to pre-define the order they appear in the document. I suppose if you want the order to denote something about the position of the elements within your physical kitchen, then you've lost something, but attributes are probably a better solution for storing this information about the elements within your kitchen. Andrew McNaughton -- ----------- Andrew McNaughton andrew@squiz.co.nz http://www.newsroom.co.nz/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Fri Mar 26 14:38:02 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:37 2004 Subject: Why doesn't XML have Bag? In-Reply-To: <36FB8C85.BEFC8103@mitre.org> References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> <36FB8C85.BEFC8103@mitre.org> Message-ID: * Roger L. Costello | | Then let me ask another question - why do DTDs not allow me to specify | an unordered list of elements? For example, | | | | With this notation I am trying to indicate that an XML document that | conforms to this DTD must have a element which has three child | elements - , , and , and these child elements | can be in any order. Isn't this a useful thing? Sure it is, and SGML has it already: | I gather from your message that you are saying that it is not a | limitation of XML, but rather a limitation of DTDs? 
It is a limitation of DTDs and was introduced because without this operator element content models are easily mapped to finite state automatons, but the introduction of the '&' separator makes automaton generation much more difficult. Existing SGML parsers already do this, and there are some research papers giving algorithms for this, but the designers felt that this was one of the things that would have to go in the simplification from SGML to XML. | How about DCDs? DCDs have no official standing, they're just a proposal to the W3C. XML Schemas, when they are defined, may (or may not) have this for all I know. If they do we might as well add it to DTDs too. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 26 15:01:59 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:37 2004 Subject: SAX: Modified DTDDeclHandler Message-ID: <14075.41132.504650.207777@localhost.localdomain> Here's another attempt at the SAX2 DTDDeclHandler, adding element type declarations (the handlerID is http://xml.org/sax/handlers/dtd-decl): ====================8<====================8<==================== // DTDDeclHandler.java -- receive extended DTD declarations // $Id: DTDDeclHandler.java,v 1.1 1999/03/26 14:58:47 david Exp david $ package org.xml.sax; public interface DTDDeclHandler extends SAX2Handler { public final static int MODEL_ELEMENTS = 1; public final static int MODEL_MIXED = 2; public final static int MODEL_ANY = 3; public final static int MODEL_EMPTY = 4; public final static int ATTRIBUTE_DEFAULTED = 1; public final static int ATTRIBUTE_IMPLIED = 2; public 
final static int ATTRIBUTE_REQUIRED = 3; public final static int ATTRIBUTE_FIXED = 4; public abstract void elementDecl (String name, int modelType, String model) throws SAXException; public abstract void attributeDecl (String element, String name, String type, String defaultValue, int defaultType, EntityRefList entityRefs) throws SAXException; public abstract void externalEntityDecl (String name, boolean isParameterEntity, String publicId, String systemId) throws SAXException; public abstract void internalEntityDecl (String name, boolean isParameterEntity, String replacementText) throws SAXException; } // end of DTDDeclHandler.java ====================8<====================8<==================== To this take, I've added the elementDecl() callback. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul.janssens at skynet.be Fri Mar 26 15:15:34 1999 From: paul.janssens at skynet.be (Paul Janssens) Date: Mon Jun 7 17:10:37 2004 Subject: Why doesn't XML have Bag? References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> <36FB8C85.BEFC8103@mitre.org> Message-ID: <36FBA3DC.59DA@skynet.be> Lars Marius Garshol wrote: > ... > It is a limitation of DTDs and was introduced because without this > operator element content models are easily mapped to finite state > automatons, but the introduction of the '&' separator makes automaton > generation much more difficult. 
> > Existing SGML parsers already do this, and there are some research > papers giving algorithms for this, but the designers felt that this > was one of the things that would have to go in the simplification from > SGML to XML. > Please correct me if I am wrong here but isn't that trivial? (you may get a BIG automaton, but it's not difficult to generate) X -> A & B & C ; can be expressed as X -> A X_A | B X_B | C X_C ; X_A -> B X_AB | C X_AC ; X_B -> A X_AB | C X_BC ; X_C -> A X_AC | B X_BC ; X_AB -> C X_ABC ; X_BC -> A X_ABC ; X_AC -> B X_ABC ; X_ABC -> ; and it's 'easily' visualised by the number of possible shortest paths between two opposing points on a hypercube. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 26 15:19:57 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:37 2004 Subject: Why doesn't XML have Bag? Uh, "set" In-Reply-To: <36FB93E5.F662B56A@mitre.org> References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk> <36FB6534.FDF0192D@jrc.it> <36FB76BA.EBE5172D@mitre.org> <14075.36327.509783.485757@localhost.localdomain> <36FB93E5.F662B56A@mitre.org> Message-ID: <14075.41969.677946.202551@localhost.localdomain> Roger L. Costello writes: > Thanks Dave for clarifying terminology. It is "set" that I meant, not > "bag". Just to make certain that I understand, an XML DTD cannot > express the following: > > "A element contains exactly three child elements: one instance > of , one instance of , and one instance of , > and these child elements can appear in any order." > > Correct? /Roger More or less. 
Technically, you *can* express this constraint with an XML DTD: ((sink, ((stove, refrigerator) | (refrigerator, stove))) | (stove, ((sink, refrigerator) | (refrigerator, sink))) | (refrigerator, ((sink, stove) | (stove, sink)))) Obviously, things get unmanageable if the set grows a little bigger. In an SGML DTD, you would use (sink & stove & refrigerator) but in practical use, this never worked that well for documents except in the special case of legacy-data conversion (it confused people using authoring tools and generally made processing unnecessarily difficult), and most SGML gurus strongly deprecated it. XML is hitting a slightly different usage domain (less emphasis on documents, more on data), so perhaps it might be worthwhile including this in the new schema standard. > P.S. Attributes can be listed in any order in an XML document, > regardless of the order that they are listed in the DTD. Right? Right -- order and repetition are properties of elements but not of attributes. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From bckman at ix.netcom.com Fri Mar 26 15:22:23 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:10:37 2004 Subject: Why doesn't XML have Bag? Message-ID: <008c01be779c$17c5e000$38afdccf@ix.netcom.com> In SGML you can put (a&b&c), which means that each element must appear only once but in any order.
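A minimal sketch of how an SGML-style `&` group could be checked, assuming Java and a small, fixed list of child names. The class `AllGroup` and its methods are hypothetical names chosen for illustration; they are not part of SAX or of any parser mentioned in this thread. The checker keeps the subset of names seen so far as a bitmask, which corresponds to the states X_A, X_AB, ... in the automaton posted earlier: n names give 2^n states (8 for three names), while the equivalent XML content model has to spell out n! orderings (6 for three names).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any real parser API.
final class AllGroup {
    private final List<String> names;

    AllGroup(String... names) {
        // One bit per name; keep well inside the range of an int.
        if (names.length > 30) {
            throw new IllegalArgumentException("at most 30 names supported");
        }
        this.names = Arrays.asList(names);
    }

    /** True if children holds every declared name exactly once, in any order. */
    boolean accepts(List<String> children) {
        int seen = 0; // current automaton state: the subset of names seen so far
        for (String child : children) {
            int i = names.indexOf(child);
            if (i < 0) {
                return false; // element not declared in the group
            }
            int bit = 1 << i;
            if ((seen & bit) != 0) {
                return false; // same element appears twice
            }
            seen |= bit;
        }
        return seen == (1 << names.size()) - 1; // accepting state: all names seen
    }

    /** Number of subset states an explicit automaton would need (2^n). */
    int stateCount() {
        return 1 << names.size();
    }

    /** Number of orderings an XML DTD content model would have to enumerate (n!). */
    long orderingCount() {
        long f = 1;
        for (int k = 2; k <= names.size(); k++) {
            f *= k;
        }
        return f;
    }
}
```

For example, `new AllGroup("sink", "stove", "refrigerator").accepts(Arrays.asList("stove", "refrigerator", "sink"))` accepts the children in any order, while a repeated or missing child is rejected. Keeping the state implicit in a bitmask avoids building the large automaton, at the cost of handling only the simple case where every member of the group is a single required element.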
The best you can do in XML (without getting ridiculously complicated) is (a|b|c)*, which means that they can appear in any order, but there can be any number of them. My understanding was that it was omitted because of the requirement "XML software shall be easy to write". It takes only a few lines of C code to validate the second constraint but a LOT more to validate the first. Frank ----- Original Message ----- From: Roger L. Costello To: Cc: ; Roger Costello Sent: Friday, March 26, 1999 6:59 AM Subject: Why doesn't XML have Bag? >Why doesn't XML support the notion of an unordered list of elements, >i.e., a Bag? Perhaps this is a limitation of DTD, not XML? That is, >DTDs do not support Bags, but XML has no such inherent limitation? Does >DCD support Bags? /Roger From sdw at lig.net Fri Mar 26 15:32:55 1999 From: sdw at lig.net (Stephen D.
Williams) Date: Mon Jun 7 17:10:37 2004 Subject: A Simple Thought References: <003c01be7792$eb14ebe0$ab20268a@pc-lrd.bath.ac.uk> Message-ID: <36FBA892.43A462FB@lig.net> Leigh Dodds wrote: > Hmm, I guess with Java you'd use an in-memory buffer and a class > to wrap that buffer so that your accesses to the data would appear > to be ordinary method calls accessing member variables, but actually > just altered/read data at byte offsets in the buffer? Exactly! The class interface would be SAX/DOM/JGL-like but operate on a very efficient representation. The realization that I had is that I typically build very meta-data driven applications and systems and that I seldom have business data models represented by actual classes (in C++, where I learned my lesson, and Java). Since the data is accessed via collection interfaces anyway, the storage can be completely opaque and optimized. > Originally though I thought you were talking about a 'standard' > representation. > Shouldn't you then be avoiding 'other optimizations...to make processing > in-place in Java fast'? Otherwise you're targeting a particular > implementation > language? Ahh, there's the trick. I believe I have most of a design for an data structure that is fast in memory yet is 'flat' and can have its chunks just written out or read in at any point. It builds on some very old ideas I came up with for a language I designed. When viewed as an interchange format, it may not be the most optimal space wise (although it should be better than XML text) but trades a small amount of space for nearly zero processing overhead. There will probably also be a procedure for 'compacting' an object for storage into a database or sending over a slow link vs. the 'fast' format usable between servers in a cluster. I'll be implementing the rest of this shortly and we can have another round of discussion. I'd really like a reference to the one Java project doing something similar. 
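A rough sketch of the kind of structure being discussed, assuming Java's `java.nio.ByteBuffer`: a tree kept in one flat buffer, with nodes linked by byte offsets rather than object references, so the whole thing can be written out or read back without any per-node processing. The class `FlatTree`, its record layout, and its method names are all invented for illustration; this is not the design described in the message, which has not been posted.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical layout of one node record, all offsets relative to buffer start:
//   int firstChild | int nextSibling | int nameLength | UTF-8 name bytes
final class FlatTree {
    static final int NONE = -1;
    private final ByteBuffer buf;

    /** Creates an empty tree that can grow up to capacity bytes. */
    FlatTree(int capacity) {
        buf = ByteBuffer.allocate(capacity);
    }

    /** Wraps bytes produced by toBytes(); intended for reading, with no copying or parsing. */
    FlatTree(byte[] bytes) {
        buf = ByteBuffer.wrap(bytes);
        buf.position(bytes.length);
    }

    /** Appends a node as the last child of parent (NONE for the root) and returns its offset. */
    int addNode(int parent, String name) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        int node = buf.position();
        buf.putInt(NONE);        // first child
        buf.putInt(NONE);        // next sibling
        buf.putInt(utf8.length); // name length
        buf.put(utf8);
        if (parent != NONE) {
            int child = buf.getInt(parent);
            if (child == NONE) {
                buf.putInt(parent, node);    // first child of parent
            } else {
                while (buf.getInt(child + 4) != NONE) {
                    child = buf.getInt(child + 4);
                }
                buf.putInt(child + 4, node); // link after the last sibling
            }
        }
        return node;
    }

    int firstChild(int node) {
        return buf.getInt(node);
    }

    int nextSibling(int node) {
        return buf.getInt(node + 4);
    }

    String name(int node) {
        int len = buf.getInt(node + 8);
        byte[] utf8 = new byte[len];
        for (int i = 0; i < len; i++) {
            utf8[i] = buf.get(node + 12 + i);
        }
        return new String(utf8, StandardCharsets.UTF_8);
    }

    /** The used part of the buffer, ready to be written to a file or socket as-is. */
    byte[] toBytes() {
        return Arrays.copyOf(buf.array(), buf.position());
    }
}
```

Because every link is an offset, the bytes returned by `toBytes()` can be handed to `new FlatTree(bytes)` on another machine and navigated immediately with `firstChild`, `nextSibling`, and `name`. The trade-off matches the one described above: some space is spent on fixed-width offsets in exchange for close to zero work on load.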
> I'm interested in this (at least in part) as I've been toying with an > application idea which could potentially have a lot of (small) XML documents > built into a complex in-memory object graph. I'm concerned about the size > of the object graph (and managing interconnections amongst nodes) and its > later storage (don't want to have to reparse every time the application > starts). > Serialisation was originally what I was considering. This is exactly the kind of problem I'm thinking of. Since most people use class interfaces to get at the data anyway, there's no need to chew up all the processing time manipulating it behind the scenes in expensive ways. Unfortunately the simple, obvious, traditional ways of building things (especially in C++ and Java) cause massive storms of activity in large programs. (Object creation, initialization, building links, indexing, etc. etc.) sdw > L. > > > -----Original Message----- > > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > > Stephen D. Williams > > Sent: 26 March 1999 13:43 > > To: Samuel R. Blackburn > > Subject: Re: A Simple Thought > > > > > > This is in fact exactly the kind of thing that I am thinking, > > with at least a > > couple other optimizations thrown in to make processing in-place > > in Java fast. > > > > sdw > > > > "Samuel R. Blackburn" wrote: > > > > > You know, if you parse the XML into a carefully designed data structure, > > > you could write that structure to a file. To re-read the data, you would > > > simply memory map the file (or put the structure into a shared memory > > > segment). If the structure is designed so offsets are used instead of > > > pointers, you could navigate is quickly and not have to worry about > > > memory addresses involved. The OS will only page in those portions > > > of the file that are really used. > > > > > > Just a thought, > > > > > > Sam > > > > > > -----Original Message----- > > > From: Stephen D. 
Williams > > > To: xml-dev@ic.ac.uk > > > Date: Thursday, March 25, 1999 10:08 PM > > > Subject: Re: Is there anyone working on a binary version of XML? > > > > > > >"Simon St.Laurent" wrote: > > > > > > > >> At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote: > > > >> >>I know, I know, this is anathema to what many of you feel is the > > > >> >>essence of XML, and I agree to a point. > > > >> > > > > >> >It's not so much about feelings, as about contradicting the > > XML spec. > > > >> > > > > >> >[...] > > > >> > > > > >> >Applying XML concepts to a binary data format sounds interesting and > > > >> >potentially useful, but it wouldn't be XML. > > > >> > > > >> One of these days I'd really love to stop talking about what > > is and isn't > > > >> XML, though I know it's fun, and start talking about what we > > can do with > > > >> XML and XML-like structures, whether they are SAX event > > flows, DOM trees, > > > >> or binary formats that build on an XML foundation. > > > >> > > > >> We might even get some real work done - and it might even be fun. > > > > > > > >I agree with the sentiment Simon. > > > > > > > >I'm required (or am requiring myself) to get a lot of real > > work done very > > > >quickly in the next > > > >6 months hence my focus... > > > > > > > >Semantically, I am talking about using XML. After parsing and > > creating a > > > >DOM tree or SAX > > > >events, you no longer have XML but a data structure > > semantically equivalent > > > >to an XML > > > >document. Another way to think about what I'm proposing is > > that it is a > > > >cache of the data > > > >structures produced from processing an XML document, cast in a openly > > > >documented data > > > >structure that is already flattened and ready for IO. > > > > > > > >In fact, this is how I arrived at this design after following > > a few other > > > >design constraints > > > >and observations. 
Of course from there it is a short stop to > > say that you > > > >can throw away the > > > >'external' XML representation if you can recreate it from XMLb. > > > > > > > >My scheme makes parsing of XML a non-issue. If I only have > > that advantage > > > >within my closed > > > >system, so be it, converting to and from XML for external > > purposes is in > > > >fact what I intend to > > > >do. > > > > > > > >In my case, I'm architecting a high speed clustering system, primarily > > > >targeted at Linux/Unix > > > >and Java. In this kind of system of course you are splitting > > applications > > > >into many servers. > > > >Of course the communication between those nodes is really internal > > > >application communication, > > > >the equivalent of that DOM tree, so it makes sense to optimize > > it. Think > > > of > > > >it this way, > > > >you'd seldom design a large app where every method needs to > > parse the XML > > > >text block passed to > > > >it to get a DOM tree (or SAX events) if the calling method has > > a DOM tree > > > >that it could just > > > >pass. > > > > > > > >sdw > > > > > > > >> Simon St.Laurent > > > >> XML: A Primer > > > >> Sharing Bandwidth / Cookies > > > >> http://www.simonstl.com > > > > > > > > > > > >-- > > > >OptimaLogic - Finding Optimal Solutions > > > >Web/Crypto/OO/Unix/Comm/Video/DBMS > > > >sdw@lig.net Stephen D. Williams Senior Consultant/Architect > > > >http://sdw.st > > > >43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax > > > >5Jan1999 > > > > > > > >xml-dev: A list for W3C XML Developers. 
To post, > mailto:xml-dev@ic.ac.uk > > >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > > CD-ROM/ISBN 981-02-3594-1 > > >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > > >(un)subscribe xml-dev > > >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > > message; > > >subscribe xml-dev-digest > > >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN > 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Fri Mar 26 15:38:06 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:10:37 2004 Subject: How about changing the rules? References: Message-ID: <36FBA7FF.F61B67D@infinet.com> Didier PH Martin wrote: > Hi, > > Yesterday night I talked to good friends that work at Netscape (but not for > long now) and I can tell you that this was not about celebrating. We came to > discuss about the free software movement on so on, then came an idea... > > > Several people worked hard in the Linux project, then came Red Hat, big > investments, and now red hat is doing what all the other guys are doing > (that's business no?) 
protecting their turf and doing money (they are even > more luky than SUN or Microsoft, they are cheap labor to develop their > software - just think about it. We all know that Microsoft has probably the > lowest developement cost in the industry. They let the stock market pay > their exployees :-) but now think about a company having 0$ developement > costs Wow, thats VC dream! Follow developers, is it how you pay your bills? > Sun still own the Java JDK but at least played fair because the code is > developed with their own money. > Microsoft, played hard with all ISVs with their huge appetite for growth but > at least, like sun paid their code production. > Mozilla, again, people working for free and AOL and its stock holders > harvest the results. Just imagine that Sun and adobe put 60 000$ to have a > better XML support for Mozilla. But in the end who will get the millions > rewards. And how much is 60 000$ compared to millions, just a sustenance > given to developers like lord would do in the middle ages with their serf. > Just think about it. I am not saying that Sun or Adobe are doing something > wrong but that the rules of the games or the odds are for the bank, not for > the developers :-) (if you allow my casino analogy). > Basically the actual free software movement seems to follow this pattern: > developers work for free (cheap labor), when testing and proof of concept is > done, someone comes into and reap the rewards and the money. Result, > developers got fun but a modern version of a lord reap the financial > rewards. Do we really want to replicate middle ages patterns? Next year will > be the next millenium, do you really want that kind of order in the future? > What about a world where people could get a just reward for their efforts. > All the efforts we are doing with XML may end up the same way. 
I do not > speak here for people already paid by W3C or big corpora but about > individual doing all the efforts with their own time, and therefore their > own money. > I agree wholeheartedly with this. Many Linux developers are so dedicated to the Linux platform mostly because they long for the day they see the demise of Microsoft, disheartened as they are by how Microsoft exploits the rest of the software industry. However, while doing so they forget that they are just being exploited by someone else. > > Here's the solution that friends and me came about. > Create a company where all participating developers would have stocks. Will > work like open software group but each participant would have ownership. > Customers would get a share too. In this case, we do like Red hat is doing, > packaging the code make it easy to install, document it and _sell_ it. Each > customer would have a stock too. So, when they buy the software, they also > have ownership. I don't know about this. How would you sell shares to customers when there are millions of them? How would you efficiently disburse dividends to these customers? When you go to your local software store and buy a copy of Red Hat, would the store have to issue stock to the customer? Even if you had a mail-in program this would still be a logistical nightmare. > So, the idea is: create a company where all participating developers would > have stocks and therefore ownership. Customers would also have stocks and > ownership but would have to buy the software to get ownership. A free > version could be downloaded for free trial. But people using the free trial > version would not have stocks. Perhaps the developers all having stock would not be so bad. You would be following a Waffle-House style of employee ownership of the company (100% of the stock is owned by the employees) or even, arguably, something more like the Goldman Sachs model, where if you get to be partner you get a certain percentage of the total company profits.
Those most dedicated would earn the most rewards. > Results: This time, developers could get a chance to get a return on their > efforts. Just imagine the power of a company having 20 000 owners. As big as > Microsoft! Try managing it and resolving disagreements. In this sense Microsoft or any other behemoth has the efficiency of dictatorship on their side. > Couple years ago, a group of artist came tired of seeing someone else get > all the rewards of their work and then founded United Artist. Then now, > today, what about a new company called "United Developers". Not a bad idea, but most developers I think are of the political attitudes that unionization is evil and that the laissez faire economics are the best way to go. > If the idea seems interesting to you, we can start a list server to discuss > about it and create a new kind of company. Again imagine what 20 000 ,50 000 > or even millions of owners can do. Just stop for a moment and think about > it. > Well, how much money would I be getting and also what if I work harder than the first 30,000 in the lot of 50,000. I would expect to get more compensation than some guy who has no clue what he is doing. The idea has some potential, but someone with some money is going to have to foot the bill initially. Any takers? Tyler xml-dev: A list for W3C XML Developers. 
From rbourret at ito.tu-darmstadt.de Fri Mar 26 16:02:06 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:10:38 2004
Subject: Modified DTDDeclHandler
Message-ID: <01BE77AA.3EFA4F90@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> Here's another attempt at the SAX2 DTDDeclHandler, adding element type
> declarations (the handlerID is http://xml.org/sax/handlers/dtd-decl):

[snip]

> public final static int MODEL_ELEMENTS = 1;
> public final static int MODEL_MIXED = 2;
> public final static int MODEL_ANY = 3;
> public final static int MODEL_EMPTY = 4;

Is it worth distinguishing between elements that can only contain PCDATA and elements that can contain both PCDATA and subelements? I realize that the XML spec doesn't have separate terms for these, but in real life they are very different. A PCDATA-only element is very close to an attribute, while an element containing PCDATA and elements is a very different beast altogether.

-- Ron Bourret
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Fri Mar 26 16:41:00 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:38 2004 Subject: MDServlet Message-ID: <199903261639.LAA05643@hesketh.net> I've built a small servlet front-end for MDSAX that's highly (perhaps too) configurable. My favorite feature is that you need to know pretty much nothing about SAX or MDSAX to make it work beyond a basic understanding of ContextML, which isn't nearly as difficult. Using this tool, you can use pretty much all the tools (filters) provided with MDSAX to manipulate documents before transmission, and you can add your own filters to MDSAX and control them from MDServlet. Details are available at http://www.simonstl.com/projects/mdservlet/index.html. For my next project, I'm hoping to build a factory class that'll make James Clark's XT easy to fit in the framework, so XSL transformations will be possible. (They aren't at present.) If anyone would like to contribute to that (especially if you've figured out how XT fits in a SAX-based environment), I'd love to hear from you. Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 26 17:18:06 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:38 2004 Subject: Modified DTDDeclHandler In-Reply-To: <01BE77AA.3EFA4F90@grappa.ito.tu-darmstadt.de> References: <01BE77AA.3EFA4F90@grappa.ito.tu-darmstadt.de> Message-ID: <14075.49450.220994.285269@localhost.localdomain> Ronald Bourret writes: > Is it worth distinguishing between elements that can only contain PCDATA > and elements that can contain both PCDATA and subelements? I realize that > the XML spec doesn't have separate terms for these, but in real life they > are very different. A PCDATA-only element is very close to an attribute, > while an element containg PCDATA and elements is a very different beast > altogether. You can distinguish that by looking at the normalised content model. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
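A rough sketch of the "look at the normalised content model" suggestion in the Modified DTDDeclHandler thread, written in Python for brevity. It assumes that "normalised" means the declared model with whitespace removed, and it relies only on the two mixed-content forms the XML 1.0 grammar allows: `(#PCDATA)` and `(#PCDATA|name|...)*`. The function name and the returned labels are invented for illustration and are not part of any SAX2 proposal.

```python
import re


def classify_mixed(model: str) -> str:
    """Classify a DTD content model string (hypothetical helper, not a SAX API)."""
    # Assumption: normalising here just means dropping all whitespace.
    compact = re.sub(r"\s+", "", model)
    # PCDATA-only: the element can hold text but no child elements.
    if compact in ("(#PCDATA)", "(#PCDATA)*"):
        return "pcdata-only"
    # True mixed content: #PCDATA followed by one or more element names.
    if re.fullmatch(r"\(#PCDATA(\|[^|()]+)+\)\*", compact):
        return "mixed"
    # Element content, ANY, EMPTY, or anything this sketch does not recognise.
    return "other"
```

A handler that only receives MODEL_MIXED could apply a check like this to the model string to recover the distinction Ron asks about.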
From Pzingg at imsisoft.com Fri Mar 26 17:26:01 1999
From: Pzingg at imsisoft.com (Peter Zingg)
Date: Mon Jun 7 17:10:38 2004
Subject: Maybe a naive question about XML Data
Message-ID: <4D0C1E192CE9D1119A6C00805FC1F8FA0120F800@EXCHANGE>

Let's say I develop software primarily for the Windows platform, in the consumer space. Let's say I'd like to get away from my products' proprietary file formats and use XML to allow the transfer of data between my applications, across the web, and into and out of databases. Why wouldn't I want to use the XML Data-derived schema language, data typing, etc., that Microsoft is using in Office 2000 and Internet Explorer 5?

I can think of a few reasons why not to use it:

* No published specification (that I can find, anyway). Microsoft's XML pages refer you to W3C activity on XML Data that's at least 15 months old, and that does not match up closely to the XML published by Office 2000.
* Using DTD instead of the Microsoft XML schema would allow my data to be validated by more parsers and tools than just the MS/DataChannel parser.
* Someone else's schema definition might be better (but from what I can see, there is only a request for comments by the competing factions, dated 2/15/99).

Then again, there are a few arguments in favor of using it:

* Microsoft and global domination. You can bet that all of the MS data access and programming tools (ADO, OLE DB, VB, VC++) will be built around it.
* Already in some kind of production today, even if it's not well documented.

What would you do if you wanted to commit to a company-wide XML strategy today?
Peter Zingg IMSI xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Fri Mar 26 17:26:29 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:10:38 2004 Subject: Modified DTDDeclHandler Message-ID: <01BE77B5.F8ACC250@grappa.ito.tu-darmstadt.de> David Megginson wrote: > > Is it worth distinguishing between elements that can only contain PCDATA > > and elements that can contain both PCDATA and subelements? I realize that > > the XML spec doesn't have separate terms for these, but in real life they > > are very different. A PCDATA-only element is very close to an attribute, > > while an element containg PCDATA and elements is a very different beast > > altogether. > > You can distinguish that by looking at the normalised content model. I knew you were going to say that :) The same is also true of the other content models. The parser can determine this very easily and I find it a worthwhile distinction. Any other takers? -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Fri Mar 26 17:33:55 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:38 2004 Subject: How about changing the rules? 
Message-ID: <001a01be77af$8a3a20c0$c8a8a8c0@thing1>

I prefer a model which works like a magazine:

1. You have a central theme, say SAX2 and the MDSAX2 component model.
2. There is an annual subscription fee, as well as charges for back issues and collections.
3. Authors/programmers can have any number of arrangements: regular columns, work-for-hire contributions, royalties based on circulation of a given issue, reprints, and inclusion in collections.

I've always thought authors had a better deal than programmers. But with things like PCs, Java, XML, and component-based programming, there is no real reason not to make the transition.

Of course, to add real value, we would want to include branding and testing in the model. Perhaps some kind of rating system.

Right now, JXML, Inc., a Delaware Corporation, is "between business models". I'm doing some work for The Open Group right now, but that's it. This might be an interesting vision. We'd need to grow JXML quite a bit to do it, but I'm open to suggestions.

Is this a reasonable model? How could it be improved? Any ideas on how we might best proceed? (Open Source, Open Standards, Open Business Models???)

Bill

From Paul_Tihansky at vanguard.com Fri Mar 26 17:36:45 1999
From: Paul_Tihansky at vanguard.com (Paul_Tihansky@vanguard.com)
Date: Mon Jun 7 17:10:38 2004
Subject: DTD Catalogs
Message-ID: <85256740.0060BBCE.00@vgi4mail.vanguard.com>

Does anybody know if any of the Java XML Parsers support catalog files?
For instance, if I put a Public Identifier in my DTD declaration without a URL, how would a parser such as XP find the DTD? How do I specify where the parser can find the catalog file?

Thanks,
Paul Tihansky

From tbray at textuality.com Fri Mar 26 17:48:51 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:10:38 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <3.0.32.19990326092254.00e4a604@pop.intergate.bc.ca>

At 08:54 PM 3/25/99 +0000, Dan Brickley wrote:
>Quite so. But there are still initiatives such as
>
> http://www.wapforum.org/docs/technical.htm
> http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf

I read some of it, and if you buy the idea that a binary form of XML is useful, it seems quite sensible. I'm agnostic; if they think they need it who are we to tell them they don't? Obviously it has to round-trip with plain ole XML. -T.
From tbray at textuality.com Fri Mar 26 17:48:53 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:10:38 2004
Subject: XML and (K)Office
Message-ID: <3.0.32.19990326092935.00e4a604@pop.intergate.bc.ca>

At 09:20 AM 3/26/99 +1000, James Robertson wrote:
>Without the rigour of a DTD, we've got nothing.

This sentiment is not universally shared. While DTDs are extremely useful and should be constructed as (a small) part of any serious language-design effort, they are in some cases unnecessary (for validation, full-text indexing, and lots of other things) and in other cases insufficient - DTD validation never comes close to real business-logic validation.

I am near-schizophrenic these days, running around telling people that yes, they should use DTDs, and simultaneously warning them that there are situations where they fail to be either necessary or sufficient; the kind of mystico-religious attitude above does not help.

>How will future users make sense of the format without
>a DTD?

And what, pray tell, part of a DTD helps you "make sense" of a format? -Tim
From b.laforge at jxml.com Fri Mar 26 17:50:15 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:10:38 2004
Subject: Modified DTDDeclHandler
Message-ID: <002301be77b0$586954c0$c8a8a8c0@thing1>

From: Ronald Bourret
>Is it worth distinguishing between elements that can only contain PCDATA
>and elements that can contain both PCDATA and subelements? I realize that
>the XML spec doesn't have separate terms for these, but in real life they
>are very different. A PCDATA-only element is very close to an attribute,
>while an element containing PCDATA and elements is a very different beast
>altogether.

One advantage of not making the distinction is that you subsequently have a greater freedom to qualify the data held by an element by adding child elements--one of the advantages of content over attributes.

As a programmer, I agree with you. I'd like the distinction. But when I think about how an application might mature with time, I'd rather the implementation not make that distinction!

Bill

From richard at goon.stg.brown.edu Fri Mar 26 17:54:19 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:10:38 2004
Subject: Why doesn't XML have Bag?
References: <002f01be7785$e507c540$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <36FBC99C.28D97D7A@goon.stg.brown.edu>

Leigh Dodds wrote:

> Isn't this an unordered list of elements?
>
>
>

These are equivalent (although the bottom one might be considered 'ambiguous' in SGML terms). I suspect you'll convey your intentions a lot better if you use: >

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From tbray at textuality.com Fri Mar 26 17:55:54 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:10:38 2004
Subject: Maybe a naive question about XML Data
Message-ID: <3.0.32.19990326095710.00e79c08@pop.intergate.bc.ca>

At 09:20 AM 3/26/99 -0800, Peter Zingg wrote:
>Let's say I develop software primarily for the Windows platform...
> Why
>wouldn't I want to use the XML Data-derived schema language, data typing,
>etc., that Microsoft is using in Office 2000 and Internet Explorer 5?
...
>What would you do if you wanted to commit to a company-wide XML strategy
>today?

Use DTDs; they will help with a small percentage of your business-logic validation and you'll have to write code to do the rest. When next-gen schemas come along, they'll cover a somewhat larger portion of your business-logic validation in a nice declarative way, and you'll be able to retire some of your code. But not all. -Tim
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Mar 26 18:05:09 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:38 2004 Subject: Why doesn't XML have Set? References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> <36FB8C85.BEFC8103@mitre.org> Message-ID: <36FBC3C5.AC693F16@prescod.net> "Roger L. Costello" wrote: > > Then let me ask another question - why do DTDs not allow me to specify > an unordered list of elements? For example, > > > > With this notation I am trying to indicate that an XML document that > conforms to this DTD must have a element which has three child > elements - , , and , and these child elements > can be in any order. Isn't this a useful thing? Is it useful? The author or text generator has been given no new flexibility about *what* to write, only the order. What would they indicate through the order, that the sink is "more important" than the stove? That's a stretch. Allowing things in any order may be a convenience for the generator but it will very seldom allow anything interesting to be expressed. And it is an inconvenience for the consumer because now the processing app has to walk around the tree to find out where the Stove is rather than just going to the second child element. > I have had a number of times where I wish that I could do this. That wish usually goes away after a while. You start to wonder if there is really any benefit in complicating your document type and creating more work for yourself without making the language more expressive. 
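To make the consumer-side cost concrete, here is a small Python sketch. The element names (`Kitchen`, `Sink`, `Stove`, `Refrigerator`) are hypothetical stand-ins, since the declarations in the quoted message did not survive in the archive; the parsing calls are from Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: same children, different order.
ORDERED = "<Kitchen><Sink/><Stove/><Refrigerator/></Kitchen>"
SHUFFLED = "<Kitchen><Refrigerator/><Sink/><Stove/></Kitchen>"


def stove_by_position(xml_text):
    """Fixed order: assume the Stove is always the second child."""
    return ET.fromstring(xml_text)[1]


def stove_by_search(xml_text):
    """Any order: scan the children for the first Stove element."""
    return ET.fromstring(xml_text).find("Stove")
```

With a fixed order the positional lookup is enough; once any order is allowed, the positional lookup silently returns the wrong element on `SHUFFLED`, and the consumer has to search instead.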
-- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Fri Mar 26 18:12:14 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:10:38 2004 Subject: Is there anyone working on a binary version of XML? In-Reply-To: <3.0.32.19990326092254.00e4a604@pop.intergate.bc.ca> Message-ID: <002201be77b3$39760600$1b19da18@ne.mediaone.net> Tim Bray wrote: > > At 08:54 PM 3/25/99 +0000, Dan Brickley wrote: > >Quite so. But there are still initiatives such as > > > > http://www.wapforum.org/docs/technical.htm > > http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf > > I read some of it, and if you buy the idea that a binary form of XML > is useful, it seems quite sensible. I'm agnostic; if they think they > need it who are we to tell them they don't? Obviously it has to > round-trip with plain ole XML. -T. > I think what this really is, when you strip out the concept of binary XML, is a suggestion for a compression format tuned for markup streams. There are two distinct issues 1) efficiency of parsing 2) compactness. A standard compression format for XML (ala zip,gzip etc) would be for bandwidth limited applications. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 26 18:14:23 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:38 2004 Subject: Maybe a naive question about XML Data In-Reply-To: <4D0C1E192CE9D1119A6C00805FC1F8FA0120F800@EXCHANGE> References: <4D0C1E192CE9D1119A6C00805FC1F8FA0120F800@EXCHANGE> Message-ID: <14075.51958.839689.378413@localhost.localdomain> Peter Zingg writes: > Microsoft and global domination. You can bet that all of the MS > data access and programming tools (ADO, OLE DB, VB, VC++) will be > built around it. I wouldn't make any such bet. I'm not a Windows developer myself, but I've heard a lot of grumbling about MS abandoning its own technologies frequently and with little or no notice. > What would you do if you wanted to commit to a company-wide XML > strategy today? No competent system architect should ever design a system architecture around vendor-specific interfaces and specs except in the direst need (and even then, she's probably better to quit and try to salvage what's left of her reputation). If you use vendor-specific stuff, move it to behind generic interfaces where it can easily be changed without damaging the rest of the system; otherwise, it will be Microsoft (or Sun or IBM or Adobe or Texcel or what have you) who will be deciding the future evolution, maintenance schedule, and lifespan of your system for you, and you'll just be a helpless spectator. So far, that's all system-architecture motherhood and apple pie (or social welfare and poutine, up here in Central Canada). The less obvious point is that open standards like XML, CORBA, etc. 
also really don't belong in the high-level system design: they should have nothing to do with *what* your system does, only with *how* your system does it, and that's an implementation detail.

If there are parts of a planned or existing system that could benefit from using XML in their implementations, then by all means, introduce some XML. Start small to see if and how you're getting a real benefit from the XML, then gradually introduce XML into other parts of the system until you feel confident that you're getting the most benefit from it.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From paul.janssens at skynet.be Fri Mar 26 18:15:53 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun 7 17:10:38 2004
Subject: Why doesn't XML have Bag?
References: <008c01be779c$17c5e000$38afdccf@ix.netcom.com>
Message-ID: <36FBCE39.4D49@skynet.be>

Frank Boumphrey wrote:
>
> In SGML you can put
> (a&b&c) which means that each element must appear only once but in any
> order.
...
>
> My understanding was that it was omitted because of the requirement
>
> "XML software shall be easy to write"
>
> It takes only a few lines of C code to validate the second requirement but a
> LOT more to validate the first.
>

How about expanding it as you parse the DTD (bottom-up coding):

    /* Rewrite (a & b) as ((a, b) | (b, a)); the second branch uses clones
       so that the two branches do not share nodes. */
    node ampersand(node a, node b) {
        return or(concat(a, b), concat(clone(b), clone(a)));
    }

That's just adding three lines to your code.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Fri Mar 26 18:24:49 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:10:38 2004 Subject: Megginson's Spelling Message-ID: <87256740.0064FE57.00@d53mta03h.boulder.ibm.com> >3. As Donne wrote (cited in the OED), "Busie old foole, unruly > sunne, ... Sawcy pedantique wretch, goe chide Late schooleboyes" > (see what I mean about spelling?). > Wasn't "Pedantique" a movie where Sharon Stone walked around lightly clothed and killed her husband because of his horrible spelling? xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Mar 26 18:25:52 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:38 2004 Subject: Why doesn't XML have Bag? References: <002f01be7785$e507c540$ab20268a@pc-lrd.bath.ac.uk> <36FBC99C.28D97D7A@goon.stg.brown.edu> Message-ID: <36FBCFF0.B6A60D79@prescod.net> "Richard L. Goerwitz" wrote: > > Leigh Dodds wrote: > > > Isn't this an unordered list of elements? > > > > > > > > These are equivalent (although the bottom one might be considered > 'ambiguous' in SGML terms). Actually, it would not. Any particular content node list can only present a single path through the content model. 
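For readers following the `&` discussion in this thread, here is a hedged Python sketch of the bottom-up expansion idea, generalised from two operands to any number. The function name `expand_all_group` is made up for illustration. It emits ordinary DTD sequence/choice syntax and factors on the first element, so that, assuming the names are distinct, every choice point begins with a different name and a given child list can follow only one path.

```python
def expand_all_group(names):
    """Rewrite an SGML-style (a & b & ...) group using only ',' and '|'.

    Hypothetical helper: 'names' is a list of distinct element names.
    """
    if len(names) == 1:
        return names[0]
    alternatives = []
    for i, first in enumerate(names):
        # Choose one element to come first, then expand the rest recursively.
        rest = names[:i] + names[i + 1:]
        alternatives.append("(" + first + "," + expand_all_group(rest) + ")")
    return "(" + "|".join(alternatives) + ")"
```

For two names this gives `((a,b)|(b,a))`. The number of leaf sequences grows factorially with the number of names, which is one practical reason the expansion is only attractive for small groups.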
-- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Fri Mar 26 18:37:04 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:39 2004 Subject: Is there anyone working on a binary version of XML? References: <002201be77b3$39760600$1b19da18@ne.mediaone.net> Message-ID: <36FBDB92.56336820@lig.net> Jonathan Borden wrote: > Tim Bray wrote: > > > > At 08:54 PM 3/25/99 +0000, Dan Brickley wrote: > > >Quite so. But there are still initiatives such as > > > > > > http://www.wapforum.org/docs/technical.htm > > > http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf > > > > I read some of it, and if you buy the idea that a binary form of XML > > is useful, it seems quite sensible. I'm agnostic; if they think they > > need it who are we to tell them they don't? Obviously it has to > > round-trip with plain ole XML. -T. > > > > I think what this really is, when you strip out the concept of binary XML, > is a suggestion for a compression format tuned for markup streams. > > There are two distinct issues 1) efficiency of parsing 2) compactness. A > standard compression format for XML (ala zip,gzip etc) would be for > bandwidth limited applications. I agree. 
I feel they can be solved with a similar solution in at least some circumstances. Rather, there are some straightforward ways to achieve compression that actually make efficiency worse, while some solutions for efficiency also make compression easier.

In fact there are a number of levels you could go to with compression:

optional gzip/bzip2, possibly preceded by:

* Dictionary compression (various forms of building a list of commonly used terms or all terms in the current document/stream or some combination)
* 'Priming' for certain circumstances.

For instance, I've long thought that an ideal design for super high bandwidth circuits (TCP connection, message queue, special purpose) is to essentially start out with a raw state where you send, once per connection/conversation, all of the XML or other full self describing data (a DTD is an expression of this) and possibly even a dictionary built from past experience and then highly compress the rest of the stream based on the defined base. In some circumstances you could even have a base 'dictionary' stored on each receiver to improve short messages. Each further transaction could use all of the known information to compress in a layered way.

There are plenty of circumstances where a connection is made and many messages are sent, sometimes millions per connection. I've had servers that normally handled 30-50 million messages/day.

Both careful structuring of the data (a la bXML) and things like parallel inheritance deltas play into this kind of optimization.

sdw

> Jonathan Borden
> http://jabr.ne.mediaone.net
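One way to try out the 'priming' idea is zlib's preset-dictionary feature, exposed in Python's standard `zlib` module through the `zdict` argument. This is only a sketch of the general technique, not the scheme described in the message: the dictionary contents and the message format are invented, and both ends are assumed to have agreed on the dictionary beforehand.

```python
import zlib

# Hypothetical dictionary: markup that sender and receiver share in advance.
SHARED_DICT = b'<order id=""><customer></customer><item sku="" qty=""/></order>'


def pack(message, zdict=b""):
    """Compress one message, optionally primed with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(message) + comp.flush()


def unpack(blob, zdict=b""):
    """Decompress one message; the same dictionary must be supplied."""
    if zdict:
        decomp = zlib.decompressobj(zdict=zdict)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()
```

Short messages benefit most, because a lone small document gives the compressor almost nothing to refer back to; the preset dictionary supplies that history up front.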
To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) -- OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st 43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jonathan at texcel.no Fri Mar 26 19:00:19 1999 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 17:10:39 2004 Subject: ANNOUNCE: XQL Mailing List, XQL FAQ Message-ID: <3.0.3.32.19990326140131.030c5100@pop.mindspring.com> I have just set up a mailing list for XQL (XML Query Language). This list is intended to answer questions about the definition of the language, how to implement it, who has implemented it in what products, and whatever else seems to be of interest. I will also use this list to try to reach consensus in the XQL community if decisions need to be made, eg to add new extensions. The XQL FAQ may be found here: http://metalab.unc.edu/xql/ It contains a link to the mailing list, but you can also access the mailing list directly here: http://franklin.oit.unc.edu/cgi-bin/lyris.pl?enter=xql Hope this is helpful! 
Jonathan Jonathan Robie R&D Fellow, Software AG jonathan.robie@sagus.com <- this address will be active Monday xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Fri Mar 26 19:06:10 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:39 2004 Subject: XML and (K)Office In-Reply-To: <3.0.32.19990326092935.00e4a604@pop.intergate.bc.ca> Message-ID: <199903261905.OAA10928@hesketh.net> >And what, pray tell, part of a DTD helps you "make sense" of a >format? -Tim The comments, of course! (Which is a large part of why DDML provided explicit space for documentation.) Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From derekdb at microsoft.com Fri Mar 26 19:19:01 1999 From: derekdb at microsoft.com (Derek Denny-Brown) Date: Mon Jun 7 17:10:39 2004 Subject: how to print the XML document in IE 5.0 Message-ID: <8B57882C41A0D1118F7100805F9F68B506F1BF00@RED-MSG-45> Not to be picky, but... The "Save-As" option in IE5 for XML documents _does_ save the XML. 
-derek -----Original Message----- From: Matthew Sergeant (EML) [mailto:Matthew.Sergeant@eml.ericsson.se] It appears that IE5 converts internally to HTML (with the XSL style sheet), so the answer is that you can't. Even a save to disk saves the HTML AFAIK. Try using Mozilla - it does things right, and displays XML+XSL remarkably well considering it's at least 6 months away from release. > -----Original Message----- > From: Kevin Hsu [SMTP:shyutz@ms1.hinet.net] > > Can anyone tell me how to print the XML document as I see on the screen in > IE 5.0, thanks in advance. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Fri Mar 26 19:30:31 1999 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:10:39 2004 Subject: Maybe a naive question about XML Data Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF1FD@RED-MSG-08> Peter Zinqq asked whether or not to use the XML-Data schema notation shipped with IE5. That depends on your needs and timeframes. The IE5 MSXML parser supports both DTD and XML-Data. DTDs are supported by a wider range of parsers, so you have a greater degree of interop, if replacing parsers is important to you. XML-Data uses XML syntax and supports namespaces and datatypes, if that is important to you. I expect that future MSXML parsers will continue to support notations. But that brings me to mention the work going on in the W3C: The XML schemas activity is working on defining the next generation of schema notation, and the shopping list of features includes all the features presently available from XML-Data in MS IE5, and more. 
(I expect that future MSXML parsers will support the future schema notation.) So a lot depends on the exact features you need and where (MSIE or other parsers) and when (now or later) you need them. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Fri Mar 26 19:33:36 1999 From: jes at kuantech.com (Jeffrey E. Sussna) Date: Mon Jun 7 17:10:39 2004 Subject: how to print the XML document in IE 5.0 In-Reply-To: <002401be7757$f99675c0$15cd4acb@flag.com.tw> Message-ID: <000901be77bf$31695c80$5118a8c0@kuantech1.quokka.com> I am confused by the responses to this question. I selected the Print command from the File menu in IE5 final and it printed just fine. I was looking at a raw XML file with no formatting commands of any kind. Jeff -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Kevin Hsu Sent: Thursday, March 25, 1999 10:55 PM To: XML Developers' List Subject: how to print the XML document in IE 5.0 Can anyone tell me how to print the XML document as I see on the screen in IE 5.0, thanks in advance. Kevin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990326/45950717/attachment.htm From Mark.Birbeck at iedigital.net Fri Mar 26 20:16:58 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:10:39 2004 Subject: Maybe a naive question about XML Data Message-ID: I think also the fact that regardless of what standard is adopted the presence of namespaces and the fact that definitions can be 'open' is something well worth getting the hang of now. I'm using the IE5 stuff knowing full well that it may change, because these concepts are not present in DTDs, and so it's the only way to experiment with them. BTW, since everything that can be represented in a DTD can be represented in XML-Data, then you could just transform your XML to DTDs when it leaves your system. Then, as another writer said, you hide the specifics behind a general interface. Regards, Mark > -----Original Message----- > From: Andrew Layman > Sent: 26 March 1999 19:29 > To: 'xml-dev@ic.ac.uk' > Subject: RE: Maybe a naive question about XML Data > > > Peter Zinqq asked whether or not to use the XML-Data schema > notation shipped > with IE5. That depends on your needs and timeframes. The > IE5 MSXML parser > supports both DTD and XML-Data. DTDs are supported by a > wider range of > parsers, so you have a greater degree of interop, if > replacing parsers is > important to you. XML-Data uses XML syntax and supports > namespaces and > datatypes, if that is important to you. I expect that future > MSXML parsers > will continue to support notations. > > But that brings me to mention the work going on in the W3C: > The XML schemas > activity is working on defining the next generation of schema > notation, and > the shopping list of features includes all the features > presently available > from XML-Data in MS IE5, and more. (I expect that future > MSXML parsers will > support the future schema notation.) 
> > So a lot depends on the exact features you need and where > (MSIE or other > parsers) and when (now or later) you need them. > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From martind at netfolder.com Fri Mar 26 20:45:08 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:10:39 2004 Subject: How about changing the rules? In-Reply-To: <001a01be77af$8a3a20c0$c8a8a8c0@thing1> Message-ID: Hi Bill, This is a very interesting model. I'll give it some thoughts. This is fresh air Bill. Thanks again Regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Bill la Forge Sent: Friday, March 26, 1999 12:39 PM To: Tyler Baker; Didier PH Martin Cc: 'XML Dev' Subject: Re: How about changing the rules? I prefer a model which works like a magazine: 1. You have a central theme, say SAX2 and the MDSAX2 component model. 2. There is an annual subscription fee, as well as charges for back issues and collections. 3. 
Authors/programmers can have any number of arrangements: regular columns, work-for-hire contributions, royalties based on circulation of a given issue, reprints, and inclusion in collections.

I've always thought authors had a better deal than programmers. But with things like PCs, Java, XML, and component-based programming, there is no real reason not to make the transition.

Of course, to add real value, we would want to include branding and testing into the model. Perhaps some kind of rating system.

Right now, JXML, Inc., a Delaware Corporation, is "between business models". I'm doing some work for The Open Group right now, but that's it. This might be an interesting vision. We'd need to grow JXML quite a bit to do it, but I'm open to suggestions.

Is this a reasonable model? How could it be improved? Any ideas on how we might best proceed? (Open Source, Open Standards, Open Business Models???)

Bill
From david at megginson.com Fri Mar 26 20:54:10 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:39 2004
Subject: Document Schemas and Documentation (was: RE: XML and (K)Office)
In-Reply-To: <199903261905.OAA10928@hesketh.net>
References: <3.0.32.19990326092935.00e4a604@pop.intergate.bc.ca> <199903261905.OAA10928@hesketh.net>
Message-ID: <14075.57965.905043.828920@localhost.localdomain>

Simon St.Laurent writes:

> >And what, pray tell, part of a DTD helps you "make sense" of a
> >format? -Tim
>
> The comments, of course! (Which is a large part of why DDML provided
> explicit space for documentation.)

Ain't that the truth. As I pointed out to the DDML designers a while back, though, every XML element and attribute needs three types of documentation:

1. an XML 1.0 element or attribute name (e.g. "a");
2. a human-readable title (e.g. "Hypertext Anchor"); and
3. a proper description (probably including paragraphs, examples, tables, etc.).

Some people might also add a brief, one-sentence description in-between (2) and (3). Items (2) and (3) also need to be localizable, possibly by allowing repetition coupled with xml:lang.

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Unak at Level3.com Fri Mar 26 21:07:44 1999 From: Mark.Unak at Level3.com (Mark.Unak@Level3.com) Date: Mon Jun 7 17:10:39 2004 Subject: unsubscribe xml-dev Message-ID: <6DD3824BDF75D211930E0008C71EC92001B2998D@l3lsvlmail02.l3.com> unsubscribe xml-dev indiketr@churchill.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Unak at Level3.com Fri Mar 26 21:09:19 1999 From: Mark.Unak at Level3.com (Unak, Mark) Date: Mon Jun 7 17:10:39 2004 Subject: unsubscribe xml-dev Message-ID: <6DD3824BDF75D211930E0008C71EC92001B2998E@l3lsvlmail02.l3.com> unsubscribe xml-dev Mark.Unak@Level3.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.S.Brothers at EMCIns.Com Fri Mar 26 21:26:35 1999 From: Michael.S.Brothers at EMCIns.Com (Michael S. 
Brothers) Date: Mon Jun 7 17:10:39 2004 Subject: Megginson's Spelling In-Reply-To: <87256740.0064FE57.00@d53mta03h.boulder.ibm.com> Message-ID: On Fri, 26 Mar 1999 11:23:04 -0700 roddey@us.ibm.com wrote: > > > > >3. As Donne wrote (cited in the OED), "Busie old foole, unruly > > sunne, ... Sawcy pedantique wretch, goe chide Late schooleboyes" > > (see what I mean about spelling?). > > > > Wasn't "Pedantique" a movie where Sharon Stone walked around lightly > clothed and killed her husband because of his horrible spelling? > And, I believe she also killed her husband's best friend because of his making obscure movie references. Diabolical wrench. > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > ---------------------- Michael S. Brothers Michael.S.Brothers@EMCIns.com 515-362-7473 At this point, I don't think that's the best option. xml-dev: A list for W3C XML Developers. 
From david at megginson.com Fri Mar 26 22:23:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:39 2004
Subject: SAX2: Proposed alternative DTD interface
Message-ID: <14076.1733.365295.427943@localhost.localdomain>

Here's another alternative for SAX2: forget about trying to report DTD declarations as events, and simply make the whole DTD available through an interface with a Parser2.get() call.

I threw together a quick (read-only) DTD interface this morning, and uploaded it to the following location

  http://www.megginson.com/SAX/sax2dtd-19990326.zip

The package consists of the following interfaces (and exception class) in the org.xml.sax.dtd package:

  Attribute extends DTDComponent
  ContentGroup extends ContentParticle
  ContentParticle
  ContentParticleIterator
  ContentToken extends ContentParticle
  DTD
  DTDComponent
  DTDComponentIterator
  DTDException extends java.lang.Exception
  Element extends DTDComponent
  Entity extends DTDComponent
  Notation extends DTDComponent

The interface itself is pretty small -- the compiled class files add up to just over 4K -- and a SAX application would get the information like this:

  try {
    DTD dtd = (DTD)parser.get("http://xml.org/sax/props/dtd");
  } catch (SAXNotSupportedException e) {
    // ...
  }

This would print out the names of all of the declared elements:

  DTDComponentIterator it = dtd.getElements();
  while (it.hasMoreMembers()) {
    System.out.println(((Element)(it.getNextMember())).getName());
  }

etc., etc.

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/
From david at megginson.com Fri Mar 26 22:44:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:40 2004
Subject: SAX2: Proposed alternative DTD interface
In-Reply-To: <14076.1733.365295.427943@localhost.localdomain>
References: <14076.1733.365295.427943@localhost.localdomain>
Message-ID: <14076.3465.903408.98435@localhost.localdomain>

David Megginson writes:

> This would print out the names of all of the declared elements:
>
>   DTDComponentIterator it = dtd.getElements();
>   while (it.hasMoreMembers()) {
>     System.out.println(((Element)(it.getNextMember())).getName());
>   }
>
> etc., etc.

If people find this interesting, we might want to rewrite it to use the Java 2 collection classes (and the C++ STL in a C++ port, etc). I am a little wary of forcing all users to have upgraded to JDK 1.2, but that's a separate discussion (most DTD-related work would be server-side anyway).

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From fmclain at cdgpd.com Fri Mar 26 23:04:45 1999 From: fmclain at cdgpd.com (Fred McLain) Date: Mon Jun 7 17:10:40 2004 Subject: Important Message From Fred McLain Message-ID: <5FFEC1B73A7BD1119D56006008C369F30ED3CA@rainier.cdgpd.com> Here is that document you asked for ... don't show anyone else ;-) -------------- next part -------------- A non-text attachment was scrubbed... Name: list1.doc Type: application/msword Size: 40960 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990326/8e7376c4/list1.doc From fmclain at cdgpd.com Fri Mar 26 23:04:50 1999 From: fmclain at cdgpd.com (Fred McLain) Date: Mon Jun 7 17:10:40 2004 Subject: Important Message From Fred McLain Message-ID: <5FFEC1B73A7BD1119D56006008C369F30ED3CF@rainier.cdgpd.com> Here is that document you asked for ... don't show anyone else ;-) -------------- next part -------------- A non-text attachment was scrubbed... Name: list1.doc Type: application/msword Size: 40960 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990326/b7e08a5f/list1.doc From jonathan at texcel.no Fri Mar 26 23:30:49 1999 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 17:10:40 2004 Subject: Important Message From Fred McLain In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3CF@rainier.cdgpd.com> Message-ID: <3.0.3.32.19990326183147.0338ec50@pop.mindspring.com> At 03:03 PM 3/26/99 -0800, Fred McLain wrote: >Here is that document you asked for ... 
don't show anyone else ;-) > >Attachment Converted: "D:\pipeplus\DOWNLOAD\list1.doc" This document has macros in it - it could well contain a virus. Jonathan Jonathan Robie R&D Fellow, Software AG jonathan.robie@sagus.com <- this address will be active Monday xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jmg at trivida.com Fri Mar 26 23:45:14 1999 From: jmg at trivida.com (Jeff Greif) Date: Mon Jun 7 17:10:40 2004 Subject: virus alert!!! Re: Important Message From Fred McLain References: <5FFEC1B73A7BD1119D56006008C369F30ED3CF@rainier.cdgpd.com> Message-ID: <075401be77e2$6a5bfc00$a24630d1@trivida.com> Fred, This message that you just sent to the recipients below contains an attachment with an MS Word Macro virus. Earlier today I got the same thing from someone else. Apparently, the macros installed by the virus when you open the attachment sends the 'Important message' to everyone in your address book or contact list, thus spreading it pretty fast. It would be good if you warned your correspondents not to open the attachment. Apparently the virus tries to run MS Outlook to re-distribute itself; I was lucky (I hope) and all attempts to send it onward failed (with error message box) since I don't have Outlook properly installed and don't use it. Jeff ----- Original Message ----- From: Fred McLain To: ; Sent: Friday, March 26, 1999 3:03 PM Subject: Important Message From Fred McLain > Here is that document you asked for ... don't show anyone else ;-) > > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jmg at trivida.com Fri Mar 26 23:54:43 1999 From: jmg at trivida.com (Jeff Greif) Date: Mon Jun 7 17:10:40 2004 Subject: What McAfee says about new Word Macro virus (I've received it twice today already!!) Message-ID: <076a01be77e3$b399b870$a24630d1@trivida.com> Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 43 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990326/4f6b4f9d/attachment.gif From richard at cogsci.ed.ac.uk Sat Mar 27 00:02:23 1999 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:10:40 2004 Subject: virus alert!!! Re: Important Message From Fred McLain In-Reply-To: Jeff Greif's message of Fri, 26 Mar 1999 15:43:14 -0800 Message-ID: <15359.199903270001@doyle.cogsci.ed.ac.uk> > It would be > good if you warned your correspondents not to open the attachment. This suggests that the message was sent in good faith, something I find hard to believe. People sending genuine messages to mailing lists don't say "Here is that document you asked for ... don't show anyone else". I've mailed abuse@cdgpd.com but for all I know they are spammers themselves. -- Richard xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From fmclain at cdgpd.com Sat Mar 27 00:10:26 1999 From: fmclain at cdgpd.com (Fred McLain) Date: Mon Jun 7 17:10:40 2004 Subject: Virus in my last e-mail Message-ID: <5FFEC1B73A7BD1119D56006008C369F30ED3D3@rainier.cdgpd.com> Folks, The last e-mail I sent had a virus in the attached word document. PLEASE don't open the document. In our office it caused Outlook 98 to autosend itself to everyone on our address lists, turned off virus checking in word (tools/options/general/macro virus protection), and modified the default template normal.dot. Sorry! <> -------------- next part -------------- A non-text attachment was scrubbed... Name: Fred McLain.vcf Type: application/octet-stream Size: 420 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990327/ac21d9d8/FredMcLain.obj From jgarrett at navix.net Sat Mar 27 00:17:43 1999 From: jgarrett at navix.net (Jim Garrett) Date: Mon Jun 7 17:10:40 2004 Subject: Important Message From Fred McLain In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3CF@rainier.cdgpd.com> Message-ID: <000601be77e4$fe30c350$58c8c8c8@jgp400> Fred: Please convert your Word Doc file w/ Macros into HTML so we can view what you don't want us to see... Thanks jg |-----Original Message----- |From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of |Fred McLain |Sent: Friday, March 26, 1999 5:04 PM |To: 'dkrylov@cgxpress.com'; 'xml-dev@ic.ac.uk' |Subject: Important Message From Fred McLain | | |Here is that document you asked for ... don't show anyone else ;-) | | xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jeremy at omsys.com Sat Mar 27 00:37:07 1999 From: jeremy at omsys.com (Jeremy H. Griffith) Date: Mon Jun 7 17:10:40 2004 Subject: virus alert!!! Re: Important Message From Fred McLain In-Reply-To: <15359.199903270001@doyle.cogsci.ed.ac.uk> References: <15359.199903270001@doyle.cogsci.ed.ac.uk> Message-ID: <373926a9.446264444@smtp.omsys.com> On Sat, 27 Mar 1999 00:01:43 GMT, Richard Tobin wrote: >> It would be >> good if you warned your correspondents not to open the attachment. > >This suggests that the message was sent in good faith, something I >find hard to believe. People sending genuine messages to mailing >lists don't say "Here is that document you asked for ... don't show >anyone else". I've mailed abuse@cdgpd.com but for all I know they >are spammers themselves. Yeesh. The *worm* sent the message, just like with happy99. All poor Fred did was open it, which in some mailers is automatic... That is the nature of a worm; it sends itself on. Get used to it. And make sure you never, ever, load a Word/PowerPoint/Excel doc in such a way that the Auto macros can run... which is real hard to avoid... --Jeremy H. Griffith http://www.omsys.com/jeremy/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Sat Mar 27 00:47:12 1999 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:10:40 2004 Subject: virus alert!!! Re: Important Message From Fred McLain In-Reply-To: Jeremy H. Griffith's message of Sat, 27 Mar 1999 00:37:34 GMT Message-ID: <15392.199903270046@doyle.cogsci.ed.ac.uk> > Yeesh. The *worm* sent the message, just like with happy99. All > poor Fred did was open it, which in some mailers is automatic... > That is the nature of a worm; it sends itself on. Get used to it. Fortunately I don't have to, I don't use MS Windows :-) I hadn't realised that there was a way for Windows viruses to find what mailing lists you used. -- Richard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Marc.McDonald at Design-Intelligence.com Sat Mar 27 01:16:55 1999 From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com) Date: Mon Jun 7 17:10:40 2004 Subject: XML and (K)Office Message-ID: I think the conflict is caused by the concept of a valid document as opposed to parsing according to a DTD. Validity introduced a useful concept, but perhaps we should divorce it from parsing. A 'valid' document meets the requirements of a DTD. 
When we talk about not needing to validate a document, we are assuming that it has already been validated when it was created, so why waste the time doing it again?

Perhaps another way to view it is to say that a document has been certified against a particular specification. Currently, this specification is a DTD. But what if the specification were more abstract, say a URI? As with namespaces, there may be an agreed DTD associated with the URI (the agreement is human convention) or the specification could be non-DTD based (this document conforms to IRS/1999/ScheduleD). Applications that produce or consume documents may use DTDs or any other form to describe agreed structure.

This would separate validity from parsing according to a DTD - validity is certification of conformance to whatever a URI has been agreed to represent. Validity is then not a method of parsing but a certificate of conformance.

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com

----------
From: Tim Bray [SMTP:tbray@textuality.com]
Sent: Friday, March 26, 1999 9:52 AM
To: James Robertson; XML Developers' List
Subject: RE: XML and (K)Office

At 09:20 AM 3/26/99 +1000, James Robertson wrote:
>Without the rigour of a DTD, we've got nothing.

This sentiment is not universally shared. While DTDs are extremely useful and should be constructed as (a small) part of any serious language-design effort, they are in some cases unnecessary (for validation, full-text indexing, and lots of other things) and in other cases insufficient - DTD validation never comes close to real business-logic validation. I am near-schizophrenic these days, running around telling people that yes, they should use DTDs, and simultaneously warning them that there are situations where they fail to be either necessary or sufficient; the kind of mystico-religious attitude above does not help.

>How will future users make sense of the format without
>a DTD?
And what, pray tell, part of a DTD helps you "make sense" of a format? -Tim

From jgarrett at navix.net Sat Mar 27 01:45:07 1999
From: jgarrett at navix.net (Jim Garrett)
Date: Mon Jun 7 17:10:40 2004
Subject: Virus in my last e-mail - Fred's attached Outlook profile ??
In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3D3@rainier.cdgpd.com>
Message-ID: <000001be77f2$afe6ebd0$58c8c8c8@jgp400>

Can your attached Outlook profile execute VIRUS macros... How do we know that "that" doesn't also contain a Virus...??

|-----Original Message-----
|From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
|Fred McLain
|Sent: Friday, March 26, 1999 6:09 PM
|To: 'xml-dev@ic.ac.uk'; 'dkrylov@cgxpress.com'
|Subject: Virus in my last e-mail
|
|
|Folks,
|
|The last e-mail I sent had a virus in the attached word document. PLEASE
|don't open the document. In our office it caused Outlook 98 to autosend
|itself to everyone on our address lists, turned off virus checking in word
|(tools/options/general/macro virus protection), and modified the default
|template normal.dot.
|
|Sorry!
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Sat Mar 27 01:52:22 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:41 2004 Subject: virus alert!!! Re: Important Message From Fred McLain References: <15392.199903270046@doyle.cogsci.ed.ac.uk> Message-ID: <36FC41B9.D2BB120A@lig.net> I would outlaw the sending and receipt of all MS document formats if a company really wanted security. Even a .txt file, if it contains "rich text" and you use Word as your default viewer, can contain a windows binary that can be executed. I've received zillions of these in the last several years and many other trojan horses. I don't even look at Word/Excel, etc. documents from people I don't know. It really is unbelievable how little security MS designed software has. Viva XML.... And Java. sdw Richard Tobin wrote: > > Yeesh. The *worm* sent the message, just like with happy99. All > > poor Fred did was open it, which in some mailers is automatic... > > That is the nature of a worm; it sends itself on. Get used to it. > > Fortunately I don't have to, I don't use MS Windows :-) > > I hadn't realised that there was a way for Windows viruses to find > what mailing lists you used. > > -- Richard > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
> To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

--
OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999

From Ed at dega.com Sat Mar 27 02:27:26 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun 7 17:10:41 2004
Subject: Whither XQL again.
Message-ID: <30649320C177D111ADEC00A024E9F297169FC7@exchange-server.dega.com>

All,

Thanks for your help in my understanding the nature of XML query languages in general and XQL in particular. Thanks a lot to Jonathan for his insights and help.

I've put up a web site with what I have done so far. It's not much, but over the weekend, I hope to accomplish a lot. The parser compiles but only recognizes path expressions so far. Statements like 'novel/front' and 'novel//title' compile and generate the correct ASTs. But this is just a smidgen.

Anyway, please help if you can or have the time. The site is:

http://ed.dega.com/pub/xml/xql/index.html

Thanks.
Ed

Ed Howland ed@dega.com http://www.dega.com
Alpha Geek and XML TV Evangelist.
"Seek to be well formed, lest you incur the wrath of the W3C!"
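
[Archive note] For readers unfamiliar with path expressions of the kind Ed mentions, here is a rough sketch of how 'novel/front' and 'novel//title' might be split into steps before any AST is built. This is not Ed's parser. The class names PathSketch and PathStep and the "child"/"descendant" labels are assumptions made purely for illustration, and the sketch does no validation at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the XQL parser discussed above.
final class PathSketch {
    /** Splits a path such as "novel//title" into steps; performs no validation. */
    static List<PathStep> parse(String path) {
        List<PathStep> steps = new ArrayList<>();
        boolean descendant = false;
        int i = 0;
        while (i < path.length()) {
            if (path.startsWith("//", i)) {       // "//" introduces a descendant step
                descendant = true;
                i += 2;
            } else if (path.charAt(i) == '/') {   // "/" introduces a child step
                descendant = false;
                i += 1;
            } else {                              // read a name up to the next '/'
                int end = path.indexOf('/', i);
                if (end < 0) end = path.length();
                steps.add(new PathStep(descendant, path.substring(i, end)));
                descendant = false;
                i = end;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(parse("novel/front"));   // [child novel, child front]
        System.out.println(parse("novel//title"));  // [child novel, descendant title]
    }
}

/** One step of a path: an element name plus how it was reached. */
final class PathStep {
    final boolean descendant; // true if the step followed "//"
    final String name;

    PathStep(boolean descendant, String name) {
        this.descendant = descendant;
        this.name = name;
    }

    @Override
    public String toString() {
        return (descendant ? "descendant " : "child ") + name;
    }
}
```

A real parser would of course also have to handle filters, wildcards and malformed input, none of which is attempted here.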
From uche.ogbuji at fourthought.com Sat Mar 27 02:39:33 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:10:41 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: Your message of "Thu, 25 Mar 1999 16:35:43 EST."
Message-ID: <199903270239.TAA08007@malatesta.local>

> > public interface DTDDeclHandler
> > {
> >   public final static int ATTRIBUTE_DEFAULTED = 1;
> >   public final static int ATTRIBUTE_IMPLIED = 2;
> >   public final static int ATTRIBUTE_REQUIRED = 3;
> >   public final static int ATTRIBUTE_FIXED = 4;
> >
>
> How committed are you to using integer constants? I know this is common,
> but it tends to lend itself to bad code. Some people prefer a solution
> like this:
>
> public class AttributeStatus {
>
>   public final static AttributeStatus ATTRIBUTE_DEFAULTED =
>     new AttributeStatus();
>   public final static AttributeStatus ATTRIBUTE_IMPLIED =
>     new AttributeStatus();
>   public final static AttributeStatus ATTRIBUTE_FIXED =
>     new AttributeStatus();
>   public final static AttributeStatus ATTRIBUTE_REQUIRED =
>     new AttributeStatus();
>
>   private AttributeStatus() {}
>
> }
>
> This creates the four mnemonic constants you want and gives them a checkable
> type. New constants can't be created because of the private constructor.
> And there's no chance that anybody's going to write code like
>
>   if (getAttributeStatus() == 1) {
>     doSomething();
>   }
>
> Programmers are more or less forced to use the constants. What do you
> think?
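
[Archive note] For readers comparing the two styles, here is a minimal, compilable sketch of the typed-constant idiom quoted above. It is only an illustration under stated assumptions: the name field, toString() and the describe() helper are additions made so the example has something to print, and nothing here is part of any SAX2 proposal.

```java
// Hypothetical demo class; describe() is an assumed helper, not a SAX2 method.
final class AttributeStatusDemo {
    static String describe(AttributeStatus status) {
        if (status == AttributeStatus.ATTRIBUTE_REQUIRED) return "must be specified";
        if (status == AttributeStatus.ATTRIBUTE_FIXED) return "value fixed by the DTD";
        if (status == AttributeStatus.ATTRIBUTE_IMPLIED) return "no default value";
        return "default value supplied by the DTD";
    }

    public static void main(String[] args) {
        System.out.println(describe(AttributeStatus.ATTRIBUTE_REQUIRED)); // must be specified
        // describe(1);  // would not compile: an int is not an AttributeStatus
    }
}

// The typed-constant idiom: a fixed set of instances behind a private constructor.
final class AttributeStatus {
    static final AttributeStatus ATTRIBUTE_DEFAULTED = new AttributeStatus("DEFAULTED");
    static final AttributeStatus ATTRIBUTE_IMPLIED = new AttributeStatus("IMPLIED");
    static final AttributeStatus ATTRIBUTE_REQUIRED = new AttributeStatus("REQUIRED");
    static final AttributeStatus ATTRIBUTE_FIXED = new AttributeStatus("FIXED");

    private final String name; // assumed addition, used only for printing

    // Private, so no instances beyond the four constants can be created.
    private AttributeStatus(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return name;
    }
}
```

The comparison with == is safe here because each constant is a single shared instance. The trade-off against plain int constants is that the compiler rejects a stray literal, at the cost of a class that has no direct IDL counterpart.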
I personally take a very dim view of systems trying to "force" programmers into intrinsically good practices. Programmers can abuse any system you present, and at some point you have to accept that they are adults, and must be free to cut off their own noses if they wish.

The good programming practice of replacing "magic numbers" with descriptive constants is even older than the structured programming movement, and any programmer who writes

  if (getAttributeStatus() == 1) {
    doSomething();
  }

when

  if (getAttributeStatus() == ATTRIBUTE_DEFAULTED) {
    doSomething();
  }

is available fully deserves his own bugs, or a roasting at the next code review.

Furthermore, I've been thinking of proposing that the SAX2 interfaces be specified in IDL rather than Java (or at least publishing an IDL translation when the interfaces are stabilized), and your proposal wouldn't wash in IDL.

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From ricko at allette.com.au Sat Mar 27 04:38:07 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:10:41 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU>

From: Stephen D. Williams
>> Tim Bray wrote:
>> There are two distinct issues 1) efficiency of parsing 2) compactness. A
>> standard compression format for XML (ala zip,gzip etc) would be for
>> bandwidth limited applications.

Someone at ITU (International Telecommunication Union) was working on an ASN.1 compression of XML markup. I think they may have opted for the WAP method, for compatibility. (I think the use of ASN.1 means fixed DTDs.)

I have done a few tests on how much more compact forms of XML (e.g. shortrefs) impact arrival characteristics of document packet-groups under TCP/IP compared to compression. If your packet size is small, and you really need to get at data in the first packet (so that you can piggyback a request for auto-linked resources in with the ACK for the first packet group), then more compact forms of markup may make a difference. But in general, compression is more effective. (It also depends on where the bottlenecks are in your data path.)

One trivial way to minimise file sizes for transmission is to collapse white-space inside markup (e.g. [\ \t\n\r]+ becomes [\n]), to make sure that newlines are not CR LF pairs, and to minimize whitespace in data (removing trailing spaces, so that [\ \t]+\n becomes [\n], is a safe transformation, for example). And select your element and attribute names so that their length is inverse to their frequency, as much as possible: so use "a:s" not "abracadabra:shazamarama" (you may even make two versions of your DTD: an authoring one and a transmission one). One of the main bottlenecks on many SOHO systems is the modem speed: reducing the end-to-end character count means fewer packets, and more data arrives earlier, so more auto-links are followed earlier.

Rick Jelliffe

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tbray at textuality.com Sat Mar 27 08:20:03 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:10:41 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <3.0.32.19990326114522.00e927e8@pop.intergate.bc.ca>

At 11:31 AM 3/26/99 -0800, Jeffrey E. Sussna wrote:
>>>>
I am confused by the responses to this question. I selected the Print command from the File menu in IE5 final and it printed just fine. I was looking at a raw XML file with no formatting commands of any kind.
<<<<

Maybe there's a way to do it, but once you bring a CSS stylesheet into play, you apparently lose the ability to print. So far everyone I know who's tried it reports this. Which makes XML+CSS essentially unusable in IE5, but maybe that's just because I have an old-fashioned regard for the printed word. -T.

From h.rzepa at ic.ac.uk Sat Mar 27 10:04:45 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 17:10:41 2004
Subject: LISTADMIN: No attachments to list messages PLEASE
In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3CA@rainier.cdgpd.com>
Message-ID:

> This message is in MIME format.
Since your mail reader does not understand
> this format, some or all of this message may not be legible.
>
> ------_=_NextPart_000_01BE77DC.DA4DA186
> Content-Type: text/plain
>
> Here is that document you asked for ... don't show anyone else ;-)
>
>
> ------_=_NextPart_000_01BE77DC.DA4DA186
> Content-Type: application/msword;

Regarding the above message, I must say most strongly that attaching enclosures to list postings is HIGHLY discouraged (not to mention asking them not to show it to anyone else!). Apart from the risk of a virus, it also means everyone on the list has to suffer the inconvenience of downloading a document they might not want, and in many cases might not be able to read (Unix etc).

I must say here that in future, anyone attaching a document to a posting may be unsubscribed from the list without warning. This includes the visiting card (vcf) attachments, which are also considered bad list etiquette.

If you do wish to bring a document to the attention of the list, then place it on an ftp/http server somewhere so that anyone interested can download it themselves (the pull rather than the push option).

Henry Rzepa.
+44 171 594 5774 (Office) +44 171 594 5804 (Fax)
http://www.ch.ic.ac.uk/rzepa/

From h.rzepa at ic.ac.uk Sat Mar 27 10:25:10 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 17:10:41 2004
Subject: LISTADMIN: PLEASE read the unsubscribe instructions!!
Message-ID:

Too many subscribers to the list are NOT READING the instructions in the signature, and posting to the list itself to unsubscribe.
I have made these instructions as clear as possible, and there is no excuse for not following them! I will continue to "name and shame", since such list pollution is in no-one's interests.

I might also add that requests of the type

unsubscribe xml-dev indiketr@churchill.co.uk

have to be individually moderated by me, and I do not guarantee that this will be done immediately (especially when I am away at a conference as I have been just recently). Such requests can take up to a week to process, since I do them in batches.

> From: Mark.Unak@Level3.com
> To: xml-dev@ic.ac.uk
> Subject: unsubscribe xml-dev
> Date: Fri, 26 Mar 1999 14:00:52 -0700
> MIME-Version: 1.0
> Sender: owner-xml-dev@ic.ac.uk
> Precedence: bulk
> Reply-To: Mark.Unak@Level3.com
> Status: U
>
> unsubscribe xml-dev indiketr@churchill.co.uk

Henry Rzepa.
+44 171 594 5774 (Office) +44 171 594 5804 (Fax)
http://www.ch.ic.ac.uk/rzepa/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From andrewl at microsoft.com Sat Mar 27 10:41:34 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:10:41 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF204@RED-MSG-08>

Kevin Hsu asked how to print XML from MS IE5.

To print what is displayed on the screen, select File/Print. To print the underlying XML (before application of style sheets) select View/Source and then print that using File/Print.

I hope this is helpful,
Andrew

From anderst at toolsmiths.se Sat Mar 27 12:14:32 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:10:41 2004
Subject: Is there anyone working on a binary version of XML?
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <36FCCBAE.6AF9636B@toolsmiths.se>

Rick Jelliffe wrote:

> I have done a few tests on how much compacter forms of XML (e.g.
> shortrefs) impact arrival characteristics of document packet-groups
> under TCP/IP compared to compression.
If your packet size is small, and
> you really need to get at data in the first packet (so that you can
> piggy back request for auto-linked resources in with the ACK for the
> first packet group), then more compact forms of markup may make a
> difference. But in general, compression is more effective. (It also
> depends on where the bottlenecks are in your data path.)

It seems that there are more use-cases which should benefit from having a compressed or a binary format.

I made some tests using the following XML data.

...

The resulting sizes were:

XML      602830   (Standard XML text)
FML      131143   (Fast ML, a binary ML that I'm working on)
XML.gz    75528   (gzip'ed XML text using -9 as compression rate)
FML.gz    20886   (gzip'ed Fast ML using -9 as compression rate)

The fascinating result here is the dramatic reduction in size obtained by first converting to FML and then gzipping the markup stream.

> And select your element and attribute
> names so that their length is inverse to their frequency, as much as
> possible: so use "a:s" not "abracadabra:shazamarama" (you may even make
> two versions of your DTD: an authoring one and a transmission one.) One
> of the main bottlenecks on many SOHO systems is the modem speed:
> reducing the end-to-end character count means fewer packets, and more
> data arrives earlier, so more auto-links are followed earlier.

On the other hand there is a big drawback to using "manual tag compression", which is readability.

/Anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/ Financial Toolsmiths AB /
/ Anders W. Tell /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From anderst at toolsmiths.se Sat Mar 27 12:28:07 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:10:41 2004
Subject: Is there anyone working on a binary version of XML?
References: <002201be77b3$39760600$1b19da18@ne.mediaone.net>
Message-ID: <36FCCEDB.FCD11117@toolsmiths.se>

Jonathan Borden wrote:

> I think what this really is, when you strip out the concept of binary XML,
> is a suggestion for a compression format tuned for markup streams.
>
> There are two distinct issues 1) efficiency of parsing 2) compactness. A
> standard compression format for XML (ala zip,gzip etc) would be for
> bandwidth limited applications.

I would like to add one more issue, which is 3) Complexity.

Writing efficient and compact parsers is considerably simpler for binary ML. The primary reason for this is that the parser does not have to "look" for and interpret tokens in a stream. All tokens/parts in a binary ML are well known and their sizes are easily derived from the stream.

A "normally" trained and educated programmer can easily write a "fairly complete" parser in less than a week. I haven't written an XML text parser myself yet, but it seems that it's not a task for the faint-hearted.

/anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/ Financial Toolsmiths AB /
/ Anders W. Tell /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From larsga at ifi.uio.no Sat Mar 27 12:30:09 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:10:41 2004
Subject: DTD Catalogs
In-Reply-To: <85256740.0060BBCE.00@vgi4mail.vanguard.com>
References: <85256740.0060BBCE.00@vgi4mail.vanguard.com>
Message-ID:

* Paul Tihansky
|
| Does anybody know if any of the Java XML Parsers support catalog
| files?

See:

| For instance, if I put a Public Identifier in my DTD declaration
| without a URL, how would a parser such as XP find the DTD?

You could use SAX and register an EntityResolver. Those that support it can be tailored through the EntityResolver.

| How do I specify where the parser can find the catalog file?

That varies.

--Lars M.

From anderst at toolsmiths.se Sat Mar 27 12:41:09 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:10:41 2004
Subject: Is there anyone working on a binary version of XML?
References: <002201be77b3$39760600$1b19da18@ne.mediaone.net> <36FBDB92.56336820@lig.net>
Message-ID: <36FCD1E2.8727F1C@toolsmiths.se>

"Stephen D. Williams" wrote:

> I agree. I feel they can be solved with a similar solution in at least some circumstances.
> Rather there are some straightforward ways to achieve compression that actually make
> efficiency worse while some solutions for efficiency also make compression easier.
>
> In fact there are a number of levels you could go with compression:
>
> optional gzip/bzip2 possibly preceded by:

For small to medium size streams the gzip/bzip2 step will probably take longer to complete than the time it saves in communication. Of course this also depends on the network speed.

>
> Dictionary compression (various forms of building a list of commonly used terms or all terms
> in the current document/stream or some combination)

This is probably the best first action to take when needing to compress an ML stream.

It's also possible to combine Dictionaries with "Sessions", i.e. two communication nodes could establish a Session which contains pre-negotiated Dictionaries, which means that Dictionary content has to be sent over the wire only once. All "Packets" thereafter reference the dictionaries. This is what I do in FML; however, I have no estimates of how much space is actually saved.

/Anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/ Financial Toolsmiths AB /
/ Anders W. Tell /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From larsga at ifi.uio.no Sat Mar 27 12:52:12 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:10:41 2004
Subject: Why doesn't XML have Bag?
In-Reply-To: <36FBA3DC.59DA@skynet.be>
References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> <36FB8C85.BEFC8103@mitre.org> <36FBA3DC.59DA@skynet.be>
Message-ID:

* Lars Marius Garshol
|
| It is a limitation of DTDs and was introduced because without this
| operator element content models are easily mapped to finite state
| automatons, but the introduction of the '&' separator makes automaton
| generation much more difficult.

* Paul Janssens
|
| Please correct me if I am wrong here but isn't that trivial?
| (you may get a BIG automaton, but it's not difficult to generate)

This is correct. However, the number of states required for n elements is n! with this approach (ie: worse than exponential), which means that the automaton doesn't just get BIG, for reasonably sized content models it can get ABSOLUTELY MIND-BOGGLINGLY AWFULLY STUNNINGLY HUGE. :)

To wit:

[24]> (! 10)
3628800
[25]> (! 30)
265252859812191058636308480000000

In other words, this approach doesn't work at all.

| and it's 'easily' visualised by the number of possible shortest
| paths between two opposing points on a hypercube.

I can think of easier ways of visualising it, though possibly not 'easier' ones. :)

--Lars M.

From shyutz at ms1.hinet.net Sat Mar 27 13:21:03 1999
From: shyutz at ms1.hinet.net (Kevin Hsu)
Date: Mon Jun 7 17:10:41 2004
Subject: how to print the XML document in IE 5.0
References: <000901be77bf$31695c80$5118a8c0@kuantech1.quokka.com>
Message-ID: <005201be7851$157f4920$15cd4acb@flag.com.tw>

>I am confused by the responses to this question.
I selected the Print command from the File menu in
>IE5 final and it printed just fine. I was looking at a raw XML file with no formatting commands of
>any kind.
>
>Jeff

If you have an XML document with no XSL, it will print the raw XML document with the default stylesheet of IE 5.0, and it will look like a tree view of the XML document.

I can print XML with an XSL style sheet, but it never prints well with CSS stylesheets. Try printing the document at the URL below:

http://www.xml.com/1999/03/ie5/first-x.xml

Kevin

From b.laforge at jxml.com Sat Mar 27 13:59:15 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:10:42 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <007a01be785a$baf28bc0$c8a8a8c0@thing1>

From: Anders W. Tell
>Its also possible to combine Dictionaries with "Sessions". ie: two communication
>nodes could establish a Session which contains pre negotiated Dictionaries, which
>means that Dictionary content have to be sent over the wire only once. All "Packets"
>thereafter references the dictionaries.
>This is what I do in FML , however I have no estimates of how much space is actualy saved.

If there is a DTD referenced by a document, then that could be used as the dictionary. Just assign a number for the first occurrence of each element or attribute name found in the DTD.

Yes, this is incomplete, but it's sure better than using short names.

Bill

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From larsga at ifi.uio.no Sat Mar 27 14:04:46 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:10:42 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <199903270239.TAA08007@malatesta.local>
References: <199903270239.TAA08007@malatesta.local>
Message-ID:

* uche ogbuji
|
| Furthermore, I've been thinking of proposing that the SAX2
| interfaces be specified in IDL rather than Java (or at least
| publishing an IDL translation when the interfaces are stabilized),
| and your proposal wouldn't wash in IDL.

Many things in SAX won't wash in IDL, such as the use of the Java-specific InputStream, Reader and Locale objects. Also, IDL has a problem in that it's sort of a least common denominator, and thus leaves out many useful language-specific things. So you'd probably want to do a manual translation anyway.

If there ever is a published SAX spec I think it should use IDL to be politically correct and point out potential language-mapping problems. However, the actual utility of IDL I think is low in this particular case.

--Lars M.

From sdw at lig.net Sat Mar 27 15:28:24 1999
From: sdw at lig.net (Stephen D.
Williams)
Date: Mon Jun 7 17:10:42 2004
Subject: Is there anyone working on a binary version of XML?
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU> <36FCCBAE.6AF9636B@toolsmiths.se>
Message-ID: <36FD00FB.2C6283E@lig.net>

Can you make available what you are working on? That was the reason for my starting this thread in the first place. I'm happy to build upon or learn from existing designs...

It appears you have reached many of the same conclusions that I have. Let's pool our design features and make something that can be used as a common solution.

Thanks
sdw

"Anders W. Tell" wrote:

> Rick Jelliffe wrote:
>
> > I have done a few tests on how much compacter forms of XML (e.g.
> > shortrefs) impact arrival characteristics of document packet-groups
> > under TCP/IP compared to compression. If your packet size is small, and
> > you really need to get at data in the first packet (so that you can
> > piggy back request for auto-linked resources in with the ACK for the
> > first packet group), then more compact forms of markup may make a
> > difference. But in general, compression is more effective. (It also
> > depends on where the bottlenecks are in your data path.)
>
> It seems that there are more use-cases which should benefit from having a
> compressed or a binary format.
>
> I made some tests using the following XML data.
>
>
>
> ...
>
>
> The resulting sizes were:
> XML      602830   (Standard XML text)
> FML      131143   (Fast ML, a binary ML that I'm working on)
> XML.gz    75528   (gzip'ed XML text using -9 as compression rate)
> FML.gz    20886   (gzip'ed Fast ML using -9 as compression rate)
>
> The fascinating result here is the dramatic reduction in size obtained by first
> converting to FML and then gzipping the markup stream.
>
> > And select your element and attribute
> > names so that their length is inverse to their frequency, as much as
> > possible: so use "a:s" not "abracadabra:shazamarama" (you may even make
> > two versions of your DTD: an authoring one and a transmission one.) One
> > of the main bottlenecks on many SOHO systems is the modem speed:
> > reducing the end-to-end character count means fewer packets, and more
> > data arrives earlier, so more auto-links are followed earlier.
>
> On the other hand there is a big drawback using "manual tag compression"
> which is Readability.
>
> /Anders
> --
> /_/_/_/_/_/_/_/_/_/_/_/_/_/_/
> / Financial Toolsmiths AB /
> / Anders W. Tell /
> /_/_/_/_/_/_/_/_/_/_/_/_/_/_/

--
OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999

From david at megginson.com Sat Mar 27 15:56:18 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:42 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU>
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <14076.63486.939606.141332@localhost.localdomain>

Rick Jelliffe writes:

> One trivial way to minimise file sizes for transmission is to
> collapse white-space inside markup (e.g. [\ \t\n\r]+ becomes
> [\n]),

Yes, that might be helpful (but only minimally in most cases).

> sure that newlines are not CR LF pairs,

Yes, that will make a small difference. You might get a bigger bang by doing some quick analysis to determine which character encoding will provide the smallest object size: UTF-8, ISO-8859-1, UTF-16, etc. (mileage will vary depending on the languages used in the text).

> and to minimize whitespace in data (removing trailing spaces, so that
> [\ \t]+\n becomes [\n], is a safe transformation, for example).

No. It might be a safe transformation for specific XML formats, but not for XML in general, because you don't know what people might be using that whitespace for.

In general, though, what we need is a transport layer that takes care of things like this for us. Document type designers should optimise for readability and usability, and let protocol designers worry about the optimisations.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Sat Mar 27 16:08:46 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:42 2004 Subject: LISTADMIN: No attachments to list messages PLEASE In-Reply-To: References: <5FFEC1B73A7BD1119D56006008C369F30ED3CA@rainier.cdgpd.com> Message-ID: <14077.67.346287.353886@localhost.localdomain> Rzepa, Henry writes: > Regarding the above message, I must say most strongly that > attaching enclosures to list postings is HIGHLY discouraged (not to > mention asking them not to show it to anyone else!). Apart from > the risk of a virus, it also means everyone on the list has to > suffer the inconvenience of downloading a document they might not > want, and in many cases might not be able to read (Unix etc). As became clear in the follow-ups, the posting was done by a worm that hides in Word macros (the Internet's equivalent of animal dung, apparently) exploits gaping security holes in Outlook to mail itself out to everyone in a person's address list. In other words, the original poster did *not* post the attachment to xml-dev, the worm did. His only mistakes were (a) using Microsoft Windows, (b) opening a file in MS Word, and (c) not uninstalling Outlook from his computer the first time he booted up. If you had summarily unsubscribed him, then you would simply have added an unjust punishment to the embarrassment he was already suffering. In fact, all three of the mistakes were probably mandated by company policy; if so the true blame belongs in three places, in diminishing order of culpability: 1. 
The poster's company, for ignoring the importance of technical diversity and mandating the same operating system and software for everyone (it's much easier to write a worm or virus when everyone's using exactly the same software). 2. Redmond, for ignoring security whenever possible. 3. The creator of the worm. If I'm right about corporate policy, then most of the blame goes to the company -- Redmond just wants to sell software, and the worm creator just wants attention, but the company failed to act in its own self-interest. Technical diversity is critical for good operation: I'd no more want to see an all-Linux shop than I'd want to see an all-Windows or an all-Mac shop. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/
From anderst at toolsmiths.se Sat Mar 27 17:34:49 1999 From: anderst at toolsmiths.se (Anders W. Tell) Date: Mon Jun 7 17:10:42 2004 Subject: Is there anyone working on a binary version of XML? References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU> <14076.63486.939606.141332@localhost.localdomain> Message-ID: <36FD16C2.B386B0F1@toolsmiths.se> David Megginson wrote: > In general, though, what we need is a transport layer that takes care > of things like this for us. Document type designers should optimise > for readability and usability, and let protocol designers worry about > the optimisations. In general I agree, especially that document type creators should not be doing "manual optimizations".
However, in the case of DOM to DOM communication it's possible to do much better than using XML text as a content carrier. In this case the question arises: where is the protocol interface? Is it an interface that accepts an arbitrary byte stream and transports opaque data to a receiver (A), or is it an interface that accepts a DOM tree and sends it to a receiver (B)?
(A) DOM --> [streamifyXML] --> XML text --> [protocolInterface] --> something smaller, faster (gzip, FastML, ...) ==communicate==> something smaller, faster --> [protocolInterface] --> XML text --> [SAX] --> DOM
(B) DOM --> [protocolInterface] --> something smaller (gzip) ==communicate==> something smaller --> [protocolInterface] --> DOM
/anders -- /_/_/_/_/_/_/_/_/_/_/_/_/_/_/ / Financial Toolsmiths AB / / Anders W. Tell / /_/_/_/_/_/_/_/_/_/_/_/_/_/_/
From jborden at mediaone.net Sat Mar 27 17:58:42 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:10:42 2004 Subject: Is there anyone working on a binary version of XML? In-Reply-To: <36FD16C2.B386B0F1@toolsmiths.se> Message-ID: <003201be787a$8d4e1340$1b19da18@ne.mediaone.net> Anders W. Tell wrote: ...in the case of DOM to DOM > communication it's possible to do much better than using XML text > as a content carrier. > > In this case the question arises: where is the protocol interface? > Is it an interface that accepts an arbitrary byte stream and > transports opaque data > to a receiver (A), or is it an interface that accepts a DOM tree > and sends it > to a receiver (B)? > The protocol layer would be HTTP, SMTP, etc.
These protocols employ MIME. The process of content negotiation might provide something like:
Content-type: application/xml; encoding="compressed-xml"
I'm not sure what DOM to DOM communication means; the DOM doesn't currently have any standard methods to even create XML documents, let alone provide communications support. Most parsers accept an href. You could propose a new protocol such as "x-xml:..." but better might be to request your encoding type in the HTTP request Accept: header and then check the response content-type. Jonathan Borden http://jabr.ne.mediaone.net
From larsga at ifi.uio.no Sat Mar 27 18:02:58 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:42 2004 Subject: SAX2: Proposed alternative DTD interface In-Reply-To: <14076.1733.365295.427943@localhost.localdomain> References: <14076.1733.365295.427943@localhost.localdomain> Message-ID: * David Megginson | | Here's another alternative for SAX2: forget about trying to report | DTD declarations as events, and simply make the whole DTD available | through an interface with a Parser2.get() call. I'm against this. Having an event-based/object-based dichotomy makes sense for DTDs just as it does for document instances. Also, this breaks with the rest of SAX, is relatively complex and will at some point probably be in direct competition with the DOM Level X.
Parsers that already have an internal object representation of the DTD will need to wrap that with this interface, which probably won't be a very nice job, while an adapter for the event-based interface should be simple. Furthermore, this can be built on top of a 100% event-based SAX2. And, finally, I dislike the iterators. They are just a nuisance in higher-level languages, and a plain array would probably be better. It would also free us from all this casting. I say do it, but on top of an event-based interface, outside of the SAX2 core and preferably without the iterators. A nice addition might be support for content model automatons. (Three methods are needed: get_start_state, get_next_state and is_final_state.) --Lars M.
From MikeDacon at aol.com Sat Mar 27 18:07:47 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:10:42 2004 Subject: SAX2: Proposed alternative DTD interface Message-ID: <45ea762.36fd1e0e@aol.com> Hi David, In a message dated 3/26/99 5:31:10 PM Eastern Standard Time, david@megginson.com writes: > Here's another alternative for SAX2: forget about trying to report DTD > declarations as events, and simply make the whole DTD available > through an interface with a Parser2.get() call. > Although most DTDs will be short, it seems that the event-based interface will still be beneficial for large DTDs and small-footprint applications that cannot afford the memory of receiving the entire DTD implementation object.
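One way to picture the trade-off discussed here, as a minimal sketch and not anything proposed for SAX2: declarations arrive as events, and an in-memory DTD object is simply an optional listener layered on top of them. The names DtdSketch, DeclListener, DeclCollector and reportDecls are hypothetical, and reportDecls only simulates a parser with three hard-coded declarations.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DtdSketch {
    /** Hypothetical event callback for element declarations (not a SAX interface). */
    interface DeclListener {
        void elementDecl(String name, String contentModel);
    }

    /** Optional layer: turns the event stream into an in-memory table. */
    static class DeclCollector implements DeclListener {
        final Map<String, String> models = new LinkedHashMap<>();

        @Override
        public void elementDecl(String name, String contentModel) {
            models.put(name, contentModel);
        }
    }

    /** Stand-in for a parser reporting declarations; a null listener means "ignore the DTD". */
    static void reportDecls(DeclListener listener) {
        if (listener == null) {
            return;
        }
        listener.elementDecl("doc", "(title, para+)");
        listener.elementDecl("title", "(#PCDATA)");
        listener.elementDecl("para", "(#PCDATA)");
    }

    public static void main(String[] args) {
        reportDecls(null);                      // small-footprint case: nothing is stored
        DeclCollector dtd = new DeclCollector();
        reportDecls(dtd);                       // object case: built on top of the events
        System.out.println(dtd.models.get("doc"));
    }
}
```

An application that never registers a collector pays nothing for the object model, while one that wants the whole DTD gets it from the same event stream.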
I think the best alternative is to allow both options, and you just don't set a handler if you want to ignore the events. Which leads me back to my wish list for...
try {
    Document doc = (Document)parser.get("http://xml.org/sax/props/dom");
} catch (SAXNotSupportedException e) {
    // ...
}
Which follows from the same logic. Sometimes you want an event-based interface and sometimes you just want the resulting object -- a Simple API for XML should cover both cases. Best wishes, - Mike { www.gosynergy.com }
From larsga at ifi.uio.no Sat Mar 27 18:27:56 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:42 2004 Subject: Fast filter support in SAX2 In-Reply-To: <009201be7790$c0a6b7a0$c8a8a8c0@thing1> References: <009201be7790$c0a6b7a0$c8a8a8c0@thing1> Message-ID: * Bill la Forge | | I'd like to suggest another method in Parser2: | | public String unique(String); | | as well as a featureID for requesting unique element and attribute | names. Bill, is this meant to be an interface to the string interning scheme of the parser? If so, maybe we should call it intern? Anyway, if that's what it is I support it. I'm a bit unsure why you think the unique method is needed, though. What kinds of uses do you have in mind for it? | If a parser supports both the unique feature and provides access to | its element stack, Hmmm. I think this should be skipped.
We'll need a special interface to represent the stack, and parsers will probably have to do some internal juggling to weed out information from the internal stack that's only for internal use (and to adapt it to the SAX2 interface). I think the result will be lower performance than if the application maintained its own element stack. --Lars M.
From rbourret at ito.tu-darmstadt.de Sat Mar 27 18:45:44 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:10:42 2004 Subject: DOM: notations and unparsed entities Message-ID: <01BE788A.49DEA2E0@grappa.ito.tu-darmstadt.de> 1) How do I determine if an attribute's value is a notation or unparsed entity? In the case of an unparsed entity, I'm guessing that the Attr node has an EntityReference child (is this true?). Notations have me stumped. 2) Is there a general DOM mailing list? The only one I could find was www-dom@w3.org, which I assumed was for spec comments, not questions like this. Thanks, -- Ron Bourret
From b.laforge at jxml.com Sat Mar 27 19:16:02 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:42 2004 Subject: Fast filter support in SAX2 Message-ID: <001601be7886$f9db4440$c8a8a8c0@thing1> From: Lars Marius Garshol >| I'd like to suggest another method in Parser2: >| >| public String unique(String); >| >| as well as a featureID for requesting unique element and attribute >| names. > >Bill, is this meant to be an interface to the string interning scheme >of the parser? If so, maybe we should call it intern? > >Anyway, if that's what it is I support it. I'm a bit unsure why you >think the unique method is needed, though. What kinds of uses do you >have in mind for it? It would be great if filters had the same advantages as parsers in being able to simply test for equality (x==y) rather than having to do a string comparison (x.equals(y)) when checking for a specific element or attribute name. From previous discussion on this list, I gathered that many parsers did the equivalent of String.intern(), but avoided the JavaSoft implementation for extra speed. If this is the case, then a filter needs to use the parser's own intern function to preprocess its constants before testing for matches in the startElement events. So the short answer is yes, intern is better than unique. I should have checked the lang package first. >| If a parser supports both the unique feature and provides access to >| its element stack, > >Hmmm. I think this should be skipped.
We'll need a special interface >to represent the stack, and parsers will probably have to do some >internal juggling to weed out information from the internal stack >that's only for internal use (and to adapt it to the SAX2 interface). > >I think the result will be lower performance than if the application >maintained its own element stack. When you are working with filter structures, it is difficult to say where the parser ends and the application begins. You raise an implementation issue that there should be a separate stack that is accessible, distinct from the one used by the parser. My interest here is, instead, to define a means for sharing the element stack across independently developed filters. Just about every filter which does anything interesting ends up implementing its own element stack. Why not have one filter that does that, and let the rest get it from their "parser"? (Think of parser as a role, a source of events relative to a particular event consumer, not an implementation. The confusion here comes from giving the interface the name Parser or Parser2, when it can be either the actual parser or just another filter.) Bill
From larsga at ifi.uio.no Sat Mar 27 19:32:43 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:42 2004 Subject: Fast filter support in SAX2 In-Reply-To: <001601be7886$f9db4440$c8a8a8c0@thing1> References: <001601be7886$f9db4440$c8a8a8c0@thing1> Message-ID: * Lars Marius Garshol | | I'm a bit unsure why you think the unique method is needed, | though. What kinds of uses do you have in mind for it?
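For illustration only, a minimal sketch of the kind of use an intern method could serve. NameTable is a hypothetical stand-in for a parser-owned intern table and is not part of SAX; the assumption is that the parser and its filters share one table, so that a name can be compared by identity (==) instead of with equals().

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical parser-owned intern table; not part of any SAX release. */
public class NameTable {
    private final Map<String, String> names = new HashMap<>();

    /** Returns the single canonical String instance for this name. */
    public String intern(String name) {
        String canonical = names.putIfAbsent(name, name);
        return canonical != null ? canonical : name;
    }

    public static void main(String[] args) {
        NameTable table = new NameTable();
        // A filter interns its constants once, up front.
        String para = table.intern("para");
        // The parser interns each name it reports; new String(...) forces a distinct object.
        String reported = table.intern(new String("para"));
        // Identity comparison now suffices.
        System.out.println(reported == para); // prints "true"
    }
}
```

The comparison is only safe when every name on both sides has passed through the same table, which is why the filter would need access to the parser's own intern function.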
* Bill la Forge | | From previous discussion on this list, I gathered that many parsers | did the equivalent of String.intern(), but avoided the JavaSoft | implementation for extra speed. If this is the case, then a filter | needs to use the parser's own intern function to preprocess its | constants before testing for matches in the startElement events. Ah, I didn't think of that. Now that I've had some more time to think about this I realize that this would also be useful for filters that create new names, such as XAF. | So the short answer is yes, intern is better than unique. I should | have checked the lang package first. This terminology is also used in Common Lisp and Python, and probably many other places as well. | My interest here is, instead, to define a means for sharing the | element stack across independently developed filters. Just about | every filter which does anything interesting ends up implementing | its own element stack. Why not have one filter that does that, and | let the rest get it from their "parser". Another good point, and a very good idea, too. However, then we need to define the element stack interface and what should be included there. Just the elements? Elements and attributes? Elements, attributes and sibling number? Which entity each element comes from? Maybe this should be done outside the SAX core? On the other hand, if filters are included I think this should be too. --Lars M.
From david at megginson.com Sat Mar 27 20:11:41 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:42 2004 Subject: SAX2: Proposed alternative DTD interface In-Reply-To: <45ea762.36fd1e0e@aol.com> References: <45ea762.36fd1e0e@aol.com> Message-ID: <14077.14778.566156.825039@localhost.localdomain> MikeDacon@aol.com writes: > Although most DTDs will be short, it seems that the event-based > interface will still be beneficial for large DTDs and > small-footprint applications that cannot afford the memory of > receiving the entire DTD implementation object. It's worthwhile, perhaps, to ask whether there will be many XML applications that a) require a small footprint; b) need DTD information; and c) can use the information in a streaming format. Any kind of DTD-driven editing tool needs to store the DTD in some kind of a persistent structure, and I imagine that most XML processing on small clients will not worry much about DTDs at all. To continue playing devil's advocate (since I don't really know which alternative I prefer), I'll also point out that even the largest DTDs, like TEI or DocBook, would measure their memory requirements in kilobytes rather than megabytes; and an event-based API makes sense for the document itself because there is no known limit to an XML document's size. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/
From david at megginson.com Sat Mar 27 20:13:59 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:42 2004 Subject: Fast filter support in SAX2 In-Reply-To: <001601be7886$f9db4440$c8a8a8c0@thing1> References: <001601be7886$f9db4440$c8a8a8c0@thing1> Message-ID: <14077.15323.327681.132673@localhost.localdomain> Bill la Forge writes: > It would be great if filters had the same advantages as parsers in > being able to simply test for equality (x==y) rather than having to > do a string comparison (x.equals(y)) when checking for a specific > element or attribute name. Yes, but as someone (James Clark?) pointed out during the last round, with most serious applications you're going to end up doing hash lookups anyway, so the == doesn't buy you much. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/
From h.rzepa at ic.ac.uk Sat Mar 27 20:29:45 1999 From: h.rzepa at ic.ac.uk (Rzepa, Henry) Date: Mon Jun 7 17:10:42 2004 Subject: LISTADMIN: The "Melissa" Virus Message-ID: This list was hit earlier by the "Melissa" virus; http://www.news.com/News/Item/0,4,34334,00.html Apparently, not many anti-viral programs detect it yet.
Please take great care with Word/Outlook combinations. If anyone knows of anti-viral tools that detect this, please let me know and I will alert this list. Many thanks. Henry Rzepa. +44 171 594 5774 (Office) +44 171 594 5804 (Fax) http://www.ch.ic.ac.uk/rzepa/
From b.laforge at jxml.com Sat Mar 27 20:41:33 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:43 2004 Subject: Fast filter support in SAX2 Message-ID: <00a601be7892$f1d9e560$c8a8a8c0@thing1> From: Lars Marius Garshol >However, then we need to define the element stack interface and what >should be included there. Just the elements? Elements and attributes? >Elements, attributes and sibling number? Which entity each element >comes from? > >Maybe this should be done outside the SAX core? On the other hand, if >filters are included I think this should be too. I'm all for delaying things which are independent of the SAX2 core. It will be good to be able to focus on filter considerations, aka MDSAX2. The complication is when filter considerations impact SAX2. For example, where would be the best place for the intern method? I would hate to see it on Parser2, as that creates added overhead for each filter. (Yes, and I was the one who suggested it. :-) So far, I have a pretty short list of things we might need for filter structures: 1. An intern interface. 2. Request that element and attribute names be intern'ed. (Might be combined with a successful get on the intern interface.) 3. Element stack interface. 4. Application event routing.
Necessary for non-linear filter structures where more than one filter needs access to the events coming from the application, like handler registration. In addition, I also see a need for a DOMWalker interface:
public interface DOMWalkerContext {
    public Element getCurrentElement();
}
A filter could ask this of its parser and then be able to process "parse" events based on their source in the DOM. A good start for a SAX-based XSL, I suspect. But like I said, this should wait. On the other hand, I would like to suggest that Parser2 NOT be derived from Parser. We could then have a pure SAX2 implementation, where things like document handler would be registered just like any other SAX2 event handler. This would make for much cleaner filter2s. And there's going to be a whole lot more filters than parsers, mm? Bill
From rbourret at ito.tu-darmstadt.de Sat Mar 27 20:43:31 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:10:43 2004 Subject: SAX2: Proposed alternative DTD interface Message-ID: <01BE789A.BF03B320@grappa.ito.tu-darmstadt.de> David Megginson wrote: > It's worthwhile, perhaps, to ask whether there will be many XML > applications that > > a) require a small footprint; > b) need DTD information; and > c) can use the information in a streaming format. Point (c) is the one that gets me. All the DTD-based applications I can think of eventually need a set of objects over the DTD because they are either analyzing the DTD or continually checking against it.
The only exception I can think of to this is Simon's validation routine in his layered parser, and he needs so much lexical information he's likely to be unhappy with an event-based DTD parser anyway. (A quick and dirty fix would be to redefine validation to mean logical validation, not physical validation.) (By the way, can we change ContentParticle.isOmissible to isOptional? I had to think a bit before I realized what isOmissible meant.) -- Ron Bourret
From b.laforge at jxml.com Sat Mar 27 20:44:56 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:43 2004 Subject: SAX2: Proposed alternative DTD interface Message-ID: <00ab01be7893$66902180$c8a8a8c0@thing1> What about sequential reuse of a parser? If it's going to process the same DTD again, couldn't it have cached the DTD? And wouldn't a DTD event stream preclude this important optimization? B
From begeddov at jfinity.com Sat Mar 27 21:41:18 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:10:43 2004 Subject: half-baked parsers vs binary XML Message-ID: <36FD4FA4.26BCB466@jfinity.com> I have been thinking about optimal XML parsing, partly as a result of the binary XML discussion. Right now the world of XML parsers is divided into well-formedness and validating. Another type being discussed is binary. I'd like to propose another, the half-baked parser. This parser is mentioned in the notes for section 5.1 of the annotated XML spec (not in a positive light :-). The half-baked parser can only process XML documents that don't have a prologue. This makes its memory footprint and execution path much smaller and faster respectively. Unfortunately, it isn't a legal XML parser anymore. This can be addressed by having a modular parser architecture that would be optimistic and try the half-baked parser first. If it encountered a prologue, it could load either a WF parsing module or a validating parsing module. I think that a highly tuned half-baked parser in combination with an optional stream-oriented compression scheme would address many of the concerns that something like binary XML is intended to deal with in the transmission, storage and execution speed dimensions. A great discussion of modular layered parsing can be found on Simon St. Laurent's web site (www.simonstl.com). Gabe Beged-Dov www.jfinity.com
From uche.ogbuji at fourthought.com Sun Mar 28 00:35:55 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:10:43 2004 Subject: IDL for SAX2 In-Reply-To: Your message of "27 Mar 1999 16:04:36 +0200." Message-ID: <199903280035.RAA09535@malatesta.local> > * uche ogbuji > | > | Furthermore, I've been thinking of proposing that the SAX2 > | interfaces be specified in IDL rather than Java (or at least > | publishing an IDL translation when the interfaces are stabilized), > | and your proposal wouldn't wash in IDL. > > Many things in SAX won't wash in IDL, such as the use of the > Java-specific InputStream, Reader and Locale objects. Huh? Sounds like orthogonal matters to me.
module spam {
    interface InputStream;  // forward declaration only
    interface eggs {
        string foobar(in InputStream input);
    };
};
Is perfectly legal IDL. IDL does not concern itself with the implementation of any object: strictly interface, as its name promises. You can use InputStream, Reader, etc. etc. to your heart's content. In fact, you can even improve on the current Java approach by actually _defining_ the interface for those classes, saving non-Java users a spurious trip through the Javadoc. > Also, IDL has a problem in that it's sort of a least common > denominator, and thus leaves out many useful language-specific things. Examples, please. I don't think your above example of Java-specific objects really minimizes the usefulness of IDL. > So you'd probably want to do a manual translation anyway.
> If there ever is a published SAX spec I think it should use IDL to be > politically correct and point out potential language-mapping problems. > However, the actual utility of IDL I think is low in this particular > case. It's not a matter of "politically correct". IDL is an excellent _engineering_ tool whenever you need to define interface. I have used it time and time in my career, and I find that the ability to generate stubs in any native language directly from the IDL, thus ensuring adherence to the interface, saves much development time. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Sun Mar 28 00:38:21 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:10:43 2004 Subject: Proposed new kind of SAX2 thing, with example In-Reply-To: <14075.35687.586960.200728@localhost.localdomain> from "David Megginson" at Mar 26, 99 08:29:04 am Message-ID: <199903280144.UAA03387@locke.ccil.org> David Megginson scripsit: > Use the following from Parser2 (née ModParser): > > public abstract Object get (String prop) > throws SAXNotSupportedException; > Ah. In that case, please add another get method with an index value, and ditto for set. This way we can have indexed properties. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Sun Mar 28 03:17:35 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:43 2004 Subject: half-baked parsers vs binary XML In-Reply-To: <36FD4FA4.26BCB466@jfinity.com> References: <36FD4FA4.26BCB466@jfinity.com> Message-ID: <14077.33404.430088.361367@localhost.localdomain> Gabe Beged-Dov writes: > The half-baked parser can only process XML documents that don't have a > prologue. This makes its memory footprint and execution path much > smaller and faster respectively. Unfortunately, it isn't a legal XML > parser anymore. No, you'll probably find that there's no speed difference at all (why would there be?). There will be a small size difference, but it will be less exciting than you think -- the code to detect the prologue and load the module will make up much of the difference. DTD validation really doesn't require much extra code, and the code, of course, isn't triggered unless you're validating in the first place; doing the well-formedness checks for legal characters can take up a lot of code, but you're supposed to do that anyway (I cheated with AElfred). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From begeddov at jfinity.com Sun Mar 28 04:41:22 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:10:43 2004 Subject: half-baked parsers vs binary XML References: <36FD4FA4.26BCB466@jfinity.com> <14077.33404.430088.361367@localhost.localdomain> Message-ID: <36FD95F9.7E93A231@jfinity.com> David Megginson wrote: > No, you'll probably find that there's no speed difference at all (why > would there be?). There would be a little speed difference from not having to check for defaulted attributes. The half-baked parser might also be able to directly point to the xml input without having to copy it, i.e. use start-length pointers for the tags and attrs. This would be more cumbersome if there was less of a one to one correspondence between the raw xml and what you got after expansion and defaulting. > There will be a small size difference, but it will > be less exciting than you think -- the code to detect the prologue and > load the module will make up much of the difference. Detecting the prologue and loading an alternate module takes a few lines of Java code. Prologue processing, entity expansion and attribute defaulting take up a little more than that in the parsers that I've looked at. > doing the > well-formedness checks for legal characters can take up a lot of code, > but you're supposed to do that anyway (I cheated with AElfred). I'm not sure I understand. Could you elaborate on how you cheated :-? Thanks, Gabe Beged-Dov www.jfinity.com xml-dev: A list for W3C XML Developers. 
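The start-length idea raised in this message can be illustrated with a small sketch. It is in Python rather than Java, and the function name `tag_spans` and the simplified pattern are assumptions made for illustration only; they are not taken from any parser discussed in this thread. The sketch assumes input with no prolog, comments, or CDATA sections, which is roughly the "half-baked" case.

```python
import re

# Simplified pattern for the name of a start tag. Assumes no comments,
# CDATA sections, or DTD in the input (the "half-baked" case).
_START_TAG = re.compile(r"<([A-Za-z_][\w.\-]*)")


def tag_spans(text):
    """Return (start, length) pairs locating each start-tag name in text.

    Nothing is copied out of the input buffer; callers slice on demand.
    """
    return [(m.start(1), m.end(1) - m.start(1)) for m in _START_TAG.finditer(text)]


doc = '<order id="7"><item>tea</item></order>'
spans = tag_spans(doc)
# Names are materialised only when the application actually asks for them.
names = [doc[start:start + length] for start, length in spans]
```

Once entity expansion or attribute defaulting is in play, a reported value may no longer correspond to a single contiguous span of the input, which is the extra bookkeeping the message refers to.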
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Sun Mar 28 04:54:31 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:43 2004 Subject: half-baked parsers vs binary XML In-Reply-To: <36FD95F9.7E93A231@jfinity.com> References: <36FD4FA4.26BCB466@jfinity.com> <14077.33404.430088.361367@localhost.localdomain> <36FD95F9.7E93A231@jfinity.com> Message-ID: <14077.38830.801250.747754@localhost.localdomain> Gabe Beged-Dov writes: [on a validating parser] > There would be a little speed difference from not having to check > for defaulted attributes. Not a measurable one -- the parser just needs to set a boolean flag when there are no default values available, then it doesn't have to check each time. > The half-baked parser might also be able to directly point to the > xml input without having to copy it, i.e. use start-length pointers > for the tags and attrs. This would be more cumbersome if there was > less of a one to one correspondence between the raw xml and what > you got after expansion and defaulting. I think that James Clark does something like that with Expat, which does read the prolog properly, though it doesn't expand external entities by default. At least, Expat can always return the exact string where an event originated. Most efficient XML parsers play pretty clever tricks with their input buffers, even with entity expansion. > > There will be a small size difference, but it will be less > > exciting than you think -- the code to detect the prologue and > > load the module will make up much of the difference. 
> > Detecting the prologue and loading an alternate module takes a few > lines of Java code. Well, a little more than that, because you'll have to pass the current state on to the new module. > Prologue processing, entity expansion and attribute defaulting take > up a little more than that in the parsers that I've looked at. The version of AElfred that I wrote was around 27K (uncompressed) including full parsing of element, attribute, and entity declarations, and expansion of external entities (including the external DTD subset); even then, AElfred would have been about 7K smaller if I hadn't written my own hashing, interning, buffer-handling etc. for speed's sake. I still believe that a 10K XML non-validating parser class in Java is not out of reach, *including* parsing the prolog, if people are willing to use the standard Java classes. > > doing the well-formedness checks for legal characters can take up > > a lot of code, but you're supposed to do that anyway (I cheated > > with AElfred). > > I'm not sure I understand. Could you elaborate on how you cheated :-? At least when I was maintaining it, AElfred didn't perform all of the required well-formedness checks for different ranges of Unicode characters allowed and not allowed in names, attribute values, character data, etc. I tried adding it, but it bloated the code by about 7-8K (much more than parsing the prolog and DTD). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Sun Mar 28 05:18:55 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:10:43 2004 Subject: XML and (K)Office In-Reply-To: <3.0.32.19990326092935.00e4a604@pop.intergate.bc.ca> from "Tim Bray" at Mar 26, 99 09:52:11 am Message-ID: <199903280424.XAA08855@locke.ccil.org> Tim Bray scripsit: > >How will future users make sense of the format without > >a DTD? > > And what, pray tell, part of a DTD helps you "make sense" of a > format? -Tim Hear, hear. I spent far too much time poring over the XMLspec DTD and some examples (which didn't quite match the published DTD any more), without understanding any too much of it. Reading the documentation (in English) made all the difference. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From begeddov at jfinity.com Sun Mar 28 05:30:53 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:10:43 2004 Subject: half-baked parsers vs binary XML References: <36FD4FA4.26BCB466@jfinity.com> <14077.33404.430088.361367@localhost.localdomain> <36FD95F9.7E93A231@jfinity.com> <14077.38830.801250.747754@localhost.localdomain> Message-ID: <36FDA19A.27DA7B45@jfinity.com> Another reason (other than the binary XML thread) that I brought this up was discussion on the perl-xml mailing list of whether XML::Parser was usable for soft real-time server side processing. The consensus there seems to be no. XML::Parser is layered on expat. Anecdotal evidence seems to be that there is an order of magnitude performance advantage to "parsing" something other than XML. The two alternatives are a textual format that Perl can eval directly (Data::Dumper) and a binary format (Storable). In both cases (Data::Dumper and Storable) there is conversion from the on-disk format to the in-memory format. Why is XML so much slower according to developer feedback? That is what I was trying to understand from other people's experience rather than doing a hands-on analysis myself. I may have jumped to the conclusion that it was the extra work that a well-formedness processor has to do over what a half-baked processor would do. That still leaves the question of where the slowdown is and whether it is an implementation issue or inherent in some aspect of XML parsing. Thanks, Gabe Beged-Dov www.jfinity.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Sun Mar 28 06:34:51 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:10:43 2004 Subject: DOM: notations and unparsed entities In-Reply-To: <01BE788A.49DEA2E0@grappa.ito.tu-darmstadt.de> from "Ronald Bourret" at Mar 27, 99 07:44:54 pm Message-ID: <199903280536.AAA11704@locke.ccil.org> Ronald Bourret scripsit: > 1) How do I determine if an attribute's value is a notation or unparsed > entity? In the case of an unparsed entity, I'm guessing that the Attr node > has an EntityReference child (is this true?). Notations have me stumped. You can't tell. And no, the value of a NOTATION attribute is a string, not an entity reference (which is used only when you have actual &...; markup). The DOM conceals the XML type of attributes, at least at level 1. > 2) Is there a general DOM mailing list? The only one I could find was > www-dom@w3.org, which I assumed was for spec comments, not questions like > this. It's for all DOM talk, including spec comments. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Sun Mar 28 06:35:48 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:10:43 2004 Subject: Fast filter support in SAX2 In-Reply-To: from "Lars Marius Garshol" at Mar 27, 99 09:32:24 pm Message-ID: <199903280541.AAA11787@locke.ccil.org> Lars Marius Garshol scripsit: > This terminology is also used in Common Lisp and Python, and probably > many other places as well. It was born in the LISP environment, where atoms were "interned on the OBLIST" to make them unique. > However, then we need to define the element stack interface and what > should be included there. Just the elements? Elements and attributes? > Elements, attributes and sibling number? Which entity each element > comes from? Just the element types, IMHO. This is very easy to expose as Strings for almost any kind of parser. If you want more, do it yourself. > Maybe this should be done outside the SAX core? It certainly can be, but the parser is already doing it, and why reinvent the wheel? -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Sun Mar 28 06:42:26 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:10:43 2004 Subject: Why doesn't XML have Bag? Uh, "set" References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk> <36FB6534.FDF0192D@jrc.it> <36FB76BA.EBE5172D@mitre.org> <14075.36327.509783.485757@localhost.localdomain> <36FB93E5.F662B56A@mitre.org> Message-ID: <36FDB30C.93E227EC@allette.com.au> Roger L. Costello wrote: > Thanks Dave for clarifying terminology. It is "set" that I meant, not > "bag". Just to make certain that I understand, an XML DTD cannot > express the following: > > "A element contains exactly three child elements: one instance > of , one instance of , and one instance of , > and these child elements can appear in any order." > > Correct? XML cannot, but as has been pointed out, SGML can. This is a classic situation where the use of an SGML validation stage may be cheap and useful. You can check the structure with the rigid model, then make the appropriate assumptions when using the XML. The XML content model might not reflect your strict requirements of the data, but the overall process does. The role of semantic checking may at some stage be taken over by a schema, but until then an SGML parse can provide the rigidity that you need. This might be as simple as just remapping a single parameter entity and applying a different parser. 
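Where a separate SGML validation pass is not convenient, the same "one of each, in any order" constraint can also be checked in application code after the XML parse. The following is a minimal sketch of that idea, not something proposed in the thread. The element names in the quoted question were lost in this archive, so `a`, `b`, `c`, and the parent `set` are hypothetical stand-ins, as is the function name `has_each_once`.

```python
import xml.etree.ElementTree as ET
from collections import Counter

REQUIRED = ("a", "b", "c")  # hypothetical child element names


def has_each_once(parent, required=REQUIRED):
    """True if parent's children are exactly one of each required name.

    Order is ignored; extra, missing, or duplicated children all fail.
    """
    return Counter(child.tag for child in parent) == Counter(required)


ok = has_each_once(ET.fromstring("<set><c/><a/><b/></set>"))
dup = has_each_once(ET.fromstring("<set><a/><a/><b/></set>"))
```

The loose XML content model still accepts the document in both cases; the stricter rule lives in the surrounding process, as the message suggests.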
-- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein From cowan at locke.ccil.org Sun Mar 28 07:10:57 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:10:43 2004 Subject: half-baked parsers vs binary XML In-Reply-To: <14077.38830.801250.747754@localhost.localdomain> from "David Megginson" at Mar 27, 99 09:54:58 pm Message-ID: <199903280616.BAA12772@locke.ccil.org> David Megginson scripsit: > At least when I was maintaining it, AElfred didn't perform all of the > required well-formedness checks for different ranges of Unicode > characters allowed and not allowed in names, attribute values, > character data, etc. I tried adding it, but it bloated the code by > about 7-8K (much more than parsing the prolog and DTD). According to the corrigenda, attribute values and character data can now contain anything except (hex) 0000-0008, 000B-000C, 000E-001F (ASCII controls), D800-DFFF (surrogates), and FFFE-FFFF (non-characters). Everything else should be allowed. There are some rules in Appendix B of XML that allow you to leverage the methods in Character. When I get a chance, I'll write some Java code that correctly recognizes XML name and name-start characters. The big tables are already in the java.lang.Character class. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban.
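The excluded ranges listed above are small enough to check without any tables. Below is a minimal sketch in Python (not the Java code promised in the message), written against the Char production of XML 1.0. The function names `is_xml_char` and `first_illegal` are assumptions for illustration. It covers only which code points may appear at all; it does not implement the name and name-start character classes, which are the part that needs the large tables.

```python
def is_xml_char(cp):
    """True if code point cp is permitted by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # everything up to the surrogate block
        or 0xE000 <= cp <= 0xFFFD      # skips surrogates and FFFE-FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def first_illegal(text):
    """Index of the first character that may not appear in XML, or -1."""
    for index, ch in enumerate(text):
        if not is_xml_char(ord(ch)):
            return index
    return -1
```

A parser would still have to treat `<` and `&` specially in character data; that is a markup rule, not a character-range rule, so it is left out here.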
From alank at iol.ie Sun Mar 28 08:23:43 1999 From: alank at iol.ie (Alan Kennedy) Date: Mon Jun 7 17:10:43 2004 Subject: Is there anyone working on a binary version of XML? References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU> <14076.63486.939606.141332@localhost.localdomain> <36FD16C2.B386B0F1@toolsmiths.se> Message-ID: <36FDA44B.C48CA611@iol.ie> "Anders W. Tell" wrote: > However in the case of DOM to DOM > communication it's possible to do much better than using XML text as a content carrier. > Don't forget about IIOP, the CORBA "RPC" protocol. While not the absolutely optimal transport protocol, it has been optimised by a group of experts (I think :-) for platform, transport, endian, etc, independence. This is why the DOM interfaces are defined in OMG IDL. You can take the DOM interface definitions and feed them through an IDL compiler, which will generate client (local) and server (remote) transport stubs. These stubs, which can be in any language supported with an IDL compiler, take care of all parameter marshalling, etc, for transport across a network, between address spaces, etc. If you want to experiment with an IDL compiler for Java, OrbixWeb is pretty good, and you can get a 60-day evaluation from the IONA web site, at http://www.iona.com/info/products/orbixweb/index.html There are free IDL compilers and ORBs available too.
This takes care of eliminating tags from the communication stream (although these would be replaced by a wire representation of the method name and parameters), since a parsed DOM structure could communicate directly with another (possibly remote) parsed DOM structure. However, the actual element content would still be transported in full representation, with no compression. To deal with situations such as this, OrbixWeb has a non-CORBA standard facility called "transformers". This is basically a filter callback where you can process data being marshalled as it goes outside an object's address space, to transform it in whatever way you wish, including changing its representation. In this case, the obvious requirement is to compress the data in some way. Note however that the remote DOM would have to have a compatible "un-transformer" to reconstitute the encoded element content. As for what DOM to DOM communication actually means, I think that XLink is a prime use for such communication, particularly the transclusion stuff. But that's a whole other subject. I think there are some very blurred boundaries here between an HTTP-like client/server facility and direct communication between persistent objects on separate machines. If you take the view that XML documents are files that are to be transferred in whole or in part between machines, then the HTTP style approach is the right one. The alternative is the OMG-CORBA approach of "Objects exist; you (the client) don't need to know where or how they are stored. Simply refer to their object reference and they will be instantiated for you and made available as if they were in your local address space." I think the latter is going to be the way that things will go. Picture every XML document as being available, fully parsed, and available to you (permissions excepted) as if it was in your local address space. In this paradigm, HTTP style servers would no longer exist. Web servers would simply be replaced by ORBs.
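The transformer/un-transformer pairing described above can be sketched in a few lines. This is Python using the standard `zlib` module, and it is emphatically not the OrbixWeb API: the function names `transform` and `untransform` are hypothetical, chosen only to mirror the terms used in the message, and no ORB is involved.

```python
import zlib


def transform(payload: bytes) -> bytes:
    """Compress element content before it leaves the sender's address space."""
    return zlib.compress(payload)


def untransform(wire: bytes) -> bytes:
    """Reverse transform() on the receiving side."""
    return zlib.decompress(wire)


# Repetitive element content, standing in for a marshalled string parameter.
content = ("<para>" + "some repeated element content " * 20 + "</para>").encode("utf-8")
wire = transform(content)
restored = untransform(wire)
```

The point of the pairing is that both ends must agree on the encoding out of band; a receiver without the matching `untransform` sees only opaque bytes.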
And CORBA is an *open* standard. Alan. From oren at capella.co.il Sun Mar 28 10:25:29 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:10:44 2004 Subject: Fw: Is there anyone working on a binary version of XML? Message-ID: <010401be78f3$e35001d0$5402a8c0@oren.capella.co.il> Stephen D. Williams wrote: >One other subject that I haven't mentioned, but need for another architecture that I designed >a while ago is a mechanism for 'parallel inheritance' overlay tree processing. Has anyone >else worked on this? The idea is to have one or more base trees and work with a delta tree >which represents changes from the underlying trees. This last part is a basic data structure >for a rule engine and metadata application environment I designed last year. For general XML trees, I think you'll find that the only way to describe a 'delta' on a tree is using an XSL stylesheet, or something as complex, so you might as well stick with XSL. We use "delta trees" very heavily, but in a somewhat specialized form suitable for our application - the input trees have to be in a very strict format and the set of operations is much narrower than allowed in XSL. Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers.
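To make the "specialized form" concrete, here is one very narrow kind of overlay, sketched in Python. It is an assumption-laden illustration, not the scheme either poster actually uses: trees are flattened to path-to-value maps, the only operations are "override a value" and "add a value", and the paths and values shown are invented for the example.

```python
from collections import ChainMap

# Base tree and delta tree, each flattened to {path: value}.
# All paths and values here are made up for illustration.
base = {"/config/host": "alpha", "/config/port": "80"}
delta = {"/config/port": "8080", "/config/debug": "on"}

# Lookups consult the delta first and fall back to the base,
# so the base tree itself is never modified.
view = ChainMap(delta, base)
merged = dict(view)
```

Deletions, reordering, and structural moves are exactly what this form cannot express, which is where something as general as XSL comes in.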
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Sun Mar 28 11:21:37 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:44 2004 Subject: Fast filter support in SAX2 In-Reply-To: <199903280541.AAA11787@locke.ccil.org> References: <199903280541.AAA11787@locke.ccil.org> Message-ID: * Lars Marius Garshol | | However, then we need to define the element stack interface and what | should be included there. Just the elements? Elements and attributes? | Elements, attributes and sibling number? Which entity each element | comes from? * John Cowan | | Just the element types, IMHO. This is very easy to expose as Strings | for almost any kind of parser. If you want more, do it yourself. In that case you'll need to make your own stack in addition to the element stack. Maybe we should consider providing some means of annotating the element stack? Some kind of property/value scheme? * Lars Marius Garshol | | Maybe this should be done outside the SAX core? * John Cowan | | It certainly can be, but the parser is already doing it, and why | reinvent the wheel? Because, like I pointed out earlier, the parser probably has more than just element names in its stack, like the entity it appeared in, and in some cases possibly location information as well. That needs to be stripped out when presenting this to the application, which may well be slower than letting the application do this for itself. And for many applications more information in the stack is needed, and then what? --Lars M. xml-dev: A list for W3C XML Developers. 
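For comparison, this is roughly what an application-maintained stack with a property/value annotation per entry looks like. The sketch uses Python's standard `xml.sax` module; the class name `StackHandler`, the annotation keys `sibling` and `children`, and the path notation are all assumptions for illustration, not part of any SAX2 proposal.

```python
import xml.sax


class StackHandler(xml.sax.ContentHandler):
    """Keeps an element stack where each entry carries an annotation dict."""

    def __init__(self):
        super().__init__()
        self.stack = []   # list of (element name, annotations)
        self.paths = []   # recorded only so the result can be inspected

    def startElement(self, name, attrs):
        sibling = 0  # the root has no parent, so it gets sibling number 0
        if self.stack:
            notes = self.stack[-1][1]
            notes["children"] = notes.get("children", 0) + 1
            sibling = notes["children"]
        self.stack.append((name, {"sibling": sibling}))
        self.paths.append("/".join(f"{n}[{a['sibling']}]" for n, a in self.stack))

    def endElement(self, name):
        self.stack.pop()


handler = StackHandler()
xml.sax.parseString(b"<a><b/><b><c/></b></a>", handler)
```

Because the annotations are an ordinary dictionary, an application can hang anything else it needs off each entry without the parser knowing about it.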
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Sun Mar 28 11:45:06 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:44 2004 Subject: IDL for SAX2 In-Reply-To: <199903280035.RAA09535@malatesta.local> References: <199903280035.RAA09535@malatesta.local> Message-ID: * Lars Marius Garshol | | Many things in SAX won't wash in IDL, such as the use of the | Java-specific InputStream, Reader and Locale objects. * uche ogbuji | | IDL does not concern itself with the implementation of any object: | strictly interface, as its name promises. You can use InputStream, | Reader, etc. etc. to your heart's content. In fact, you can even | improve on the current Java approach by actually _defining_ the | interface for those classes, saving non-Java users a spurious trip | through the Javadoc. Maybe, but you don't want these to be InputStream, Reader and Locale in Python, C++ or Common Lisp. You want them to be Python file objects, C++ streams and Common Lisp streams. So although it may work for Java it won't work as well everywhere else. * Lars Marius Garshol | | Also, IDL has a problem in that it's sort of a least common | denominator, and thus leaves out many useful language-specific | things. * uche ogbuji | | Examples, please. Python 'magic' methods such as __getitem__, Common Lisp generic methods, Eiffel/Sather invariants and post-/preconditions, Sather iterators, Python/Common Lisp keyword arguments, Java/C++ overloading and so on.
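As an illustration of the first and fifth items in that list, here is what a Python-native attribute list might look like. The class below is hypothetical; it is not the SAX `AttributeList` interface or any existing binding, just a sketch of the idioms (subscripting, `len()`, `in`, keyword arguments) that an IDL-generated stub would not give you.

```python
class AttributeList:
    """Hypothetical Python-native attribute list (not the SAX Java interface)."""

    def __init__(self, **attrs):       # keyword arguments: AttributeList(id="x1")
        self._attrs = dict(attrs)

    def __getitem__(self, name):       # enables attrs["id"]
        return self._attrs[name]

    def __len__(self):                 # enables len(attrs)
        return len(self._attrs)

    def __contains__(self, name):      # enables "id" in attrs
        return name in self._attrs


attrs = AttributeList(id="x1", lang="en")
```

An IDL mapping would more likely produce methods along the lines of `getValue(name)` and `getLength()`, which work but read as foreign in Python code.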
This is a problem that isn't really avoidable when you want to seamlessly cross language boundaries, but it's not clear to me that that is what we want to do in this particular case. | I don't think your above example of Java-specific objects really | minimizes the usefulness of IDL. Not in general, but I think the fact that you want to map those objects to different kinds of objects in different languages does mean that IDL can't be used directly anyway, and then you may as well do the whole translation manually. To take one example we've now introduced AttributeList2, a subclass of AttributeList, which is passed to the usual startElement method. In Java you need to cast this object to get at the new methods, which is awkward, means a run-time type-check and sort of defeats the point of having typing in the first place. In Common Lisp you'd rather have (defmethod start-element((dh my-document-handler) (name string) (al attribute-list)) (error "Dang, we need a SAX 2.0 parser!")) (defmethod start-element((dh my-document-handler) (name string) (al attribute-list2)) ; safely use attribute-list2 with no casting, no performance penalty ; and no typing problems ) This IDL can't do for you, because IDL doesn't have the concept of generic methods. * Lars Marius Garshol | | If there ever is a published SAX spec I think it should use IDL to be | politically correct and point out potential language-mapping problems. | However, the actual utility of IDL I think is low in this particular | case. * uche ogbuji | | It's not a matter of "politically correct". IDL is an excellent | _engineering_ tool whenever you need to define interface. I have | used it time and time in my career, and I find that the ability to | generate stubs in any native language directly from the IDL, thus | ensuring adherence to the interface, saves much development time. When you want to talk to implementations in other processes, on other computers and in other languages, yes. 
But I don't think that's really what we want in this case, at the cost of having less natural translations to the various languages. --Lars M. From larsga at ifi.uio.no Sun Mar 28 12:07:38 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:10:44 2004 Subject: Proposed new kind of SAX2 thing, with example In-Reply-To: <199903280144.UAA03387@locke.ccil.org> References: <199903280144.UAA03387@locke.ccil.org> Message-ID: * David Megginson | | Use the following from Parser2 (née ModParser): | | public abstract Object get (String prop) | throws SAXNotSupportedException; * John Cowan | | Ah. In that case, please add another get method with an index | value, and ditto for set. This way we can have indexed properties. You can anyway, if you just use a Vector or some equivalent as the property value. --Lars M. From ricko at allette.com.au Sun Mar 28 12:39:57 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:44 2004 Subject: Is there anyone working on a binary version of XML?
Message-ID: <003a01be78ff$438759d0$35f96d8c@NT.JELLIFFE.COM.AU> From: David Megginson >Rick Jelliffe writes: > > > One trivial way to minimise file sizes for transmission is to > > collapse white-space inside markup (e.g. [\ \t\n\r]+ becomes > > [\n]), > >Yes, that might be helpful (but only minimally in most cases). The reason I suggest it is this: at several stages in a network there is liable to be some point-to-point compression. In particular, of course, at the modem of the receiver (well, most receiving ends). XML's verboseness can be partially justified by the existence of this compression. Attempting to compress already-compressed data does not always lead to increased benefits: in fact, compressing already-compressed data can easily lead to larger files, which is why many compression systems first check that they have made any gains before writing out the compressed block. (And if you are going through 7-bit mail systems, then you can increase your transmission size by compressing data, if the data is ASCII.) When judging an XML compression, it is important to judge its effect after being recompressed by the kind of compression that is found in modems (i.e., at the bottleneck): the simple, fastest deflate found in gzip can be useful. Furthermore, it is important to recognise that, because of the slow-start algorithm in TCP/IP and the WWW having quite long ACK delays, a compression of 2:1 is not the same thing as a doubling in arrival speed: more data will arrive earlier in each packet group, but the number of packet groups may be the same. In the case of the binary version of XML being mentioned, it would be interesting to see the four-way comparison (raw XML, binary "XML", compressed XML, compressed binary). One interesting result of my tests on the interaction of short-referencing and compression was that collapsing white-space was (for my independently-produced RDF test files) just as effective as short-referencing.
(One reason might be that many compression algorithms only have a certain dictionary size, and a certain match-string size: reducing unneeded white-space may free up dictionary entries and allow more useful match-strings. Especially for on-the-fly compression, such as modems. ) I was surprised, because I thought that white-space was fairly insignificant: but I was wrong, for the data I was using (some data would fare better, I would hope, but some may be worse). So developers should pay attention to letting users keep their file sizes down: a 10 percent reduction in file size may not seem much, but if, at an extreme, all the packets are just over the size of the first packet group and the ACK latency is greater than the packet transmission time, it can result in the files completing in half the time. At the smaller file sizes of XML, and the trends to linking to external stylesheets and so on, reducing the crap in headers is quite important. In fact, I would think that it was good policy to have no unnecessary whitespace in header data in XML documents. >> and to minimize whitespace in data: (removing trailing spaces, [\ >> \t]+\n) becomes [\n], is a safe transformation, for example.) >No. It might be a safe transformation for specific XML formats, but >not for XML in general, because you don't know what people might be >using that whitespace for. Of course. But in practice text editors and some kinds of processing systems will often strip out trailing whitespace on opening or closing. So I should have said something like "It is not prudent to generate '[\t\s]+\n' where the whitespace is significant unless you are sure how software which uses that data treats trailing white-space." In any case, I was trying to say that one good way to reduce file sizes is to not generate unneeded characters in the first place: I was not proposing an external compression mechanism based on white-space collapsing. Rick xml-dev: A list for W3C XML Developers.
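Anyone wanting to repeat the raw versus collapsed comparison from this message, before and after deflate, could start from something like the sketch below. It is Python with the standard `re` and `zlib` modules; the function names and the sample document are assumptions for illustration, and the results reported above came from different tooling and data. The sketch collapses white-space runs inside tags to a single space (the message suggests a newline; either is one character) and deliberately leaves quoted attribute values and character data untouched. It assumes no `<` or `>` inside attribute values and no comments or CDATA sections.

```python
import re
import zlib

# A tag, assuming no '<' or '>' inside attribute values.
_TAG = re.compile(r"<[^<>]*>")
# Inside a tag: a quoted value (kept as-is) or a run of white space.
_PIECE = re.compile(r'"[^"]*"' r"|'[^']*'" r"|[ \t\r\n]+")


def _collapse(match):
    piece = match.group(0)
    return piece if piece[0] in "\"'" else " "


def collapse_markup_whitespace(xml_text):
    """Collapse white-space runs inside tags; text and quoted values are kept."""
    return _TAG.sub(lambda m: _PIECE.sub(_collapse, m.group(0)), xml_text)


raw = '<doc   a = "1  2"\n\n     b="3"   >keep   this</doc  >'
small = collapse_markup_whitespace(raw)

# The four sizes to compare; no particular outcome is claimed here,
# since very small inputs say little about real documents.
sizes = {
    "raw": len(raw),
    "collapsed": len(small),
    "raw+deflate": len(zlib.compress(raw.encode("utf-8"))),
    "collapsed+deflate": len(zlib.compress(small.encode("utf-8"))),
}
```

Running this over a realistic corpus, rather than the toy string above, is what would show whether the collapsed and deflated sizes differ enough to matter.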
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From harvey at eccnet.eccnet.com Sun Mar 28 13:23:44 1999 From: harvey at eccnet.eccnet.com (Betty L. Harvey) Date: Mon Jun 7 17:10:44 2004 Subject: Melissa Virus Article In-Reply-To: <003a01be78ff$438759d0$35f96d8c@NT.JELLIFFE.COM.AU> Message-ID: Since this listserve was hit with the Melissa Virus Friday night, I thought some of you might be interested in an article in yesterdays Washington Post concerning the virus: http://www.washingtonpost.com/wp-srv/business/daily/march99/virus27.htm Betty /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/ Betty Harvey | Phone: 301-540-8251 FAX: 4268 Electronic Commerce Connection, Inc. | 13017 Wisteria Drive, P.O. Box 333 | Germantown, Md. 20874 | harvey@eccnet.com | Washington,DC SGML/XML Users Grp URL: http://www.eccnet.com | http://www.eccnet.com/sgmlug/ /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ricko at allette.com.au Sun Mar 28 13:31:26 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:10:44 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG))
Message-ID: <005301be7906$745450c0$35f96d8c@NT.JELLIFFE.COM.AU>

From: David Megginson

>Rick, you're still pointing to implementation details rather than
>abstract modelling. Try to express the question in terms of the thing
>being modelled -- for example, at a project meeting, the system
>architect might ask the following question:
>
> Can SGML and XML both model a reference to a photograph, providing
> the width, height, and colour depth?
>
>The answer, of course, is 'yes' ...

(Sad and somewhat mysterious story deleted: what is the point? That anyone who discusses what information is implied by XML markup is a boofhead? In any case, the forum here is XML-DEV, not a company design meeting.)

Huh? At some level of abstraction all distinctions disappear: XML becomes the same as ethernet when the abstraction is "things that can transport characters".

Dave seems to be saying that ( X, X, X ) is the same as ( X, what-he-said, what-he-said ). I agree that (to bend LISP out of shape) eval( X, X, X ) is the same as eval( X, what-he-said, what-he-said ), but Dave seems to be saying that the fact that two things (pointed to) are the same is not "information". That seems an extraordinary claim.

john rover rex encodes more information than rover rex unless there is a schema definition in effect somehow somewhere that the strings in owner attributes follow the rule one-name=one-owner.
In the first version, that information is part of the model. In the second, it is not. Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bmhughes at ozemail.com.au Sun Mar 28 17:08:04 1999 From: bmhughes at ozemail.com.au (Baden Hughes) Date: Mon Jun 7 17:10:44 2004 Subject: OT: Melissa virus fix from NAI Message-ID: <199903281507.HAA02264@bmhughes.com> There's a fix for the Melissa virus from NAI: http://www.avertlabs.com/public/datafiles/valerts/vinfo/melissa.asp W97M/Melissa 3/27/99 W97M/Melissa Melissa is a Word 97 Class Module Macro virus that can also be upconverted to a Word 2000 Macro Virus. It was first discovered by NAI's Dr Solomon's VirusPatrol today on the alt.sex newgroup. The virus has spread rapidly around the world, and has infected thousands Symptom The virus can infect a system by being received from another infected user via Outlook. This appears to be the most common method of infection. Users will not know they have been infected, nor will the sender know the document has been sent. A user may become alerted to the infected document if the Macro Security settings are enabled. This warning will be displayed to the user when the document is opened. Pathology When the infected document is opened, the virus checks for a setting in the registry to test if the system has already been infected. If the system hasn't been infected, the virus creates an entry in the registry: HKEY_CURRENT_USER\Software\Microsoft\Office\"Melissa?" = "... by Kwyjibo" (If this key exists the email process will not execute, the virus will still infect. AVERT advises that it not be removed.) 
(As a preventive message you can create this registry key to prevent the virus from launching) This virus also creates an Outlook object using Visual Basic instructions and reads the list of members from Outlook Global Address Book. An email message is created and sent to the first 50 recipients programatically all the address books, one at a time. The message is created with the subject "Important Message From ? " The message body of text reads "Here is that document you asked for ... don?t show anyone else ;-)". The active infected document is attached and the email is sent. The most prevalent document being seen is one called List.DOC, however this is NOT the only document that can be sent or received. Once the system is infected all documents that are opened are infected. As any document can be sent, a user that receives the infected document, who hasn?t been infected, can become infected with this document, and the process will continue. The virus does have a payload. If the day equals the minute value, and the infected document is opened this text is inserted at the current cursor position: " Twenty-two points, plus triple-word-score, plus fifty points for using all my letters. Game's over. I'm outta here." This virus checks for low security in Office2000 by checking the value from the registry; if the value HKEY_CURRENT_USER\Software\Microsoft\Office\9.0\Word\Security\"Level" is not null, the virus will disable the "MACRO/SECURITY" menu option. Otherwise Word97 menu option "TOOLS/MACRO" is disabled. Comments inside the macro virus include: 'WORD/Melissa written by Kwyjibo 'Works in both Word 2000 and Word 97 'Worm? Macro Virus? Word 97 Virus? Word 2000 Virus? You Decide! 'Word -> Email | Word 97 <--> Word 2000 ... it's a new age! xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From sdw at lig.net Sun Mar 28 17:57:26 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:10:44 2004
Subject: Fw: Is there anyone working on a binary version of XML?
References: <010401be78f3$e35001d0$5402a8c0@oren.capella.co.il>
Message-ID: <36FE594D.C43A4F2C@lig.net>

Oren Ben-Kiki wrote:

> Stephen D. Williams wrote:
> >One other subject that I haven't mentioned, but need for another
> >architecture that I designed a while ago is a mechanism for 'parallel
> >inheritance' overlay tree processing. Has anyone else worked on this?
> >The idea is to have one or more base trees and work with a delta tree
> >which represents changes from the underlying trees. This last part is a
> >basic data structure for a rule engine and metadata application
> >environment I designed last year.
>
> For general XML trees, I think you'll find that the only way to describe a
> 'delta' on a tree is using an XSL stylesheet, or something as complex, so
> you might as well stick with XSL. We use "delta trees" very heavily, but in
> a somewhat specialized form suitable for our application - the input trees
> have to be in a very strict format and the set of operations is much
> narrower than allowed in XSL.

I don't understand how to use XSL in a general way to achieve a 'delta tree' architecture. I have a vague idea, but nothing that I could see being automated sufficiently. Can you elaborate?

In my case I'm really talking about a specialization also. Certain processing or data interpretation rules would have to be used, although these could be specified with attributes to allow a full range of possibilities.
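As a rough illustration of the base-tree-plus-delta idea quoted above, here is a minimal sketch of a lookup that consults a writable delta layer before any read-only base layers. Everything in it is an assumption made for the example rather than the design discussed in the thread: nodes are addressed by a path tuple instead of unique IDs, a delta entry always replaces (never merges with) the entry beneath it, deletes are recorded as tombstones, and the class and method names are hypothetical.

```python
# Marker stored in the delta layer to record that a base entry was deleted.
_DELETED = object()


class OverlayTree:
    """Read-only base layers plus one writable delta layer (hypothetical sketch)."""

    def __init__(self, *bases):
        self.bases = bases   # dicts keyed by path tuple; never modified here
        self.delta = {}      # only the changes made on top of the bases

    def set(self, path, value):
        # Assumption: a delta entry replaces, rather than adds to, the base entry.
        self.delta[path] = value

    def delete(self, path):
        # A tombstone hides the entry in every underlying layer.
        self.delta[path] = _DELETED

    def get(self, path, default=None):
        # Walk the layers top-down; the first layer that knows the path wins.
        for layer in (self.delta, *self.bases):
            if path in layer:
                value = layer[path]
                return default if value is _DELETED else value
        return default


if __name__ == "__main__":
    base = {("form", "title"): "Order", ("form", "qty"): "1"}
    session = OverlayTree(base)
    session.set(("form", "qty"), "3")
    session.delete(("form", "title"))
    print(session.get(("form", "qty")), session.get(("form", "title")), base)
```

Keying by path only moves the identity problem around: if the base document is restructured, the paths recorded in the delta no longer line up, which is one face of the ID-management difficulty.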
The situation that I am solving is where you have a base XML document and want to treat it as a read-only base where changes are made to an overlaid read-write layer (or layers). 'Lookups' would traverse a series of trees to determine the current state. The problems are related to ambiguous situations such as whether a read-write entity replaces or adds to an underlying layer, how to handle deletes, etc. There are a number of possible partial solutions, but it's difficult to find a completely general solution. For instance, using unique IDs creates a problem of managing and assigning unique IDs.

This kind of thing really does have real-world application. A year ago I designed a rule engine for business rule processing in a web application that used this kind of data structure. The rulebase could have thousands of entries for structure and metadata where the session state for each user would only consist of a few fields that were modified or had values. Obviously a great optimization. Actually developing this is still on my short list.

I don't know whether this 'delta tree' aspect has solid prospects for becoming commonly used, but I need it.

sdw

> Have fun,
>
> Oren Ben-Kiki

--
OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From martind at netfolder.com Sun Mar 28 18:01:48 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:10:44 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <003a01be78ff$438759d0$35f96d8c@NT.JELLIFFE.COM.AU>
Message-ID:

Hi,

We also have to consider that the transport mechanism can provide compression. HTTP already has this feature, though it goes unused. I made some tests with HTTP 1.1 and compressed transport and got some improvement for every kind of text transport. Because the compression is taken care of in the transport layer, the XML parser receives a document as usual.

When I probed the mechanism further, here is what I found happening in IE. (Currently we are working on a paper on a new architecture for Mozilla Netlib.)

In IE 5.x the server sends the data in HTTP 1.1 compressed mode. The client receives the data in compressed mode, then for each 8K chunk calls a MIME filter that does the decompression. After the decompression is done by the MIME filter, the transaction handler gives the chunks to the document handler and therefore to the parser.

In Mozilla we do not have this intermediary step. The transaction handler deals only with a protocol handler, and a MIME filter is missing. To be more versatile and be able to process the stream adequately we have to add the notion of filters.

So, as already mentioned in Simon's XML architecture, we can say that browsers implement some parts of the layers, like:

a) transaction processing
b) MIME filtering (compression, decryption, etc...)
c) Document handling routing

So in a browser's architecture we have:

Transaction handler ----> Protocol handler -----> MIME filter------> Document manager ------> document handler (here you find the XML parser)

In IE, these elements are already independent of the browser and are encapsulated in a module called UrlMon. Basically, you provide a display name like "http://www.netfolder.com/" and UrlMon acts as a transaction handler, calls the MIME filters and gives you the data.

In Mozilla's new Gecko architecture this is also a separate XPCOM module, named Netlib. You also provide a display name; the module acts as a transaction handler, calls the right protocol handler and gives you the data (encrypted, uncompressed, etc). There is work in progress to add the notion of MIME filters.

Both modules share a similar kind of interface based on COM. This simply means that an object can have multiple interfaces and that a QueryInterface mechanism is available to obtain the right one. So the Mozilla XPCOM and Microsoft COM implementations are very similar. Both kinds of objects have binary signatures in the form of a C++ pure virtual interface. Both interfaces have mandatory members like AddRef, Release and QueryInterface. The difference is in the way you instantiate objects and how they are registered. But, in both cases, the transaction manager and its helper modules (protocol handlers and MIME filters) are accessible to applications other than browsers.

There is also another thread of evolution named HTTP NG. As you know, HTTP is like a remote object with Get, Put, Post, Delete methods. WebDAV (already implemented in IE 5.x WebFolders) recently added PROPFIND, PROPPATCH, etc. The intent of HTTP NG is to be able to create any kind of object, with the current HTTP 1.1 (WebDAV) being one type of object: a Document object. So, with HTTP NG, you'll be able to create your own object. This means that the evolution of HTTP on one side and XML on the other side is creating a competitor to OMG.
Here's why: OMG's CORBA, as already mentioned on this list, is middleware with:

a) an interface language that can be mapped to different languages for concrete implementation
b) a marshalling format for object communication.

So, here is what's happening in the Web world:

a) The interface definition language is still absent, and it could be OMG IDL. But today, no choice has been made on this. We'll need some more work on the HTTP NG front before this happens. But when it does happen, the web will be a web of diverse distributed objects and not solely distributed documents.

b) The marshalling format is becoming more and more XML. However, the drawback of XML is that it wastes a lot of transport bandwidth compared to other, more efficient formats. To compensate for that, the transport layer tries to use compression to reduce the packet payloads. HTTP 1.1 already provides this, but it is used rarely (if at all).

As you know, things are moving slowly. In the next 3 to 4 years, most browsers in the field will support HTTP 1.1 and compression (about 65% do today). The worst case is the server side, where a lot of improvised servers are not well configured to support adaptive compression negotiation even if they can. This is mainly a knowledge barrier and not a technical issue. But some years from now, compressed transport will be the de facto way.

This was my Sunday morning 2 cents.

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From b.laforge at jxml.com Sun Mar 28 18:04:46 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:10:44 2004
Subject: Fast filter support in SAX2
Message-ID: <006501be7935$70e78040$c8a8a8c0@thing1>

From: David Megginson

>Yes, but as someone (James Clark?) pointed out during the last round,
>with most serious applications you're going to end up doing hash
>lookups anyway, so the == doesn't buy you much.

At first blush, I had to agree with you. But consider the more interesting pattern matching scenarios. It's not always reasonable to have to map all processing into a hash lookup.

I'm really just suggesting a capability here. Just another way to tune an application. If interned strings are used by the parser, why not share that capability with filters/applications?

Suppose we have a parser-kernel that we want to use with some new wonderful schema that has been implemented in a filter? Something that allows content validation based on ancestor patterns? Unless you are willing to write some pretty convoluted code, interned strings would be helpful.

Bill

From anderst at toolsmiths.se Sun Mar 28 18:18:55 1999
From: anderst at toolsmiths.se (Anders W.
Tell)
Date: Mon Jun 7 17:10:45 2004
Subject: Is there anyone working on a binary version of XML?
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU> <14076.63486.939606.141332@localhost.localdomain> <36FD16C2.B386B0F1@toolsmiths.se> <36FDA44B.C48CA611@iol.ie>
Message-ID: <36FE566D.B413A140@toolsmiths.se>

Alan Kennedy wrote:

> Don't forget about IIOP, the CORBA "RPC" protocol. While not the absolutely optimal transport
> protocol, it has been optimised by a group of experts (I think :-) for platform, transport,
> endian, etc, independence.

And don't forget Microsoft DCOM and DCE :)

> If you want to experiment with an IDL compiler for JAVA, OrbixWeb is pretty good, and you can get
> a 60 day evaluation from the IONA web site, at
>
> http://www.iona.com/info/products/orbixweb/index.html
>
> There are free IDL compilers and ORBs available too.

I use TAO for my experiments since I mainly implement in C/C++. It's one of the few CORBA implementations with real-time extensions.

> But if you take the OMG-CORBA approach of "Objects exist; you (the client) don't need to know
> where or how they are stored. Simply refer to their object reference and they will be instantiated
> for you and made available as if they were in your local address space."

One problem with the current design of the DOM IDL is that each element in an XML document is a CORBA object, and for large documents this means a *lot* of object references. However, the newer POA has a design which handles use cases like this better than the old BOA. Unfortunately not many ORBs have implemented it.

My view on remote DOM trees is that in most cases it's more efficient to transfer the whole document to the client side and access it there in a local DOM tree. The reason for this is that traversing a remote DOM tree using current IDLs would generate an enormous amount of network traffic, and the document transfer + DOM building time on the client side would be significantly smaller.
/Anders -- /_/_/_/_/_/_/_/_/_/_/_/_/_/_/ / Financial Toolsmiths AB / / Anders W. Tell / /_/_/_/_/_/_/_/_/_/_/_/_/_/_/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From anderst at toolsmiths.se Sun Mar 28 18:27:50 1999 From: anderst at toolsmiths.se (Anders W. Tell) Date: Mon Jun 7 17:10:45 2004 Subject: Fw: Is there anyone working on a binary version of XML? References: <010401be78f3$e35001d0$5402a8c0@oren.capella.co.il> Message-ID: <36FE5885.2EAFCB88@toolsmiths.se> Oren Ben-Kiki wrote: > For general XML trees, I think you'll find that the only way to describe a > 'delta' on a tree is using an XSL stylesheet, or something as complex, so > you might as well stick with XSL. Interesting! Im about to go into this area very soon and would be very interested in any pointers to how to represent XML/DOM tree deltas. Best /Anders -- /_/_/_/_/_/_/_/_/_/_/_/_/_/_/ / Financial Toolsmiths AB / / Anders W. Tell / /_/_/_/_/_/_/_/_/_/_/_/_/_/_/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lisarein at finetuning.com Sun Mar 28 18:38:45 1999 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:10:45 2004 Subject: OFF: Re: Melissa Virus Article References: Message-ID: <36FE5E93.27DEF3C4@finetuning.com> so then, would a fix be to not read email when you have a word doc open? And this only effects outlook users? Fred, were you using outlook? thanks, lisa Betty L. Harvey wrote: > > Since this listserve was hit with the Melissa Virus Friday > night, I thought some of you might be interested in an > article in yesterdays Washington Post concerning the > virus: > > http://www.washingtonpost.com/wp-srv/business/daily/march99/virus27.htm > > Betty > > /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/ > Betty Harvey | Phone: 301-540-8251 FAX: 4268 > Electronic Commerce Connection, Inc. | > 13017 Wisteria Drive, P.O. Box 333 | > Germantown, Md. 20874 | > harvey@eccnet.com | Washington,DC SGML/XML Users Grp > URL: http://www.eccnet.com | http://www.eccnet.com/sgmlug/ > /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/ > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lisarein at finetuning.com Sun Mar 28 20:27:39 1999 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:10:45 2004 Subject: OFF: Re: Melissa Virus Article References: <36FE5E93.27DEF3C4@finetuning.com> Message-ID: <36FE7810.94AAA89F@finetuning.com> never mind and sorry everybody. just killing this thread myself ...(scream) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From uche.ogbuji at fourthought.com Sun Mar 28 20:31:43 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:10:45 2004 Subject: IDL for SAX2 In-Reply-To: Your message of "28 Mar 1999 11:44:58 +0200." Message-ID: <199903281831.LAA10836@malatesta.local> > > * Lars Marius Garshol > | > | Many things in SAX won't wash in IDL, such as the use of the > | Java-specific InputStream, Reader and Locale objects. I understand your point much better than after your first post, thanks. I had the impression that you were saying that certain interfaces that happen to be implemented in Java could not be implemented in IDL. So you say that IDL is more useful if one desires direct language and platform transparency, rather than as a general protocol-definition language. 
I agree with that assessment, but I'll also point out that it's no worse in that regard than Java. All of the litany of non-Java language-specific elements you mention still need to be translated from Java, as they would from IDL, so I still don't see how that acts as an argument against IDL. Java doesn't support Python's __getitem__ semantics, for instance.

When using IDL purely for design presentation, you can add all the comments you like to motivate language-specific features. At least then you have a common core, and the language-specific elements are a clear departure, rather than something one has to puzzle out from the behavior of Java.

If there were another language that supported defining the interface with more flexibility for language-specific constructs, I wouldn't mind using that rather than IDL. Do you have any to suggest? As it is, however, parsers come in C, C++, Python, Java, Perl, etc., and I don't see why we shouldn't use the most widely recognized middle-ground language for sharing interfaces between these languages (maybe recognition is the politics you were referring to earlier, but I choose to believe that IDL has real merits).

But as I type, I realize that the great majority of contributors to SAX2 seem to have a Java bent, so maybe it's just best for Dave Megginson to publish Java-SAX2, and to have it translated to IDL (I guess I'll volunteer to do so, as I'm the lone advocate so far). I do think that this will help others outside this list as they have to implement SAX2 in their work. After all, we want more standardization around SAX, right?

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Mar 28 20:45:22 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:10:45 2004 Subject: half-baked parsers vs binary XML Message-ID: <3.0.32.19990328104316.00bb89d0@pop.intergate.bc.ca> At 07:27 PM 3/27/99 -0800, Gabe Beged-Dov wrote: >XML::Parser is layered on expat. Anecdotal evidence seems to be that there is an order of >magnitude performance advantage to "parsing" something other than XML. The two alternatives >are a textual format that Perl can eval directly (Data::Dumper) and a binary format >(Storable). That's because there is some breakage in the design of the expat/perl linkage. In my tests, given a file of a couple of meg, perl can read it in under a second, xmlwf (i.e. raw expat) in an almost unmeasurably-short time, and XML::Parser takes 10 seconds plus. This is just a bug and will be fixed. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jgarrett at navix.net Sun Mar 28 23:45:35 1999 From: jgarrett at navix.net (Jim Garrett) Date: Mon Jun 7 17:10:45 2004 Subject: Melissa virus fix - (in case you haven't already been there) In-Reply-To: Message-ID: <000701be7962$3beebc50$58c8c8c8@jgp400> Melissa Virus fix...FYI (in case you haven't already been there) http://www.microsoft.com/security/bulletins/ms99-002.asp http://officeupdate.microsoft.com/downloaddetails/wd97sp.htm |-----Original Message----- |From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of |Rzepa, Henry |Sent: Saturday, March 27, 1999 2:31 PM |To: xml-dev@ic.ac.uk |Subject: LISTADMIN: The "Melissa" Virus | | |This list was hit earlier by the "Melissa" virus; | |http://www.news.com/News/Item/0,4,34334,00.html | |Apparently,, not many anti-viral programs detect it yet. |Please take great care with Word/Outlook combinations. |If anyone knows of anti-viral tools that detect this, please |let me and I will alert this list. | |Many thanks. | | |Henry Rzepa. +44 171 594 5774 (Office) +44 171 594 5804 (Fax) |http://www.ch.ic.ac.uk/rzepa/ | |xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk |Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on |CD-ROM/ISBN 981-02-3594-1 |To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; |(un)subscribe xml-dev |To subscribe to the digests, mailto:majordomo@ic.ac.uk the |following message; |subscribe xml-dev-digest |List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) | xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jackpark at thinkalong.com Sun Mar 28 23:57:45 1999 From: jackpark at thinkalong.com (Jack Park) Date: Mon Jun 7 17:10:45 2004 Subject: Virus in my last e-mail In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3D3@rainier.cdgpd.com> Message-ID: It's bad enough that you send viruses. Worse yet, you force your vcf card on all of us. At 04:09 PM 3/26/99 -0800, you wrote: >Folks, > >The last e-mail I sent had a virus in the attached word document. PLEASE >don't open the document. In our office it caused Outlook 98 to autosend >itself to everyone on our address lists, turned off virus checking in word >(tools/options/general/macro virus protection), and modified the default >template normal.dot. > >Sorry! > > > <> > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrys at microsoft.com Mon Mar 29 00:51:50 1999 From: mrys at microsoft.com (Michael Rys) Date: Mon Jun 7 17:10:45 2004 Subject: Important Message From Fred McLain (READ FIRST!!!) Message-ID: <25983782061AD111B0800000F86310FE14282F76@RED-MSG-42> This mail contained the MELISSA Word macro virus. > ****** Message from InterScan E-Mail VirusWall NT ****** > > ** WARNING! Attached file list1.doc contains: > > W97M_MELISSA.A virus > > The infected file has been cleaned. 
> You will be sent a separate e-mail with the cleaned file. > > Please go to [internal web page] and install Inoculan. If > already installed please ensure you have the latest signiture > file. > > ***************** End of message *************** > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mda at discerning.com Mon Mar 29 01:48:50 1999 From: mda at discerning.com (Mark D. Anderson) Date: Mon Jun 7 17:10:45 2004 Subject: xhtml and the p tag Message-ID: <02fa01be7975$b1bf6e80$0200a8c0@mdaxke.mediacity.com> (not sure where xhtml discussion should go; all i see on the www-html@w3.org list are stultifying discussions about tag case-sensitivity.) in the strict dtd from http://www.w3.org/TR/WD-html-in-xml/ , the p element is %Inline, which means it can't include any block level elements such as ul. So now we have a quandary. in practical terms, what i usually want is something like a non-existent . That doesn't exist, because in browsers a
<br> breaks the line; it doesn't end the paragraph, and particularly now with xhtml, <p> is deprecated. But even if I *did* the extra work to wrap <p>...</p>
around my paragraphs, that still wouldn't work, because a p can't enclose any block level elements such as a ul. -mda xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Mon Mar 29 02:47:39 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:10:45 2004 Subject: Whence XQL? In-Reply-To: <3.0.3.32.19990325220915.032a1480@pop.mindspring.com>; from Jonathan Robie on Thu, Mar 25, 1999 at 10:09:15PM -0500 References: <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> <000b01be7702$102babd0$0100007f@eps.inso.com> <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> <19990326133124.B7318@io.mds.rmit.edu.au> <3.0.3.32.19990325220915.032a1480@pop.mindspring.com> Message-ID: <19990329104719.A13271@io.mds.rmit.edu.au> On Thu, Mar 25, 1999 at 10:09:15PM -0500, Jonathan Robie wrote: > At 01:31 PM 3/26/99 +1100, Marcelo Cantos wrote: > > >I could be disingenuous ( :-) ) and suggest that the attachment to > >Microsoft has more than a little to do with its success to date, but I > >certainly don't want to disparage the effort in its own right. It > >offers a good compromise between expressivity and simplicity, which is > >a far more practicable goal than completeness. > > Well, Microsoft was one of the first companies I got interested in XQL ;-> > > >I am concerned (am I right on this?) at the lack of proximity > >operators. But that's just an implementor's perspective, looking at > >doing things we already support. > > Cool, you work on SIM? (Does that make you a SIMian?) Cute! 
It might just take off around here. :-) > I really enjoyed > talking to Timothy Arnold-Moore at Markup Technologies '98 - Makoto > Murata-san and I managed to snag him after his presentation and grill him > with questions for a while. > > I've gone back and forth on proximity operators. Several people who have > implemented full-text search systems have told me that users don't really > use proximity operators, that they are useful in the implementation, but > need not be exposed to the user. Others vehemently disagree. I took the > pragmatic approach of leaving it out to see who would complain. Frankly, > you are the first to do so. I do wonder what proportion of people looking seriously at XQL are into text. We find WITHIN N to be exceedingly useful. It is also interesting to note that we only offer proximity at the word level and that this is all clients ever really want. We do also offer same sentence/paragraph queries, but virtually no-one uses them. > I have discussed proximity searching as a possibility in the > following paper: > > http://www.w3.org/TandS/QL/QL98/pp/murata-san.html > > Here's an excerpt: > > > > In addition, functions for proximity searching might be useful. The > following returns elements in which "rose*" and "sweet*" > occur within 10 words of each other: > > LINE[near("rose*", "sweet", 10)] This would match lines like these: > > A rose by any other name would smell as sweet. > Sweet roses grew along the south side of the fence. > She rose and smiled sweetly at the purple dwarf under the > bucket. Say, has anybody seen my Sweet Gypsy > Rose? > > Proximity searching requires some way to indicate how close the > strings must be in order to match. This causes a difficulty when > choosing the units in which proximity is measured. 
> In existing full-text systems, distance is frequently measured in terms of
> words, which raises a number of significant questions regarding
> internationalization, but is probably an intuitive way to measure
> distance for most users.
>
> I'm not sure whether this is the best approach or not. Do you like
> this approach? If not, what approach would you prefer?

It's an interesting angle, though not one I had considered (not that I have considered many angles :-). I had understood, perhaps incorrectly, that the only way to perform word-level boolean queries was to treat words abstractly as leaf nodes of the document tree rather than clumps of opaque string data. Under this conception, to find "other name", one would say:

    LINE[WORD="other"; WORD="name"]

It could possibly be made legal to abbreviate the above to:

    LINE["other"; "name"]

which would be interpreted as "a LINE element which is the parent of a leaf node equal to 'other' immediately preceding a leaf node equal to 'name'".

Now, support for proximity ("rose*" within 10 words of "sweet") would simply be a matter of:

    LINE["rose*" %10 "sweet"]

(The %N syntax is borrowed from our query language.) Higher-level proximities could be done like this:

    LINE["name"] %10 LINE["purple"]

The operator simply adopts the level of its operands; mismatched operands constitute an error.

Caveat: I confess that I don't know XQL very well at all, so I may be saying something completely different to what I intended with the above examples. Corrections are most welcome.

Cheers,
Marcelo

--
http://www.simdb.com/~marcelo/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Mar 29 02:51:17 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:10:45 2004 Subject: Proposed new kind of SAX2 thing, with example In-Reply-To: from "Lars Marius Garshol" at Mar 28, 99 12:07:29 pm Message-ID: <199903290049.TAA06821@locke.ccil.org> Lars Marius Garshol scripsit: > You can anyway, if you just use a Vector or some equivalent as the > property value. Vectors are no real substitute for indexed properties, because they require exposing the collection rather than just its elements, and the bean can't get control when an element is changed. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Mon Mar 29 04:45:31 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:10:45 2004 Subject: xhtml and the p tag Message-ID: <004401be798d$ff8357e0$a2aedccf@ix.netcom.com> Hi, The following is an extract from the strict4.0 DTD In HTML 4.0 a paragraph can only contain an inline element, just the same as in XHTML Frank (speaking for myself) Frank Boumphrey XML and style sheet info at Http://www.hypermedic.com/style/index.htm Author: - Professional Style Sheets for HTML and XML http://www.wrox.com CoAuthor: XML applications from Wrox Press, www.wrox.com Author: Using XML on the Web (Aug) ----- Original Message ----- From: Mark D. Anderson To: XML List Cc: Sent: Sunday, March 28, 1999 6:49 PM Subject: xhtml and the p tag >(not sure where xhtml discussion should go; all i see on the www-html@w3.org >list are stultifying discussions about tag case-sensitivity.) > >in the strict dtd from http://www.w3.org/TR/WD-html-in-xml/ , >the p element is %Inline, which means it can't include any >block level elements such as ul. So now we have a quandary. > >in practical terms, what i usually want is something >like a non-existent . That doesn't exist, because >in browsers a
<br> breaks the line; it doesn't end the >paragraph, and particularly now with xhtml, <p> is >deprecated. But even if I *did* the extra work to wrap ><p>...</p>
around my paragraphs, that still wouldn't >work, because a p can't enclose any block level elements >such as a ul. > >-mda > > > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mda at discerning.com Mon Mar 29 04:54:57 1999 From: mda at discerning.com (Mark D. Anderson) Date: Mon Jun 7 17:10:45 2004 Subject: xhtml and the p tag Message-ID: <033c01be798f$c5570010$0200a8c0@mdaxke.mediacity.com> >In HTML 4.0 a paragraph can only contain an inline element, just the same as >in XHTML Right; html is broken too. But I've already given up on html :). I'm still curious what a better content model for paragraphs would be. (Sorry, i suppose this topic belongs somewhere else even if xhtml-related; suggestions are welcome.) -mda xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From begeddov at jfinity.com Mon Mar 29 05:24:04 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:10:46 2004 Subject: XHTML and character entities Message-ID: <36FEF173.F52E1751@jfinity.com> I mention tidy below but am asking about html->xhtml conversion in general. I use tidy to to convert html to xhtml using the -asxml switch. The result of many conversions is still not accepted as well-formed because entities like agrave and friends aren't defined unless you process the DTD. Wouldn't it be reasonable to convert these to character entities as part of the html->xhtml process? xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Mon Mar 29 10:46:23 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:10:46 2004 Subject: how to print the XML document in IE 5.0 Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1730@EUKBANT101> > -----Original Message----- > From: Derek Denny-Brown [SMTP:derekdb@microsoft.com] > > Not to be picky, but... The "Save-As" option in IE5 for XML documents > _does_ > save the XML. > You can be even pickier if you like - the entire email below was incorrect, and I appologise. 
I think this was the case for the betas though (that's probably wrong too!). Still, Mozilla's view source is nicer 'cos it's syntax highlighted... Matt. -- http://come.to/fastnet Perl on Win32, PerlScript, ASP, Database, XML GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ > -----Original Message----- > From: Matthew Sergeant (EML) [mailto:Matthew.Sergeant@eml.ericsson.se] > > It appears that IE5 converts internally to HTML (with the XSL style > sheet), > so the answer is that you can't. Even a save to disk saves the HTML AFAIK. > Try using Mozilla - it does things right, and displays XML+XSL remarkably > well considering it's at least 6 months away from release. > > > -----Original Message----- > > From: Kevin Hsu [SMTP:shyutz@ms1.hinet.net] > > > > Can anyone tell me how to print the XML document as I see on the screen > in > > IE 5.0, thanks in advance. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Mon Mar 29 10:55:09 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:10:46 2004 Subject: half-baked parsers vs binary XML Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1731@EUKBANT101> > -----Original Message----- > From: Gabe Beged-Dov [SMTP:begeddov@jfinity.com] > > Another reason (other than the binary XML thread) that I brought this up > was discussion on > the perl-xml mailing list of whether XML::Parser was usable for soft > real-time server side > processing. The consensus there seems to be no. 
> I think it's "Yes" - if you do it right. > XML::Parser is layered on expat. Anecdotal evidence seems to be that there > is an order of > magnitude performance advantage to "parsing" something other than XML. The > two alternatives > are a textual format that Perl can eval directly (Data::Dumper) and a > binary format > (Storable). > > In both cases (Data::Dumper and Storable) there is conversion from the > on-disk format to the > in-memory format. Why is XML so much slower according to developer > feedback? That is what I > was trying to understand from other peoples experience rather than doing a > hands-on analysis > myself. > > I may have jumped to the conclusion that it was the extra work that a > well-formedness > processor has to do over what a half-baked processor would do. That still > leaves the quesion > of where the slowdown is and whether it is an implementation issue or > inherent is some aspect > of XML parsing. > I think the real problem is that you're doing 2 stages of work with XML::Parser, as opposed to using Storable or Data::Dumper. With XML::Parser I'm reading the XML and searching (querying) for specific nodes within the XML. There's work there that has to be done in finding the nodes. If I could just call parsefile() without any extra work I think it would be fast enough. What I'm really doing, by using Storable is caching the parse+query phase. That should really be considered standard practice for any high performance system. Matt. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Mon Mar 29 11:52:01 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:10:46 2004 Subject: Is there anyone working on a binary version of XML? Message-ID: <03d001be79c9$25673220$5402a8c0@oren.capella.co.il> Stephen D. Williams wrote: >I don't understand how to use XSL in a general way to acheive a 'delta tree' architecture. I >have a vague idea, but nothing that I could see being automated sufficiently. Can you >elaborate? The following (from section 2.7.12 of the current XSL draft): Will copy all input to the output without modification. You can then add templates to do specific modifications. For example: NewValue Will take all 'TAG' elements in the input document which have an 'ATTR' attribute whose value is 'OldValue' and change its value to 'NewValue'. Given the power of XSL match patterns and the power of the construction elements, I think you can express any reasonable 'delta' on the input XML tree. Of course, this is outside the scope of the XSL intent as it stands today. The transformation part of XSL is just what we need for: - An XML query language. Think about it - an XML query language should (i) be XML; (ii) allow selecting arbitrary parts of the input XML document(s); (iii) allow constructing result XML document(s). The transformational part of XSL already does 80% of that. Does anyone consider making XQL a proper superset of XSL? Not a chance. Everyone is intent on creating a new language. 
XQL at least reuses the match pattern syntax, while inventing a new incompatible way of creating the results tree; XML-QL goes for broke and reinvents the whole thing. - A standard way to convert XML documents to legacy non-XML languages. Oops, I just said non-XML languages. Excuse me. - New and unexpected uses, such as the one above: expressing differences between XML trees (which by itself has a lot of interesting applications). But no, due to historical reasons XSL was created as part of a style language, so we'll just have to use a different language for each of the above uses and any new one which comes along (making sure they are incompatible, of course). Never mind that CSS is alive and kicking and supported by the very same W3C is another way of specifying style. Never mind that CSS is staying away from anything which might look like XML syntax, and is well along the way of inventing a new match pattern language of its own, whose only advantage over the XSL one is that it is incompatible with it. I'm sure it all makes sense for _someone_. Whatever the reasons are, what I see is "Job security for XML professionals for the next millennium". Sorry, I just had to get it off my chest :-) Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tony.mcdonald at ncl.ac.uk Mon Mar 29 12:29:49 1999 From: tony.mcdonald at ncl.ac.uk (Tony McDonald) Date: Mon Jun 7 17:10:46 2004 Subject: SQL database table structure for encoding XML documents? Message-ID: Well, the subject says it all really. Does anyone have a structure that works for them that they're willing to share? ie ... CREATE TABLE ... 
... Any pointers to other resources etc. would be gratefully received. TIA tone ------ Dr Tony McDonald, FMCC, Networked Learning Environments Project The Medical School, Newcastle University Tel: +44 191 222 5888 Fingerprint: 3450 876D FA41 B926 D3DD F8C3 F2D0 C3B9 8B38 18A2 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From hb at ix.heise.de Mon Mar 29 13:25:55 1999 From: hb at ix.heise.de (hb@ix.heise.de) Date: Mon Jun 7 17:10:46 2004 Subject: Namespace Question References: Message-ID: <36FF62DA.82924B33@ix.heise.de> Hi, For a short example regarding namespaces I have used a variant of Tim's example in his XML.com article. Is it necessary (as I presume) to assign every single attribute as long as it is not from HTML? My Booklist
Dream a little dream of me Dr. Sigmund Freud
Best regards, Henning Behme iX - Magazin fuer professionelle Informationstechnik Helstorfer Str. 7 * 30625 Hannover * Germany http://www.heise.de/ix/ * +49 511 5352-374 * f: -361 ------ White, adj. and n. Black (Ambrose Bierce) ------ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 29 13:27:33 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:46 2004 Subject: half-baked parsers vs binary XML In-Reply-To: <36FDA19A.27DA7B45@jfinity.com> References: <36FD4FA4.26BCB466@jfinity.com> <14077.33404.430088.361367@localhost.localdomain> <36FD95F9.7E93A231@jfinity.com> <14077.38830.801250.747754@localhost.localdomain> <36FDA19A.27DA7B45@jfinity.com> Message-ID: <14079.25158.877601.734891@localhost.localdomain> Gabe Beged-Dov writes: > Another reason (other than the binary XML thread) that I brought > this up was discussion on the perl-xml mailing list of whether > XML::Parser was usable for soft real-time server side > processing. The consensus there seems to be no. The speed bottleneck, however, is Perl, not Expat: if you were acting off a different kind of input, it would still take just as long to execute the Perl handlers for the start and end of each element, etc. In other words, it's not the XML *input* that you need to optimize, but the *output* -- for example, if you have a Perl script that renders XML in HTML, the best speed optimization is to cache the result and reserve it for any request with the same parameters. 
The XML/SGML processing model is generally to walk through a document (as a collection of events or as a tree) and fire off handlers for different types of things. Even a short to medium-length XML document can cause the handlers to be fired off many thousands of times, and if you're trying to handle hundreds of requests per second, that's going to cause problems with or without XML. In some cases, the query processing model might help things, especially if the query code is moved into C or C++. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 29 13:41:27 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:46 2004 Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG)) In-Reply-To: <005301be7906$745450c0$35f96d8c@NT.JELLIFFE.COM.AU> References: <005301be7906$745450c0$35f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <14079.26243.658297.559636@localhost.localdomain> Rick Jelliffe writes: [...] > but Dave seems to be saying that the fact that two things (pointed to) > are the same is not "information". That seems an extraordinary claim. > > > john > rover > rex > > > encodes more information than > > > rover > rex > I don't think that I said that, though I certainly typed a lot. What I did say is that there's not a practical difference among the different alternatives in XML and SGML for expressing this information, and probably not enough to justify the parallel maintenance of the two as discrete standards. 
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 29 13:55:03 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:46 2004 Subject: Proposed new kind of SAX2 thing, with example In-Reply-To: <199903290049.TAA06821@locke.ccil.org> References: <199903290049.TAA06821@locke.ccil.org> Message-ID: <14079.27126.38040.640445@localhost.localdomain> John Cowan writes: > Lars Marius Garshol scripsit: > > > You can anyway, if you just use a Vector or some equivalent as the > > property value. > > Vectors are no real substitute for indexed properties, because they > require exposing the collection rather than just its elements, > and the bean can't get control when an element is changed. Either John is misinterpreting Lars or I am. I thought that Lars meant using a vector containing the index and the value, not a vector of all the possible values. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msabin at cromwellmedia.co.uk Mon Mar 29 15:40:09 1999 From: msabin at cromwellmedia.co.uk (Miles Sabin) Date: Mon Jun 7 17:10:46 2004 Subject: Interface name quandry again ... Message-ID: A while ago I posted to this list asking for a suggestion for a Java package name for interfaces and classes that deal with stuff in the intersection of xml and html, but which aren't sufficiently general to cover all of sgml. Someone (John Cowan, I think) suggested xhtml. That seemed like quite a good idea at the time, but since then voyager has been renamed. I'm not a big fan of overloading, so I've been scratching my head trying to think of something else. So far I've come up with precisely zilch. If anyone can help me out I'd be very grateful. Cheers, Miles -- Miles Sabin Cromwell Media Internet Systems Architect 5/6 Glenthorne Mews +44 (0)181 410 2230 London, W6 0LJ msabin@cromwellmedia.co.uk England xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jonathan at texcel.no Mon Mar 29 15:42:00 1999 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 17:10:46 2004 Subject: Java(TM)/XML(TM) Open Source Development Session Message-ID: <3.0.3.32.19990329084134.03898920@pop.mindspring.com> ExoLab is hosting open source software development sessions for Java and XML technologies. This description arrived in my inbox, and I thought it might be interesting to some people here. Jonathan Sender: root@hermes.oceanet.fr Date: Mon, 29 Mar 1999 14:26:28 +0200 From: "Isma?l Ghalimi" Organization: ExOffice, Inc. Subject: Java(TM)/XML(TM) Open Source Development Session Hi, We are pleased to announce the ExoLab, first Open Source Development Session dedicated to Java(TM) & XML (TM) technologies. The ExoLab will host Open Source software development sessions during periods ranging from one week to one month. These sessions will be open to consulting and software companies aiming at collaboratively work on the development of Open Source software. ExoLab Sessions will help consulting and software companies to share their knowledge about cutting-edge Open Source technologies thus allowing powerful technology transfers between companies having an Open Source business model. The first ExoLab Session will take place in Nantes (Loire-Atlantique, FRANCE) during May 1999 and will be mainly targeted at the development of the ExoGen Framework, an Open Source Java(TM)-based application & document server. 
More information about this initiative can be found on the ExoLab Home Page: http://www.exoffice.com/exolab.html The following developers will be present: * One engineer from Lutris Technologies http://www.lutris.com * Two engineers from SMB, the author of the Open Source Ozone OODBMS http://www.softwarebuero.de * The author of SPFC http://java.apache.org/spfc/index.html * The author of OpenXML http://www.openxml.org * Three core developers from the Java Apache Project http://java.apache.org * The author of XSL:P http://www.clc-marketing.com/xslp/ * The architect of ejboss http://www.ejboss.org All the developments done during these sessions will be licensed under the LGPL, the ALL, or a BSD-like license. It will allow the integration of these developments into commercial binaries without having to redistribute under any Open source license any modification done on it. Please contact me at ghalimi@exoffice.com for more information. Best regards Isma?l Ghalimi, CEO ExOffice, Inc. ghalimi@exoffice.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul.janssens at skynet.be Mon Mar 29 15:50:32 1999 From: paul.janssens at skynet.be (Paul Janssens) Date: Mon Jun 7 17:10:46 2004 Subject: convertor generator available Message-ID: <36FF848C.75E4@skynet.be> Version 0.0.1 of masterplan, a convertor generator is now available from http://users.skynet.be/mp/mp001.tar.Z some examples are included. You'll need gcc, bison and flex or equivalents to build both the executable and the convertors. 
Masterplan comes as a 'kit' of GPL-ed core code and less restricted library code, so the convertors themselves aren't infected by the GPL. Have fun. Paul Janssens - paul.janssens@skynet.be xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul.janssens at skynet.be Mon Mar 29 16:02:31 1999 From: paul.janssens at skynet.be (Paul Janssens) Date: Mon Jun 7 17:10:46 2004 Subject: SQL database table structure for encoding XML documents? References: Message-ID: <36FF8768.4F16@skynet.be> Tony McDonald wrote: > > Well, the subject says it all really. > > Does anyone have a structure that works for them that they're willing > to share? ie No, but from the top of my head: ENTITYNAMES tag, name ENTITY uniquekey, tag,parententitykey,index ATTRIBUTE ownerentitykey,name, value CONTENT ownerentitykey,index, data You don't have validation, but you do have referential integrity, and you can render back to XML with a simple transitive closure. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Mon Mar 29 17:28:51 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:10:46 2004 Subject: Fast filter support in SAX2 References: <001601be7886$f9db4440$c8a8a8c0@thing1> <14077.15323.327681.132673@localhost.localdomain> Message-ID: <36FF9B91.A0632786@infinet.com> David Megginson wrote: > Bill la Forge writes: > > > It would be great if filters had the same advantages as parsers in > > being able to simply test for equality (x==y) rather than having to > > do a string comparison (x.equals(y)) when checking for a specific > > element or attribute name. > > Yes, but as someone (James Clark?) pointed out during the last round, > with most serious applications you're going to end up doing hash > lookups anyway, so the == doesn't buy you much. That depends on your implementation of a hash table. Also as of JDK 1.1.6 the equals method for strings first tests for identity of the two string objects and then tests to see if the length is the same and then tests for matching of each character in each string. When dealing with names in XML they are uniformly nothing more than symbols so in application code being able to do something like this: if (x == "foo") is generally much faster than: if (x.equals("foo")) as you do not incur the overhead of calling one dynamic method. Really it depends on your code. In an XML related technology I worked on I had lots of if-else statements that did exactly this. The parser I used presented the strings to the application as interned strings and did significantly improve performance from using the equals method approach. 
Another thing that I used for speeding up my applications is to have a special hash table for interned strings. Basically all that this table did was use System.identityHashCode() instead of String.hashCode() to get a hash for the string. In effect you use the Object.hashCode() implementation.

It also depends a lot on your VM. Some VMs are good enough with dynamic method invocation that the difference between testing for string identity and string equality is negligible. The so-called Hotspot VM may even inline String.equals() into your code.

I suggest using the identity approach if possible as it is easier to read and maintain IMHO and in the general case you may get significant speedups if your application does many string comparisons. If you need a faster hash table for strings build one yourself.

Tyler

From oren at capella.co.il Mon Mar 29 17:45:53 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:10:46 2004
Subject: Fw: Is there anyone working on a binary version of XML?
Message-ID: <045d01be79fa$91f833e0$5402a8c0@oren.capella.co.il>

Stephen D. Williams wrote:
>I don't understand how to use XSL in a general way to achieve a 'delta tree' architecture. I
>have a vague idea, but nothing that I could see being automated sufficiently. Can you
>elaborate?

The following (from section 2.7.12 of the current XSL draft): Will copy all input to the output without modification. You can then add templates to do specific modifications.
For example: NewValue Will take all 'TAG' elements in the input document which have an 'ATTR' attribute whose value is 'OldValue' and change its value to 'NewValue'. Given the power of XSL match patterns and the power of the construction elements, I think you can express any reasonable 'delta' on the input XML tree. Of course, this is outside the scope of the XSL intent as it stands today.

The transformation part of XSL is just what we need for:

- An XML query language. Think about it - an XML query language should (i) be XML; (ii) allow selecting arbitrary parts of the input XML document(s); (iii) allow constructing result XML document(s). The transformational part of XSL already does 80% of that. Does anyone consider making XQL a proper superset of XSL? Not a chance. Everyone is intent on creating a new language. XQL at least reuses the match pattern syntax, while inventing a new incompatible way of creating the results tree; XML-QL goes for broke and reinvents the whole thing.

- A standard way to convert XML documents to legacy non-XML languages. Oops, I just said non-XML languages. Excuse me.

- New and unexpected uses, such as the one above: expressing differences between XML trees (which by itself has a lot of interesting applications).

But no, due to historical reasons XSL was created as part of a style language, so we'll just have to use a different language for each of the above uses and any new one which comes along (making sure they are incompatible, of course). Never mind that CSS is alive and kicking and supported by the very same W3C as another way of specifying style. Never mind that CSS is staying away from anything which might look like XML syntax, and is well along the way of inventing a new match pattern language of its own, whose only advantage over the XSL one is that it is incompatible with it. I'm sure it all makes sense for _someone_. Whatever the reasons are, what I see is "Job security for XML professionals for the next millennium".
Sorry, I just had to get it off my chest :-)

Have fun, Oren Ben-Kiki

From timm at channelpoint.com Mon Mar 29 18:20:20 1999
From: timm at channelpoint.com (Tim McCune)
Date: Mon Jun 7 17:10:46 2004
Subject: OFF: (waaay off topic) RE: LISTADMIN: No attachments to list mess ages PLEASE
Message-ID: <8A24EC12044FD21195E200600895E0B3016363B4@goat.channelpoint.com>

Damned eloquent David. But I'd put the poster at #1 on that list for being ignorant enough to open a Word document that was attached to an e-mail message. Your comment about technical diversity indicates to me that you've never been a system administrator. ;)

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of David Megginson

As became clear in the follow-ups, the posting was done by a worm that hides in Word macros (the Internet's equivalent of animal dung, apparently) and exploits gaping security holes in Outlook to mail itself out to everyone in a person's address list. In other words, the original poster did *not* post the attachment to xml-dev, the worm did. His only mistakes were (a) using Microsoft Windows, (b) opening a file in MS Word, and (c) not uninstalling Outlook from his computer the first time he booted up. If you had summarily unsubscribed him, then you would simply have added an unjust punishment to the embarrassment he was already suffering.

In fact, all three of the mistakes were probably mandated by company policy; if so the true blame belongs in three places, in diminishing order of culpability:

1.
The poster's company, for ignoring the importance of technical diversity and mandating the same operating system and software for everyone (it's much easier to write a worm or virus when everyone's using exactly the same software).

2. Redmond, for ignoring security whenever possible.

3. The creator of the worm.

If I'm right about corporate policy, then most of the blame goes to the company -- Redmond just wants to sell software, and the worm creator just wants attention, but the company failed to act in its own self-interest. Technical diversity is critical for good operation: I'd no more want to see an all-Linux shop than I'd want to see an all-Windows or an all-Mac shop.

From bckman at ix.netcom.com Mon Mar 29 18:29:27 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:10:46 2004
Subject: XHTML and character entities
Message-ID: <005c01be7a01$0141d480$1eacdccf@ix.netcom.com>

Actually the best thing would be to convert them all to numeric entities, and then the problem wouldn't arise

frank

----- Original Message -----
From: Gabe Beged-Dov
To: XML List
Sent: Sunday, March 28, 1999 10:20 PM
Subject: XHTML and character entities

>I mention tidy below but am asking about html->xhtml conversion in
>general.
>
>I use tidy to convert html to xhtml using the -asxml switch. The
>result of many conversions is still not accepted as well-formed because
>entities like agrave and friends aren't defined unless you process the
>DTD.
>
>Wouldn't it be reasonable to convert these to character entities as part
>of the html->xhtml process?
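The conversion being asked about (HTML named entities such as `&agrave;` rewritten as numeric character references, so a non-validating XML parser can read the result without the DTD) could look roughly like the sketch below. It is not what Tidy does; `to_char_refs` is a made-up helper name, and the regular expression is deliberately naive: it makes no attempt to skip comments, CDATA sections or script content.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself need no DTD, so leave them alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_char_refs(text):
    """Replace known HTML named entities with decimal character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # unknown or predefined: keep as written
        return f"&#{name2codepoint[name]};"
    return ENTITY_REF.sub(replace, text)


converted = to_char_refs("voil&agrave; &amp; caf&eacute; &bogus;")
```

Unknown names such as `&bogus;` are passed through untouched, so the output is only well-formed if the input used nothing beyond the HTML entity set; whether a converter should do this by default or behind a switch is the open question in the thread.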
> > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul.janssens at skynet.be Mon Mar 29 18:48:29 1999 From: paul.janssens at skynet.be (Paul Janssens) Date: Mon Jun 7 17:10:46 2004 Subject: XML query language References: <045d01be79fa$91f833e0$5402a8c0@oren.capella.co.il> Message-ID: <36FFAE41.2780@skynet.be> Oren Ben-Kiki wrote: > - An XML query language. Think about it - an XML query language should (i) > be XML; (ii) allow selecting arbitrary parts of the input XML document(s); > (iii) allow constructing result XML document(s). I think (iii) should not be a requirement of an XML query language. The result of a query could be a vector of tuples of pointers to the individual matches. Whatever needs to be done with that output can be done in a layer above that. Just because SQL mixes content with style doesn't mean an XML query language should. Paul Janssens - paul.janssens@skynet.be xml-dev: A list for W3C XML Developers. 
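To make the "vector of tuples of pointers" idea concrete, here is a small sketch using Python's standard ElementTree. It is not XQL or any proposed query language; the `query` helper, the sample document and the two path arguments are assumptions chosen only to show a query layer that returns references into the tree while a separate layer decides what output to build.

```python
import xml.etree.ElementTree as ET


def query(root, outer_path, inner_path):
    """Return a list of (outer, inner) tuples referencing elements in the tree."""
    return [(outer, inner)
            for outer in root.iterfind(outer_path)
            for inner in outer.iterfind(inner_path)]


doc = ET.fromstring(
    "<lib>"
    "<book><title>A</title><author>X</author><author>Y</author></book>"
    "<book><title>B</title><author>Z</author></book>"
    "</lib>")

# Query layer: nothing is constructed, only references are returned.
matches = query(doc, "book", "author")

# Layer above: decide how the matches should be presented.
report = [f"{book.findtext('title')}: {author.text}" for book, author in matches]
```

Because the tuples hold references rather than copies, a second consumer could just as well build a new XML document, a table, or nothing at all from the same `matches` list.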
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From nhs at llnl.gov Mon Mar 29 19:02:44 1999 From: nhs at llnl.gov (Norman H. Samuelson) Date: Mon Jun 7 17:10:47 2004 Subject: XML to Text questions Message-ID: <4.1.19990329084312.00aae3f0@popeye.llnl.gov> What tools are available for translation of XML into text? We are working on a GUI that will write the information needed as input to a physics simulation code in XML, and we need to translate that into the grammar required by the physics code. Our goal is a GUI that will work for many different physics codes. The translators are necessary because we do not want to change the physics simulation at this time to read XML directly. - Norm - Norman H. Samuelson nhs@llnl.gov Lawrence Livermore National Lab 925-422-0661 P.O. Box 808, L-98 Livermore, CA 94551 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From begeddov at jfinity.com Mon Mar 29 19:30:04 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:10:47 2004 Subject: XHTML and character entities References: <005c01be7a01$0141d480$1eacdccf@ix.netcom.com> Message-ID: <36FFB7A8.5D81864@jfinity.com> Frank Boumphrey wrote: > Actually the best thing would be to convert them all to numeric entities, > and then the problem wouldn't arise That is what I meant by character entity. 
I should have said character reference. Converting the general entity references to character references is what I was trying to ask about. I.e. is it reasonable for a html->xhtml converter to do this automagically or should it be an option, etc. A further question would be when Tidy would start doing it :-?

Gabe Beged-Dov www.jfinity.com

From James.Thompson at dresdnerkb.com Mon Mar 29 19:46:34 1999
From: James.Thompson at dresdnerkb.com (James.Thompson@dresdnerkb.com)
Date: Mon Jun 7 17:10:47 2004
Subject: XSL Transformation
Message-ID: <199903291742.SAA19675@harpo.dresdnerkb.com>

Hi, I have an XML doc that looks like this A 123 A 456 B 789 I would like to transform it into this kind of structure using XSL: 123 456 789 I don't know the categories in advance, and there are also sub cats that will nest within the categories. I could do it using scripts and some kind of fudge based on the SQL SELECT DISTINCT category idea. However, I think this is somewhat against the spirit of XSL. Any ideas on how this might be done? I can't be the first person to want to use this kind of idiom.

Many Thanks James Thompson

########################################## This email, its content and any files transmitted with it are intended solely for the addressee(s) and may be legally privileged and/or confidential. Access by any other party is unauthorised without the express written permission of the sender. If you have received this email in error you may not copy or use the contents, attachments or information in any way.
Please destroy it and contact the sender on the number printed above, via the Dresdner Kleinwort Benson switchboard on +44 171 623 8000 or via e-mail return. Internet communications are not secure unless protected using strong cryptography. This email has been prepared using information believed by the author to be reliable and accurate, but Dresdner Kleinwort Benson makes no warranty as to accuracy or completeness. In particular Dresdner Kleinwort Benson does not accept responsibility for changes made to this email after it was sent. Any opinions expressed in this document are those of the author and do not necessarily reflect the opinions of the Bank or its affiliates. They may be subject to change without notice. ########################################## xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From begeddov at jfinity.com Mon Mar 29 19:51:23 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:10:47 2004 Subject: half-baked parsers vs binary XML References: <5F052F2A01FBD11184F00008C7A4A800022A1731@EUKBANT101> Message-ID: <36FFBCB4.15CA1B0B@jfinity.com> Matthew Sergeant (EML) wrote: > If I could just call parsefile() without any extra work I think it would be fast > enough. As Nathan Kurz's posting to the perl-xml shows, there is a bottleneck in just the parsing of the XML without bringing callback firing, let alone query processing into the picture. > What I'm really doing, by using Storable is caching the parse+query phase. This is great if your use-case supports it. 
It is not a general purpose approach to providing scalable performance for soft real time systems that want to incorporate XML parsing.

> That should really be considered standard practice for any high
> performance system.

Once again, I would say that if your "high performance" system can be architected using a "cache the parse+query" approach and the complexity and storage overheads are acceptable, go for it. There are a lot of "high performance" systems that won't be amenable to this approach.

Gabe Beged-Dov www.jfinity.com

> > > Matt.

From begeddov at jfinity.com Mon Mar 29 20:02:32 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun 7 17:10:47 2004
Subject: half-baked parsers vs binary XML
References: <36FD4FA4.26BCB466@jfinity.com> <14077.33404.430088.361367@localhost.localdomain> <36FD95F9.7E93A231@jfinity.com> <14077.38830.801250.747754@localhost.localdomain> <36FDA19A.27DA7B45@jfinity.com> <14079.25158.877601.734891@localhost.localdomain>
Message-ID: <36FFBF61.5307A689@jfinity.com>

David Megginson wrote:
> In other words, it's not the XML *input* that you need to optimize,
> but the *output* -- for example, if you have a Perl script that
> renders XML in HTML, the best speed optimization is to cache the
> result and re-serve it for any request with the same parameters.

Assume that caching isn't an option. I.e. you have to make all your processing reasonably fast. It's not acceptable to make 80% of your processing really fast.
> The XML/SGML processing model is generally to walk through a document > (as a collection of events or as a tree) and fire off handlers for > different types of things. Even a short to medium-length XML document > can cause the handlers to be fired off many thousands of times, and if > you're trying to handle hundreds of requests per second, that's going > to cause problems with or without XML. Are we talking about throughput or responsiveness? It would be useful to bring up some use-cases where XML processing can't be employed using the default handler firing model and try to understand what the alternatives are. Matt Sergeant has brought up one that he might be able to flesh out involving large scale usage. I'm sure there are others. Gabe Beged-Dov www.jfinity.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 29 20:03:05 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:47 2004 Subject: Why Technical Diversity Matters (was OFF: (waaay off topic)) In-Reply-To: <8A24EC12044FD21195E200600895E0B3016363B4@goat.channelpoint.com> References: <8A24EC12044FD21195E200600895E0B3016363B4@goat.channelpoint.com> Message-ID: <14079.47860.97820.196660@localhost.localdomain> Tim McCune writes: > Damned eloquent David. But I'd put the poster at #1 on that list > for being ignorant enough to open a Word document that was attached > to an e-mail message. You cannot expect typical users to make an informed decision about software security risks (some can certainly do so, but it is not a reasonable expectation in general). 
> Your comment about technical diversity indicates to me that you've
> never been a system administrator. ;)

I've had budgetary responsibility for system administrators, and have hired and supervised them, so I do understand why it is so tempting to go for technical homogeneity rather than technical diversity. In the end, however, it's actually just bad business.

This is not a problem that is specific to computers: it's a general business cost/risk tradeoff. To get away from the anti-Windows hype, imagine that you run a mid-sized, regional air carrier with all your routes and passenger loads about the same: you will save an *enormous* amount of money in training, maintenance, staff, facilities, etc. if you buy all of your planes from the same manufacturer (and preferably, if you buy the same model).

Now, let's say that you bought a fleet of 15 A320's from Airbus, and they run beautifully for seven years. Suddenly, there's a major crash involving an A320 from another airline a month before Christmas, and the FAA grounds all planes of that model until their investigation is finished. The investigation finishes in mid-January and your A320's get a clean bill of health, but now you've not only missed the Christmas rush (which accounts for a large part of your annual revenue) and destroyed employee morale (by laying most of them off just before Christmas), but you've upset your customers, who had to switch to other airlines and wait at the back of the line.

MORAL
-----

When you decided to save money by buying all of your planes from the same manufacturer, you were actually doing the opposite of buying insurance: with insurance, you trade a fixed cost (your insurance premiums) for a non-fixed benefit (avoiding a large, unexpected liability); with technical homogeneity, you trade a non-fixed cost (the possibility of a complete operations shutdown of indeterminate length) for a fixed benefit (a known reduction in the cost of ownership).
It isn't hard to see how the same point applies to computing, no matter how good or competent a specific manufacturer is. In the end, some businesses may decide to take this risk, but they should at least do it in an informed way (i.e. realise that it's a risk) and protect themselves with some sort of derivatives or supplementary insurance. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Mon Mar 29 20:03:47 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:47 2004 Subject: XML to Text questions In-Reply-To: <4.1.19990329084312.00aae3f0@popeye.llnl.gov> Message-ID: <199903291803.NAA15557@hesketh.net> At 08:58 AM 3/29/99 -0800, Norman H. Samuelson wrote: >What tools are available for translation of XML into text? > >We are working on a GUI that will write the information needed as input to >a physics simulation code in XML, and we need to translate that into the >grammar required by the physics code. > >Our goal is a GUI that will work for many different physics codes. The >translators are necessary because we do not want to change the physics >simulation at this time to read XML directly. You'd have to do some 'roll-your-own' work right now, but it probably wouldn't be very difficult to write a SAX application (in Java) that does what you need, taking the events generated by parsing XML and converting that information into the text format you need. 
The MDSAX library has some display (really output generation) tools that might simplify managing that process, but you (or another lucky programmer) could probably write a fairly simple application as long as your XML and your final text output have similar structures. More on SAX - http://www.megginson.com/SAX/ More on MDSAX - http://www.jxml.com/mdsax Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 29 20:11:56 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:47 2004 Subject: XML to Text questions In-Reply-To: <4.1.19990329084312.00aae3f0@popeye.llnl.gov> References: <4.1.19990329084312.00aae3f0@popeye.llnl.gov> Message-ID: <14079.49440.827275.773899@localhost.localdomain> Norman H. Samuelson writes: > What tools are available for translation of XML into text? > We are working on a GUI that will write the information needed as input to > a physics simulation code in XML, and we need to translate that into the > grammar required by the physics code. You can write quick one-off 10-line scripts with Perl, Python, and probably many other scripting languages. A Java app takes a little more work, but the clean object-oriented structure allows you to do harder things without going completely insane, and there are an awful lot of good, higher-level XML libraries for Java (though the Python and Perl collections are growing fast). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
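As an illustration of the "quick one-off script" route, here is a minimal SAX handler in Python that turns XML events into a keyword-style text deck. The element names (`deck`, `block`, `param`), the `name` attribute and the output grammar are all invented for this sketch; a real physics code would dictate its own input grammar, and the XML written by the GUI would need a structure close to it.

```python
import io
import xml.sax


class KeywordWriter(xml.sax.ContentHandler):
    """Writes <block>/<param> elements as 'begin ... / name = value / end' text."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.param_name = None   # set while inside a <param> element
        self.chunks = []         # character data may arrive in several pieces

    def startElement(self, tag, attrs):
        if tag == "block":
            self.out.write(f"begin {attrs['name']}\n")
        elif tag == "param":
            self.param_name = attrs["name"]
            self.chunks = []

    def characters(self, data):
        if self.param_name is not None:
            self.chunks.append(data)

    def endElement(self, tag):
        if tag == "param":
            value = "".join(self.chunks).strip()
            self.out.write(f"  {self.param_name} = {value}\n")
            self.param_name = None
        elif tag == "block":
            self.out.write("end\n")


def xml_to_text(xml_source):
    """Parse an XML string and return the generated text deck."""
    out = io.StringIO()
    xml.sax.parseString(xml_source.encode("utf-8"), KeywordWriter(out))
    return out.getvalue()


SAMPLE = ('<deck><block name="mesh">'
          '<param name="nx">64</param><param name="dx">0.5</param>'
          '</block></deck>')
deck_text = xml_to_text(SAMPLE)
```

Supporting a second physics code would then mean writing another small handler class against the same XML, which is the main attraction of keeping the translators separate from the GUI.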
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jonathan at texcel.no Mon Mar 29 20:12:40 1999 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 17:10:47 2004 Subject: Whence XQL? In-Reply-To: <19990329104719.A13271@io.mds.rmit.edu.au> References: <3.0.3.32.19990325220915.032a1480@pop.mindspring.com> <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> <000b01be7702$102babd0$0100007f@eps.inso.com> <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> <19990326133124.B7318@io.mds.rmit.edu.au> <3.0.3.32.19990325220915.032a1480@pop.mindspring.com> Message-ID: <3.0.3.32.19990329131150.00d0b710@pop.mindspring.com> At 10:47 AM 3/29/99 +1000, Marcelo Cantos wrote: >On Thu, Mar 25, 1999 at 10:09:15PM -0500, Jonathan Robie wrote: >> Cool, you work on SIM? (Does that make you a SIMian?) > >Cute! It might just take off around here. :-) I haven't been able to come up with a similar nickname for people who work on XQL... >I do wonder what proportion of people looking seriously at XQL are >into text. We find WITHIN N to be exceedingly useful. It is also >interesting to note that we only offer proximity at the word level and >that this is all clients ever really want. We do also offer same >sentence/paragraph queries, but virtually no-one uses them. One full-text search engine vendor told me that their users did not use proximity searching. This surprised me, but it was what convinced me that I might be able to leave proximity out of even full-text extensions to XQL. 
Most of what I have done with XML until fairly recently was with structured documents rather than data, or with documents that also contain what has classically been considered data. I am now starting to do more with XML for data. I think that both Microsoft and Joe Lapp of webMethods have worked more with data than with documents. >It's an interesting angle, though not one I had considered (not that I >have considered many angles :-). I had understood, perhaps >incorrectly, that the only way to perform word-level boolean queries >was to treat words abstractly as leaf nodes of the document tree >rather than clumps of opaque string data. Under this conception, to >find "other name", one would say: > > LINE[WORD="other"; WORD="name"] > >It could possibly be made legal to abbreviate the above to: > > LINE["other"; "name"] XQL as-is does not allow this, but I have discussed this as a possible extension in the section on "Integrating structured and full-text queries", in http://www.w3.org/TandS/QL/QL98/pp/murata-san.html, a paper written together with Makoto Murata-san. It makes the above syntax legal. The other approach, which you have used above, is to pretend that there is markup identifying the individual words - that's a perfectly valid approach too. >Which would be interpreted as, "a Line element which is the parent of >a leaf node equal to "other" immediately preceding a leaf node equal >to "name". Now, support for proximity ("rose*" within 10 words of >"sweet") would simply be a matter of: > > LINE["rose*" %10 "sweet"] > >(The %N syntax is borrowed from our query language.) Higher level >proximities could be done like this: > > LINE["name"] %10 LINE["purple"] > >The operator simply adopts the level of its operands mismatched >operands constitute an error. I would have to think about how to fit that into the XQL grammar. Does it have advantages over the function-based approach I suggested earlier? 
near("name", "purple", 10) This fits into the XQL grammar without modification, it's just a matter of introducing another function. Jonathan jonathan@texcel.no Texcel Research http://www.texcel.no xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From macherius at darmstadt.gmd.de Mon Mar 29 20:18:12 1999 From: macherius at darmstadt.gmd.de (Ingo Macherius) Date: Mon Jun 7 17:10:47 2004 Subject: ANNOUNCE: XQL processor in Java Message-ID: <199903291816.UAA24037@sonne.darmstadt.gmd.de> GMD-IPSI is pleased to announce Java based implementations of the XQL language and a persistent W3C-DOM. The GMD-IPSI XQL engine [1] is a Java based storage and query application for large XML documents. The functionality may be accessed via command line invocation or the Java API. The engine consists of two main parts: 1. A persistent implementation of the W3C-DOM 2. A full implementation of the XQL language The XQL engine implements the W3C-QL '98 workshop paper syntax of XQL. It uses a novel indexing algorithm for XML (publication pending), which indexes the document while processing the first query. Subsequent queries to the same document are considerably accelerated. The persistent DOM implements the W3C-DOM interfaces on indexed, binary XML files. Documents are parsed once and are stored in this form, accessible to DOM calls without the overhead of parsing them first. A cache architecture additionally increases performance. At this time only read access is possible, support of the full W3C-DOM API is work in progress. 
The GMD-IPSI XQL engine was developed as a research project in GMD's XML competence center by Gerald Huck [2], with contributions by Ingo Macherius [3]. It is free for non-commercial use and evaluation, see the download page for details. For commercial requests contact the main author. [1] http://xml.darmstadt.gmd.de/xql/ [2] mailto:huck@gmd.de [3] mailto:macherius@gmd.de -- Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882 GMD-IPSI German National Research Center for Information Technology mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/ Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 29 20:35:09 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:47 2004 Subject: half-baked parsers vs binary XML In-Reply-To: <36FFBF61.5307A689@jfinity.com> References: <36FD4FA4.26BCB466@jfinity.com> <14077.33404.430088.361367@localhost.localdomain> <36FD95F9.7E93A231@jfinity.com> <14077.38830.801250.747754@localhost.localdomain> <36FDA19A.27DA7B45@jfinity.com> <14079.25158.877601.734891@localhost.localdomain> <36FFBF61.5307A689@jfinity.com> Message-ID: <14079.51018.128742.97025@localhost.localdomain> Gabe Beged-Dov writes: > > The XML/SGML processing model is generally to walk through a document > > (as a collection of events or as a tree) and fire off handlers for > > different types of things. 
Even a short to medium-length XML document > > can cause the handlers to be fired off many thousands of times, and if > > you're trying to handle hundreds of requests per second, that's going > > to cause problems with or without XML. > > Are we talking about throughput or responsiveness? It would be > useful to bring up some use-cases where XML processing can't be > employed using the default handler firing model and try to > understand what the alternatives are. I'm talking about throughput -- using a persistent interpreter (like mod_perl) rather than a CGI can solve most of the responsiveness problems. The difficulty is just that firing off so much Perl code is (in Perl's current design) slow. The original posting suggested using a binary format because parsing XML with Expat is slow, but in fact, Expat and the actual XML parsing turn out not to be a bottleneck. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From fmclain at cdgpd.com Mon Mar 29 20:38:19 1999 From: fmclain at cdgpd.com (Fred McLain) Date: Mon Jun 7 17:10:47 2004 Subject: (waaay off topic) RE: LISTADMIN: No attachments to list mess ages PLEASE Message-ID: <5FFEC1B73A7BD1119D56006008C369F30ED3DE@rainier.cdgpd.com> Tim, I can understand the desire to point fingers over this. I'm fairly well versed in security matters and I don't see why I would have been alerted to this virus. The mail I read came from a trusted source - our product manager, and was from internal e-mail, not the internet. 
Under those circumstances it seemed appropriate for me to open the e-mail attachment. As I'm sure you are aware, once the macro was running I had no control over it resending itself to this list. Personally I feel the fault was with MS Word and MS Outlook. If these programs did not allow a macro program over e-mail to control both Outlook and Word then this could not have happened. Furthermore the only alert you get when a potentially dangerous macro is being run by word is the macro warning message, outlook doesn't even bother to warn about embedded macros in word documents. The macro warning message is one I see every time I create a new document and a great many times when I read one. If you cry wolf often enough, you get ignored. -Fred- ------------------------------------- Fred McLain, Senior Technical Advisor Continental DataGraphics, Bellevue WA ------------------------------------- -----Original Message----- From: Tim McCune [mailto:timm@channelpoint.com] Sent: Monday, March 29, 1999 8:20 AM To: 'David Megginson'; 'XML Developers' List' Subject: OFF: (waaay off topic) RE: LISTADMIN: No attachments to list mess ages PLEASE Damned eloquent David. But I'd put the poster at #1 on that list for being ignorant enough to open a Word document that was attached to an e-mail message. Your comment about technical diversity indicates to me that you've never been a system administrator. ;) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Mon Mar 29 20:53:49 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:10:47 2004 Subject: Is there anyone working on a binary version of XML? Message-ID: <87256743.0062192B.00@d53mta03h.boulder.ibm.com> >Imagine that you have all the features of XML: structure, flexibility, common format for >interchange, but that you perform zero processing steps to import or export the 'document' >from a program. (Actually, I'm thinking this would be done in chunks, but essentially very >few reads and writes.) > Actually, to be fair, there would be a somewhat non-trivial amount of bit fiddlin' to get it out of whatever canonical binary format you put it in, into the local byte order, floating point representation, byte boundary alignment, etc... Though hopefully that couldn't be any worse than parsing :-) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From daniela at cnet.com Mon Mar 29 21:10:31 1999 From: daniela at cnet.com (Daniel Austin) Date: Mon Jun 7 17:10:47 2004 Subject: xhtml and the p tag Message-ID: <77A952A6B467D211855D00805F9521F114938A@cnet10.cnet.com> Mark, This has not changed from HTML 4.0. All of your paragraphs in XHTML documents should be enclosed with

<p> ... </p> element delimiters. Since the <p> element is itself a block level element, it cannot itself contain any block level elements. The construction <p/>
has to my knowledge never been acceptable markup in either HTML or XHTML documents. Regards, D- (speaking for myself, rather than any working group or corporation) > -----Original Message----- > From: Mark D. Anderson [mailto:mda@discerning.com] > Sent: Sunday, March 28, 1999 3:50 PM > To: XML List > Cc: dsr@w3.org > Subject: xhtml and the p tag > > > (not sure where xhtml discussion should go; all i see on the > www-html@w3.org > list are stultifying discussions about tag case-sensitivity.) > > in the strict dtd from http://www.w3.org/TR/WD-html-in-xml/ , > the p element is %Inline, which means it can't include any > block level elements such as ul. So now we have a quandary. > > in practical terms, what i usually want is something > like a non-existent . That doesn't exist, because > in browsers a
<br> breaks the line; it doesn't end the > paragraph, and particularly now with xhtml,

is > deprecated. But even if I *did* the extra work to wrap >

<p>...</p>
around my paragraphs, that still wouldn't > work, because a p can't enclose any block level elements > such as a ul. > > -mda > > > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and > on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the > following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jonathan at texcel.no Mon Mar 29 21:11:55 1999 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 17:10:47 2004 Subject: ANNOUNCE: XQL processor in Java In-Reply-To: <199903291816.UAA24037@sonne.darmstadt.gmd.de> Message-ID: <3.0.3.32.19990329141022.00cec100@pop.mindspring.com> At 08:21 PM 3/29/99 +0200, Ingo Macherius wrote: >GMD-IPSI is pleased to announce Java based implementations >of the XQL language and a persistent W3C-DOM. Cool! I'm delighted. Jonathan -- Jonathan Robie R&D Fellow, Software AG jonathan.robie@sagus.com <- this address will be active Monday xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From daniela at cnet.com Mon Mar 29 21:22:20 1999 From: daniela at cnet.com (Daniel Austin) Date: Mon Jun 7 17:10:48 2004 Subject: Namespace Question Message-ID: <77A952A6B467D211855D00805F9521F114938B@cnet10.cnet.com> Hi, Your question and example are seemingly incomplete; what kind of document is this with which you are attempting to use XML Namespaces? If it is intended to be an XHTML document, it needs an XML PI like so: . I'm asking not to be a smartass but because it is hard to answer your question otherwise. If your document is intended to be HTML 4.0 as one of your xmlns attribute values suggests, then using XML Namespaces is totally inappropriate; HTML 4.0 documents are not XML documents, and XML Namespaces cannot be used in any way. If the example is an xhtml 1.0 document (I'm assuming that it is) then the answer to your question is this: an element with an appropriate Namespaces prefix does not need to have the prefix attached to each of its attributes, because the scope for that element is defined by the element name prefix. If you use an attribute from a Namespace that differs from the Namespace of the element on which the attribute appears, then you must prefix it properly. Hope this helps, Regards, D- > -----Original Message----- > From: hb@ix.heise.de [mailto:hb@ix.heise.de] > Sent: Monday, March 29, 1999 3:24 AM > To: xml-dev@ic.ac.uk; hb@ix.heise.de > Subject: Namespace Question > > > Hi, > > For a short example regarding namespaces I have used a > variant of Tim's > example in his XML.com article. 
> > Is it necessary (as I presume) to assign every single > attribute as long > as it is not from HTML? > > xmlns:b="http://www.my.server.de/book" > xmlns:p="http://www.my.server.de/person"> > My Booklist > > > > >
> > class="important">Dream a little dream of > me > Dr. > Sigmund > Freud
> > > > Best regards, > > Henning Behme > > iX - Magazin fuer professionelle Informationstechnik > Helstorfer Str. 7 * 30625 Hannover * Germany > http://www.heise.de/ix/ * +49 511 5352-374 * f: -361 > ------ White, adj. and n. Black (Ambrose Bierce) ------ > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Mon Mar 29 22:31:28 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:10:48 2004 Subject: A Simple Thought Message-ID: <87256743.006AF60F.00@d53mta03h.boulder.ibm.com> >Ahh, there's the trick. I believe I have most of a design for an data structure >that is fast in memory yet is 'flat' and can have its chunks just written out or >read in at any point. It builds on some very old ideas I came up with for a >language I designed. When viewed as an interchange format, it may not be the >most optimal space wise (although it should be better than XML text) but trades >a small amount of space for nearly zero processing overhead. There will >probably also be a procedure for 'compacting' an object for storage into a >database or sending over a slow link vs. the 'fast' format usable between >servers in a cluster. 
> I think though that this would only hold up as long as you are looking at XML data as a read-only data source. Once you started doing significant editing of the data, having a flat structure like that would be more of a hinderance than a help, would it not? What if I have a 10MB flat buffer and want to add another child to the second element? This kind of gets into the quandry that you've nailed in one nail, but now its even harder to nail a whole raft of others as well as with the more general purpose mechanisms. I dunno, if I were thinking along these lines, to keep it reasonably portable, I'd look at the binary format as a fast serialization mechanism and at least create native language objects for each one. By the time you put enough stream format markers and whatnot into the stream to know where things are, and interpret those during runtime, it might be just as fast to pay the cost for creating a much more flexible, native object format for in memory manipulation. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomh at thinlink.com Mon Mar 29 23:17:54 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:10:48 2004 Subject: Extensible Protocol implementation in Java Message-ID: <36FFEDC5.64D49FB4@thinlink.com> I have written a free Java implementation of Extensible Protocol, a pure-XML protocol for sending and receiving XML documents on a persistent connection. Interested folks can find out more at http://www.thinlink.com/xp Comments are most welcome. Tom Harding xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Mon Mar 29 23:50:44 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:10:48 2004 Subject: xhtml and the p tag References: <77A952A6B467D211855D00805F9521F114938A@cnet10.cnet.com> Message-ID: <36FFF4B5.D850C401@w3.org> Daniel Austin wrote: > > Mark, > > This has not changed from HTML 4.0. Or 3.2 or 2.0 > All of your paragraphs in XHTML > documents should be enclosed with

<p> ... </p>
> element delimiters.

Yes. Explicitly. In theory, they were enclosed in them implicitly with HTML <=4.0 through the magic of SGML omissible end tags. In practice, though, browsers did not correctly infer missing end tags (or indeed omitted start tags), thus leading to the well known disparities in HTML "parsing" which became abundantly obvious with the rise in use of CSS and DOM (both of which require a parse tree, preferably the correct parse tree).

> Since the <p> element is itself a block level element,
> it cannot itself contain any block level elements.

Like body, ul, ol, dl and div? These are block level elements and can contain other block level elements.

> The construction <p/> has to my knowledge never been acceptable markup in
> either HTML or XHTML documents.

True. In an XML document instance which used HTML element names, <p/>
would be quite fine for an empty paragraph, in a well formed document. XHTML warns agains using it only if you are trying to fool existing HTML browsers into accepting your XML as if it were HTML. -- Chris xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Tue Mar 30 00:54:53 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:10:48 2004 Subject: SAX2: Proposed alternative DTD interface Message-ID: <87256743.0078341A.00@d53mta03h.boulder.ibm.com> >Here's another alternative for SAX2: forget about trying to report DTD >declarations as events, and simply make the whole DTD available >through an interface with a Parser2.get() call. > >I threw together a quick (read-only) DTD interface this morning, and >uploaded it to the following location > But, what would you use for the form of the DTD? Its almost certainly not going to be stored in that way internally in the parser's pools, i.e. it would most likely be much more optimized (or even just different for whatever reasons.) So you would either have to totally translate all of that into some instance of your DTD class, or you would have to make the DTD object just a call through to get the data from the parser. However, the latter scheme has problems if you want to reuse the parser instance because now you've tied an instance of the DTD access object to an instance of the parser and you cannot reuse the parser without frying the DTD access object (and you have have no idea how long people might want to hang onto that info.) The same issue kind of happens with any DOM DTD access that might happen down the road. 
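The trade-off described above (copying the declarations out of the parser versus handing out a call-through object tied to one parser instance) can be sketched as follows. This is a minimal illustration under assumed names: ToyParser, DtdView, snapshotDtd and liveDtd are hypothetical and are not part of SAX or of any proposed SAX2 interface. The generation counter is just one possible way to make a stale view fail loudly; it is an assumption of this sketch, not something proposed in the thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a parser's internal declaration pool.
final class ToyParser {
    private final Map<String, String> elementDecls = new HashMap<String, String>();
    private int generation = 0;

    void declareElement(String name, String contentModel) {
        elementDecls.put(name, contentModel);
    }

    // Reusing the parser discards the previous document's declarations.
    void reset() {
        elementDecls.clear();
        generation++;
    }

    int generation() { return generation; }

    String lookup(String name) { return elementDecls.get(name); }

    // Option 1: translate everything into an independent, read-only copy.
    Map<String, String> snapshotDtd() {
        return Collections.unmodifiableMap(new HashMap<String, String>(elementDecls));
    }

    // Option 2: hand out a cheap call-through view tied to this parser instance.
    DtdView liveDtd() {
        return new DtdView(this, generation);
    }
}

// A view that reads through to the parser and detects parser reuse.
final class DtdView {
    private final ToyParser parser;
    private final int generation;

    DtdView(ToyParser parser, int generation) {
        this.parser = parser;
        this.generation = generation;
    }

    String contentModel(String name) {
        // Fail loudly rather than return data belonging to a later document.
        if (parser.generation() != generation) {
            throw new IllegalStateException("parser was reused; DTD view is stale");
        }
        return parser.lookup(name);
    }
}

final class DtdAccessDemo {
    public static void main(String[] args) {
        ToyParser parser = new ToyParser();
        parser.declareElement("book", "(title, author+)");
        Map<String, String> copy = parser.snapshotDtd();
        DtdView view = parser.liveDtd();

        parser.reset();                          // parser instance is reused
        System.out.println(copy.get("book"));    // the copy survives reuse
        try {
            view.contentModel("book");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());  // the view does not
        }
    }
}
```

The copy costs a full translation up front but can be held for as long as the caller likes; the view is nearly free to create but is only valid until the parser is reused.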
If the DOM stores the element/entity/etc... stuff in its own form its going to be redundant since that data is already in the parser. However, the DOM implementation doesn't want to be tied to any particular parser implementation really so you kind of have to store it redundantly to avoid other issues. If you are going to store in some other format, and that is done at a SAX like level, then you still need event APIs to come out of the parser to fill in the SAX DTD object that you are going to give back, right? Hopefully this is a coherent response. I got multiply deeply nested interrupts while trying to write it. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Tue Mar 30 01:13:54 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:10:48 2004 Subject: SAX2: DTDDeclHandler (minimalist position) Message-ID: <87256743.0079FE91.00@d53mta03h.boulder.ibm.com> >> This creates four menmonic constants you want and gives them a checkable >> type. New constants can't be created because of the private constructor. >> And there's no chance that anybody's going to write code like >> >> if (getAttributeStatus() == 1) { >> doSomething(); >> } >> >> Programmers are more or less forced to use the constants. What do you >> think? > >I personally take a very dim view of systems trying to "force" programmers >into intrinsically good practices. Programmers can abuse any system you >present, and at some point you have to accept that they are adults, and must >be free to cut off their own noses if they wish. 
> >The good programming practice of replacing "magic numbers" with descriptive >constants is even older than the structured programming movement, and any >programmer who writes > But that's not really the point I don't think. The point isn't "if you are as macho a programmer as me you don't need any help". The point is that we work in a commercial environment and every single semantic that can be expressed in the code itself, so that the compiler can tell you when break them, is a Very Goode Thinge. It does no good at all to have a named constant if you can accidentally pass that named constant to 150 other things for which its not intended and the compiler cannot catch it. Its a fundamental lacking in Java that makes me shudder to think that people actually want to do serious work in it. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Tue Mar 30 01:40:50 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:48 2004 Subject: Is there anyone working on a binary version of XML? References: <87256743.0062192B.00@d53mta03h.boulder.ibm.com> Message-ID: <3700177B.AC5BFE07@lig.net> roddey@us.ibm.com wrote: > >Imagine that you have all the features of XML: structure, flexibility, > common format for > >interchange, but that you perform zero processing steps to import or > export the 'document' > >from a program. (Actually, I'm thinking this would be done in chunks, but > essentially very > >few reads and writes.) 
> > > > Actually, to be fair, there would be a somewhat non-trivial amount of bit > fiddlin' to get it out of whatever canonical binary format you put it in, > into the local byte order, floating point representation, byte boundary > alignment, etc... Though hopefully that couldn't be any worse than parsing > :-) Not true, especially for Java.... If you read all my 'binary' related comments, I'm not talking about storing binary data (such as IEEE doubles), but rather normal XML style text elements, attributes, and body in a 'binary' structure that gives container-like access and speed. There might be some reason to allow real binary data, but that's not really my priority. You can flash convert real binary to hex for instance very easily. The byte order, etc. will be Java standard. Shouldn't be too tough for C/C++, etc. sdw > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) -- OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st 43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Tue Mar 30 01:55:46 1999 From: sdw at lig.net (Stephen D. 
Williams) Date: Mon Jun 7 17:10:48 2004 Subject: A Simple Thought References: <87256743.006AF60F.00@d53mta03h.boulder.ibm.com> Message-ID: <37001AF4.5C9428C9@lig.net> roddey@us.ibm.com wrote: > >Ahh, there's the trick. I believe I have most of a design for an data > structure > >that is fast in memory yet is 'flat' and can have its chunks just written > out or > >read in at any point. It builds on some very old ideas I came up with for > a > >language I designed. When viewed as an interchange format, it may not be > the > >most optimal space wise (although it should be better than XML text) but > trades > >a small amount of space for nearly zero processing overhead. There will > >probably also be a procedure for 'compacting' an object for storage into a > >database or sending over a slow link vs. the 'fast' format usable between > >servers in a cluster. > > > > I think though that this would only hold up as long as you are looking at > XML data as a read-only data source. Once you started doing significant > editing of the data, having a flat structure like that would be more of a > hinderance than a help, would it not? What if I have a 10MB flat buffer and > want to add another child to the second element? This kind of gets into the > quandry that you've nailed in one nail, but now its even harder to nail a > whole raft of others as well as with the more general purpose mechanisms. > > I dunno, if I were thinking along these lines, to keep it reasonably > portable, I'd look at the binary format as a fast serialization mechanism > and at least create native language objects for each one. By the time you > put enough stream format markers and whatnot into the stream to know where > things are, and interpret those during runtime, it might be just as fast to > pay the cost for creating a much more flexible, native object format for in > memory manipulation. I have a way around the modification issue. It's a data structure I call 'elastic memory'. 
It's really the main reason that I'm going to have to start mostly from scratch. I AM trying to hit a number of nails at once and it won't be easy and I'm not sure I can make it perfect, however I believe I can get close. I'm only worrying about Java at the moment with some allowances for certain restrictions that come into play and typical usage in network protocols. There are a number of situations where serialization just doesn't cut it. As I mention, imagine serializing/deserializing on every method call in a program. I designed some of these mechanisms MANY years ago (about 8 I think) while designing a language after I'd already built a language based on Postscript syntax for a project. In testing the first language, I learned the horrors of 'malloc storms' that happen when you follow a typical design paradigm. My system which allowed a complex application to be represented by meta data would do about 25000 mallocs in a standard run through the app. A Java web server app for a very complex app I just completed with a team does about 150,000 object creations (measured by forcing a garbage collection) in one run through the app. It works amazingly well, but still blows most of it's processing for things that could be avoided. The cool thing is that I found a way to implement it in Java. Thanks for sparring with me! ;-) sdw > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) -- OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS sdw@lig.net Stephen D. 
Williams Senior Consultant/Architect http://sdw.st 43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999

From alison at research.canon.com.au Tue Mar 30 04:25:04 1999
From: alison at research.canon.com.au (Alison Lennon)
Date: Mon Jun 7 17:10:48 2004
Subject: Ampersand connector in XML
Message-ID: <370035F2.D604B132@research.canon.com.au>

Could someone please explain to me why the ampersand group connector of SGML was not included in XML.

It seems to me that the absence of this connector results in significant problems for many applications based on XML that want to use unordered lists of elements.

Cheers, Alison

--
Alison Lennon, Senior Research Engineer
Canon Information Systems Research Australia Pty Ltd (CISRA), 1 Thomas Holt Drive,North Ryde,Sydney, NSW 2113. Ph +61-2-9805-2931, Fax +61-2-9805-2929

From andrewl at microsoft.com Tue Mar 30 04:26:46 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:10:48 2004
Subject: XML to Text questions
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF212@RED-MSG-08>

Q: "What tools are available for translation of XML into text?"

A: Take a look at XSL.
Information on this and other XML-related activities can be found at http://www.w3.org/XML/Activity.html.

From tbray at textuality.com Tue Mar 30 04:32:48 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:10:48 2004
Subject: Ampersand connector in XML
Message-ID: <3.0.32.19990329183236.00c2bb80@pop.intergate.bc.ca>

At 12:24 PM 3/30/99 +1000, Alison Lennon wrote:
>Could someone please explain to me why the ampersand group connector
>of SGML was not included in XML.
>
>It seems to me that the absence of this connector results in
>significant problems for many applications based on XML that want to
>use unordered lists of elements.

Simply because it's a lot harder to implement than all the other content model apparatus. In fact, back in SGML days, it was well-known to be buggy in several rather good and successful SGML products.

Yes, its absence does represent a loss in expressive power. At the time, it seemed like a good trade-off. To me it still does, although it has particularly irked the (large and growing number of) people who want to use XML to model relational semantics.

-Tim

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Mar 30 04:42:21 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:48 2004 Subject: xhtml and the p tag Message-ID: <001601be7a4e$df448500$1bf96d8c@NT.JELLIFFE.COM.AU> From: Mark D. Anderson >>In HTML 4.0 a paragraph can only contain an inline element, just the same as >>in XHTML > >Right; html is broken too. But I've already given up on html :). > >I'm still curious what a better content model for paragraphs >would be. (Sorry, i suppose this topic belongs somewhere else >even if xhtml-related; suggestions are welcome.) The distinction between text-blocks and rhetorical paragraphs is the oldest problem in markup. It is sad that HTML calls visual text blocks paragraphs, but should not be surprising or particularly troubling (it will just make it impossible to detect rhetorical paragraphs programatically from HTML documents.) The best solution is to wrap the real paragraphs in a div element:

...

    ...

...

where para cont means "paragraph continuation". Now that we have XSL, it is probably a good thing if HTML errs on the side of being display-structure oriented. That is why the ruby draft (which is all wrong if we want HTML to support logical markup) is probably appropriate for HTML now. Actually, there is even a higher level of paragraph, the "paragraph group", which is found in some kinds of documents (military and technical), which is where a paragraph grows a number, a heading (often inline), footnotes and even metadata. (The reasons for this, you can find in my book, The XML and SGML Cookbook: basically the idea is that when an information block is self-contained or extractable, it naturally becomes a microdocument, getting all the accoutrements of a document--a head and body, title, etc.) Rick Jelliffe From david at megginson.com Tue Mar 30 04:45:51 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:48 2004 Subject: SAX2: DTDDeclHandler (minimalist position) In-Reply-To: <87256743.0079FE91.00@d53mta03h.boulder.ibm.com> References: <87256743.0079FE91.00@d53mta03h.boulder.ibm.com> Message-ID: <14080.14974.747531.703294@localhost.localdomain> roddey@us.ibm.com writes: [on using type-safe objects rather than integers as Java constants] > It does no good at all to have a named constant if you can > accidentally pass that named constant to 150 other things for which > it's not intended and the compiler cannot catch it. It's a > fundamental lacking in Java that makes me shudder to think that > people actually want to do serious work in it.
Yes, but it's also no good having a named constant that you cannot use in a switch statement. Unfortunately, Java is broken here, and you have to choose one side or another All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Mar 30 04:45:54 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:48 2004 Subject: XHTML and character entities Message-ID: <001901be7a4f$50461d90$1bf96d8c@NT.JELLIFFE.COM.AU> From: Gabe Beged-Dov >I mention tidy below but am asking about html->xhtml conversion in >general. > >I use tidy to to convert html to xhtml using the -asxml switch. The >result of many conversions is still not accepted as well-formed because >entities like agrave and friends aren't defined unless you process the >DTD. > >Wouldn't it be reasonable to convert these to character entities as part >of the html->xhtml process? With tidy, you have to be a little creative with the switches. For example, to process Big5 text, we have to use "-latin1". Certainly it is the expectation of some people that the entities for special characters will disappear with XML, that people will use NCRs. I am not sure about it. Rick Jelliffe xml-dev: A list for W3C XML Developers. 
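The conversion Gabe asks about (named entities to numeric character references) can be sketched in a few lines. This is only an illustration, not what tidy does: the function name `named_entities_to_ncrs` is made up here, the entity table is the one in Python's standard `html.entities` module, and the sketch makes no attempt to skip CDATA sections or comments.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser knows without reading a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_entities_to_ncrs(text):
    """Replace HTML named entities with decimal numeric character references.

    Hypothetical helper: predefined XML entities and unknown names are
    left untouched so the result stays as close to the input as possible.
    """
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return _ENTITY_RE.sub(replace, text)
```

After this pass a document that used `&agrave;` and friends no longer depends on the DTD for those characters, which is the well-formedness problem described above.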
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From alison at research.canon.com.au Tue Mar 30 04:47:05 1999 From: alison at research.canon.com.au (Alison Lennon) Date: Mon Jun 7 17:10:48 2004 Subject: Ampersand connector in XML References: <3.0.32.19990329183236.00c2bb80@pop.intergate.bc.ca> Message-ID: <37003AE2.7B13728D@research.canon.com.au> Tim Bray wrote: > > At 12:24 PM 3/30/99 +1000, Alison Lennon wrote: > >Could someone please explain to me why the ampersand group connector > >of SGML was not included in XML. > > > >It seems to me that the absence of this connector results in > >significant problems for many applications based on XML that want to > >use unordered lists of elements. > > Simply because it's a lot harder to implement than all the other > content model apparatus. In fact, back in SGML days, it was well-known > to be buggy in several rather good and successful SGML products. Yes, > its absence does represent a loss in expressive power. At the time, > it seemed like a good trade-off. To me it still does, although it > has particularly irked the (large and growing number of) people who > want to use XML to model relational semantics. -Tim Is it likely to be included in later versions of XML? In other words, what are the options for applications which need to use unordered lists - SGML? Alison -- Alison Lennon, Senior Research Engineer Canon Information Systems Research Australia Pty Ltd (CISRA), 1 Thomas Holt Drive,North Ryde,Sydney, NSW 2113. Ph +61-2-9805-2931, Fax +61-2-9805-2929 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Tue Mar 30 04:54:43 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:10:48 2004 Subject: Ampersand connector in XML Message-ID: <3.0.32.19990329185409.00c2f630@pop.intergate.bc.ca> At 12:45 PM 3/30/99 +1000, Alison Lennon wrote: >Is it likely to be included in later versions of XML? Not impossible. There are some people on the schema group who'd like to bring it back. But by no means a sure thing. >In other words, >what are the options for applications which need to use unordered >lists - SGML? Yep. Or write your own code to validate the unordered-list elements; since any nontrivial business application is going to need some extra validation logic past what the DTD can do anyhow, this is probably not too burdensome. Another approach is to generate your documents in such a way that you sort the unordered-list elements by any old criterion at all, so that they become ordered-list elements; then use a simpler content model. -Tim xml-dev: A list for W3C XML Developers. 
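Both of Tim's suggestions are small amounts of code. The sketch below is one possible reading of them, using Python's standard `xml.etree.ElementTree`; the function names and the `address` example are invented for illustration. `check_unordered` plays the role of the SGML `&` connector (each named child exactly once, in any order), and `canonicalise` sorts children into a fixed order so that a plain sequence content model can be used instead.

```python
import xml.etree.ElementTree as ET


def check_unordered(parent, required):
    """Return a list of problems; an empty list means every required
    child occurs exactly once, in any order, and nothing else occurs."""
    counts = {}
    for child in parent:
        counts[child.tag] = counts.get(child.tag, 0) + 1
    problems = []
    for name in required:
        found = counts.pop(name, 0)
        if found != 1:
            problems.append("%s: expected 1, found %d" % (name, found))
    for name in sorted(counts):
        problems.append("%s: unexpected element" % name)
    return problems


def canonicalise(parent, order):
    """Sort the children of parent in place into a fixed order, so that
    a simple sequence model such as (street, city, zip) applies."""
    rank = {name: i for i, name in enumerate(order)}
    parent[:] = sorted(parent, key=lambda child: rank.get(child.tag, len(order)))
```

The first function belongs in the application's extra validation step; the second belongs in whatever generates the documents.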
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Mar 30 05:01:57 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:49 2004 Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG)) Message-ID: <003201be7a51$8f8c0350$1bf96d8c@NT.JELLIFFE.COM.AU> From: David Megginson >What I did say is that there's not a practical difference among the >different alternatives in XML and SGML for expressing this >information, and probably not enough to justify the parallel >maintenance of the two as discrete standards. I don't agree because 1) XML is not a standard, because W3C is not an open process but a friendly conspiracy of vendors and boffins who must kowtow to Microsoft and TBL (not to say that these are not excellent activities). 2) XML and SGML have fundamentally different application areas driving them: * SGML is a compiler compiler where the central technical question is "people want markup in lots of different formats; how can we make a parser to detect the structure in as many of their formats as possible?" If you have shortrefs you must have maps and you must have entities and you must have minimization: they are justifiable because SGML is a parser technology, not an information-modeling technology. * XML just expands the butt of SGML: the fact that there are tree/graph structures in marked-up data.
Now, I admit that butt-expansion is a natural function of time: SGML's default delimiters (as used in HTML and SGML at many companies) are now familiar enough that there is also a question "people want markup in SGML-delimiter format: how can we make a (simple) parser that detects the structure in just that?" Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Tue Mar 30 05:27:17 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:10:49 2004 Subject: Is there anyone working on a binary version of XML? References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net> <36FAC72C.2B911970@prescod.net> <005501be771c$c8afb2e0$a6ab20c0@engeast.baynetworks.com> <36FAEE03.E435EB6D@lig.net> Message-ID: <36FB56EF.7E3E87CE@manhattanproject.com> "Stephen D. Williams" wrote: > Imagine that you have all the features of XML: structure, flexibility, common format for > interchange, but that you perform zero processing steps to import or export the 'document' > from a program. (Actually, I'm thinking this would be done in chunks, but essentially very > few reads and writes.) I had an idea to accomplish something similar to this using notations. First use a fixed width encoding, and then provide an index to the information contained within the XML document in a notation. This way you get many of the advantages above, but your information is still XML, so that it can be read by a parser who may not understand the indexing notation. 
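As a rough illustration of the idea in the paragraph above (a single-byte encoding plus a position index carried inside the document), here is a minimal sketch. It is not Clark's notation: the `catalog` and `index` element names, the `key|offset|length` entry layout and the 8-digit field width are all assumptions made for the example, and keys are assumed to be plain ASCII without `|`, whitespace or markup characters.

```python
import xml.etree.ElementTree as ET

POSITION_WIDTH = 8  # digits per offset/length field (assumed value)


def build_indexed_document(fragments):
    """fragments: list of (key, xml_string) pairs.

    Returns ASCII bytes: the fragments, then an <index> element whose
    text holds one fixed-width 'key|offset|length' entry per fragment.
    The index comes last so that fragment offsets do not depend on it.
    """
    body = b"<catalog>\n"
    entries = []
    for key, xml in fragments:
        data = xml.encode("ascii")  # one byte per character keeps offsets stable
        entries.append("%s|%0*d|%0*d" % (key, POSITION_WIDTH, len(body),
                                         POSITION_WIDTH, len(data)))
        body += data + b"\n"
    index = "<index>\n" + "\n".join(entries) + "\n</index>\n"
    return body + index.encode("ascii") + b"</catalog>\n"


def read_index(document):
    """Parse only the trailing <index> element into {key: (offset, length)}."""
    start = document.rindex(b"<index>")
    end = document.rindex(b"</index>") + len(b"</index>")
    element = ET.fromstring(document[start:end])
    index = {}
    for line in element.text.strip().splitlines():
        key, offset, length = line.split("|")
        index[key] = (int(offset), int(length))
    return index


def fetch(document, index, key):
    """Parse just the fragment for key; on disk this would be a seek and a read."""
    offset, length = index[key]
    return ET.fromstring(document[offset:offset + length])
```

The result is still ordinary well-formed XML, so a parser that knows nothing about the index simply sees one more element, while an index-aware reader can jump straight to a fragment.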
Anyway, I haven't had time to work on it more, but here was a crude first pass at explaining the idea I posted to the list a while back. I hope it helps.

Clark Evans

-------- Original Message -------- Subject: Fractal XML Index Notation Date: Wed, 03 Feb 1999 01:32:34 +0000 From: Clark Evans To: xml-dev@ic.ac.uk References: <958E41703996D21197A200A0C9D4C65672B7@AUS-SERVER4>

Abstract:

By fixing the content of an XML file, a position-based index mechanism can be added to XML files, allowing fractal parsing.

Introduction:

In a thin-client/server environment, especially one implemented in an interpreted language like Java, it is important to minimise client-side processing by doing server-side pre-processing.

For example, suppose that an on-line shopping web site has a thin-client ordering Java applet. It could quickly download, and start accepting customer information and other input. Simultaneously, it could be downloading a 250K+ file(s) containing the package and product list, authorized shipping agents, tax calculation tables, etc. Advanced versions of the applet would "cache" a copy of the catalog locally, and only download deltas.

Several pre-processing items could occur, the most obvious being a translation of the normalized schema:

PRODUCT_CATEGORY (CATEGORY_ID, CATEGORY_NAME)
BUNDLE_OF_PRODUCTS (BUNDLE_ID, BUNDLE_NAME, BUNDLE_PRICE)
VENDOR (VENDOR_ID, VENDOR_NAME)
BUNDLE-PRODUCT (BUNDLE_ID,PRODUCT_ID)
PRODUCT (PRODUCT_ID, PRODUCT_NAME, CATEGORY_ID, INVIDUAL_SALE_FLAG, PRICE_IF_SOLD_INDIVIDUALLY )
PRODUCT-VENDOR (PRODUCT_ID,VENDOR_ID)
BUNDLE-VENDOR (BUNDLE_ID,VENDOR_ID)

into a hierarchical drill-down that better meets the particular needs of the order-entry client:

In this example, several joins are interwoven into a single hierarchical "snapshot" to support the drill-down requirements in the order-entry client.
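The server-side join described above might look something like the following sketch. The table contents, the element names (`catalog`, `category`, `product`, `vendor`) and the function name are all invented for illustration, since the XML in the original message did not survive archiving; only three of the seven tables are used, to keep the example short.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample rows standing in for the normalized tables.
categories = {1: "Tools"}                          # CATEGORY_ID -> name
products = {10: ("Allen-Wrench Set", 1, "6.95"),   # PRODUCT_ID -> (name, category, price)
            11: ("Hammer", 1, "7.95")}
vendors = {100: "Acme"}                            # VENDOR_ID -> name
product_vendor = [(10, 100), (11, 100)]            # PRODUCT-VENDOR link rows


def drill_down(categories, products, vendors, product_vendor):
    """Join the tables once on the server and emit a nested snapshot:
    category > product > vendor. Vendors are repeated under every
    product that uses them; that duplication is the point."""
    root = ET.Element("catalog")
    for cat_id, cat_name in sorted(categories.items()):
        cat = ET.SubElement(root, "category", name=cat_name)
        for prod_id, (prod_name, prod_cat, price) in sorted(products.items()):
            if prod_cat != cat_id:
                continue
            prod = ET.SubElement(cat, "product", name=prod_name, price=price)
            for linked_product, vendor_id in product_vendor:
                if linked_product == prod_id:
                    ET.SubElement(prod, "vendor", name=vendors[vendor_id])
    return root
```

Serialising the result with `ET.tostring(drill_down(categories, products, vendors, product_vendor))` gives the client a tree it can walk directly, with no joins left to perform.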
Notice that product-bundles, products, and vendors *will* be duplicated with this scheme; this de-normalization is exactly what is required since it makes the processing on the client simpler. Here XML complements the relational database by providing a de-normalized stream of data instead of a normalized repository.

For another example, suppose a roaming salesperson receives an update every morning in his e-mail with new products, discontinued products, changes in pricing, packaging, etc. Then, during the day, the salesperson goes "door-to-door" selling the products and taking orders. The orders are collected on his/her hard drive until the evening, when they are uploaded to the server for approval.

I see XML as a great move forward in a standard transport layer for this form of communication. Each order could be a simple e-mail message, leveraging existing POP3/SMTP standards. The messages would be queued during the day, and sent after the salesperson is connected to the network. In a similar way, the updates to the product could be sent via e-mail (xml-mail anyone?) as well.

THUS, we have moved the join from the client to the server, but now, we have *increased* the parsing requirements of the client... also, with a _large_ catalog file (3+MB?), it is unreasonable to think that a collection of objects in memory would be the result of the parsing.

THEREFORE, some form of storage/retrieval is necessary on the client. This can be in a local database, but that just increases the footprint and processing.

Instead of making a client-side database, and re-normalizing the information, I suggest that indexing the XML file may be a better alternative. A way to do this is to "fix" the XML file's binary representation, and build a physical index detailing the "exact" location of an element within the file.

Requirements for such an index:

a) It should be embeddable inside XML, and should follow XML if possible (perhaps it is a notation?)
b) It should allow indexing on arbitrary element attributes.

c) It should be created so that a change in one part of the file has minimal impact on the rest of the XML file. Thus, although a change to a child may require a re-adjustment of information about its parent, it shouldn't require re-adjustment of information about each sibling.

d) It should take advantage of the "hierarchy" built into the XML file, since the thin-client usage will directly correspond to the "hierarchy"

e) It should support typed entities and attributes "Architectures", so that different attribute names of sub-types can be indexed together.

f) Indexing an element based upon its child elements may not be required. If an index like this is needed, perhaps a re-write adds an attribute with the computed value and then this is indexed instead.

g) Working with linking is purely optional, and may not be important to support. If you are using linking with transaction-oriented documents, you should be using a relational database instead. I see XML as bringing back the Hierarchical database to *complement* relational technology, not to *replace* it.

================================================

What I propose is a "fractal" index inter-woven into the XML data. First, here is the file to be indexed: ... ... ... ...

Here is the "indexed" example, I use line numbers for the demonstration since it is easier to show in e-mail form, however, I would see it being done by position instead. I also use

0001 ... 0009 0010 0011 0012 0013 0014 0015 0016 0017 0018 ... 0033 ... 0533 0534 name="Price" 0535 index-start=525 0536 delimiter="|" 0536 position-width=4 0537 length=100 0538 > 0539 0540 0541 0542 0543 0004|Allen-Wrench Set | ... 05?? 0005|Household-Starter | ... 05?? 0008|Allen-Wrench Set | ... 0632 0633 0635 index-start=625 0636 delimiter="|" 0636 position-width=4 0637 length=100 0638 > 0639 0640 0641 0642 0643 0433|01.23 ... 06?? 0002|06.95 ... 06?? 0005|23.99 ... 0732 .... ???? ???? ???? ???? ...
???? ... ???? 0000 0000 ============================== ....
Message-ID: <37005634.58EBEE1B@lig.net> Excellent. I've had similar ideas. My current plan is to produce something without the requirement that the result be pure text, however I once toyed with the idea of a database where all indexing information was stored as part of the text in fixed width fields. The file could be edited with any text editor and then 'reindexed' and be ready for fast use. Your design is pretty handy, but I really want something that can be loaded, have a minor modification made with minimal data shuffling, and then 'saved' out very quickly. Having to rebuild a complete index probably isn't the most optimal way to do this. sdw Clark Evans wrote: > "Stephen D. Williams" wrote: > > Imagine that you have all the features of XML: structure, flexibility, common format for > > interchange, but that you perform zero processing steps to import or export the 'document' > > from a program. (Actually, I'm thinking this would be done in chunks, but essentially very > > few reads and writes.) > > I had an idea to accomplish something similar to this using notations.
> First use a fixed width encoding, and then provide an index to the > information contained within the XML document in a notation. This way > you get many of the advantages above, but your information is still XML, > so that it can be read by a parser who may not understand the indexing notation. > > Anyway, I havn't had time to work on it more, but here was a > crude, first-pass at explaining the idea I posted to the list > a while back. I hope it helps. > > Clark Evans > > -------- Original Message -------- > Subject: Fractal XML Index Notation > Date: Wed, 03 Feb 1999 01:32:34 +0000 > From: Clark Evans > To: xml-dev@ic.ac.uk > References: <958E41703996D21197A200A0C9D4C65672B7@AUS-SERVER4> > > Abstract: > > By fixing the content of an XML file, a > position based index mechanism can be added > to XML files, allowing fractal parsing. > > Introduction: > > In a thin-client/server environment, especially those > implemented in an interpreted language, like Java, > is important to minimise client-side processing by > doing server-side pre-processing. > > For example, suppose that an on-line shopping web > site has a thin-client ordering java applet. It could > quickly download, and start accepting customer > information, and other input. Simoutanenously, > it could be downloading a 250K+ file(s) containing > the package and product list, authorized shipping > agents, tax calculation tables, etc. Advanced > versions of the applet would "cashe" a copy of the > catalog locally, and only download deltas. 
> > Several pre-processing items could occur, the most > obvious being a translation of the normalized schema: > > PRODUCT_CATEGORY (CATEGORY_ID, CATEGORY_NAME) > BUNDLE_OF_PRODUCTS (BUNDLE_ID, BUNDLE_NAME, BUNDLE_PRICE) > VENDOR (VENDOR_ID, VENDOR_NAME) > BUNDLE-PRODUCT (BUNDLE_ID,PRODUCT_ID) > PRODUCT (PRODUCT_ID, PRODUCT_NAME, > CATEGORY_ID, INVIDUAL_SALE_FLAG, > PRICE_IF_SOLD_INDIVIDUALLY ) > PRODUCT-VENDOR (PRODUCT_ID,VENDOR_ID) > BUNDLE-VENDOR (BUNDLE_ID,VENDOR_ID) > > into a hierarchical drill-down that better meets > the particular needs of the order-entry client: > > > > > > > > > > In this example, several joins are interwoven into a > a single hierarchical "snapshot" to support the > the drill down requirements in the order-entry client. > > Notice, that product-bundles, products, and vendors > *will* be duplicated with this scheme, this de-normalization > is exactly what is required since it makes the processing > on the client simpler. Here XML complements the > relational database by providing a de-normalized > stream of data instead of a normalized repository. > > For another example, suppose a roaming-sales person > receives an update every morning in his e-mail with > new products, discontinued products, changes in pricing, > packaging, etc. Then, during the day, the sales peson > goes "door-to-door" selling the products and taking orders. > The orders are collected on his/her hard drive untill > the evening, when they are uploaded to the server for > approval. > > I see XML as a great move forward in a standard transport > layer for this form of communication. Each order could > be a simple e-mail message, leveraging existing POP3/SMTP > standards. The messages would be queued during the day, > and send after the sales person is connected to the > network. In a similar way, the updates to the product > could be sent as via e-mail (xml-mail anyone?) as well. 
> > THUS, we have moved the join from the client to the > server, but now, we have *increased* the parsing > requirements of the client... also, with a _large_ > catelog file (3+MB?), it is unreasonable to think > that a collection of objects in memory would > be the result of the parsing. > > THEREFORE, some form of storage/retrieval is necessary > on the client. This can be in a local database, > but that just increases the footprint and processing. > > Instead of making a client-side database, and > re-normalizing the information, I suggest that > indexing the XML file may be a better alternative. > A way to do this, is to "fix" the XML file's binary > representaion, and build a physical index detailing > the "exact" location of an element within the file. > > Requirement for such an index: > > a) It should be embeddable inside XML, and should follow > XML if possible (perhaps it is a notation?) > > b) It should allow indexing on arbitrary element attributes. > > c) It should be created so that a change in one part of the > file has minimal impact on the rest of the XML file. Thus, > although a change to a child may require a re-adjustment > of information about it's parent, it shouldn't require > re-adjustment of information about each sibling. > > d) It should take advantage of the "hierarchy" built > into the XML file, since the thin-client usage will > directly correspond to the "hierachy" > > e) It should support typed entities and attributes > "Archetecutres", so that different attribute names > of sub-types can be indexed together. > > f) Indexing an element based upon it's child elements > may not be required. If an index like this is needed, > perhaps a re-write adds an attribute with the > computed value and then this is indexed instead. > > g) Working with linking is purely optional, and may > not be important to support. If you are > using linking with transaction-oriented documents, > you should be using a relational database instead. 
> I see XML as bringing back the Hierarchical database > to *complement* relational technology, not to > *replace* it. > > ================================================ > > What I propose is a "fractal" index inter-woven > into the XML data. First, here is the file to > be indexed: > > > > > > > > > > > > ... > > ... > > > > > ... > > ... > > > Here is the "indexed" example, I use line numbers for > the demonstration since it is easier to show in e-mail > form, however, I would see it being done by position instead. > I also use > > 0001 > ... > 0009 > 0010 > 0011 > 0012 price="6.95"/> > 0013 price="7.95"/> > 0014 > 0015 > 0016 > 0017 > 0018 > ... > 0033 > ... > 0533 category --> > 0534 name="Price" > 0535 index-start=525 > 0536 delimiter="|" > 0536 position-width=4 > 0537 length=100 > 0538 > > 0539 > 0540 attribute="price" /> > 0541 /> > 0542 > 0543 0004|Allen-Wrench Set | > ... > 05?? 0005|Household-Starter | > ... > 05?? 0008|Allen-Wrench Set | > ... > 0632 > 0633 0634 name="Price" > 0635 index-start=625 > 0636 delimiter="|" > 0636 position-width=4 > 0637 length=100 > 0638 > > 0639 > 0640 attribute="price" /> > 0641 /> > 0642 > 0643 0433|01.23 > ... > 06?? 0002|06.95 > ... > 06?? 0005|23.99 > ... > 0732 > .... > ???? > ???? > ???? > ???? > ... > ???? name="Price" > ... > ???? name="" > ... > ???? > ... > ???? > 0000 > 0000 > > ============================== > > .... > > non-XML filter project Message-ID: <00cd01be7a6c$0d879ac0$0300000a@cygnus.uwa.edu.au> Earlier this month, I posted the following to XSL-LIST. With apologies to those who received it there, I'm posting it (modified) here to see if anyone is interested in some co-operative effort in this area. 

What I would like to see is people taking existing non-XML formats and
developing:

a) a URI for the non-XML format (for notations and for the namespace of
   the XML format)
b) a DTD representing the existing non-XML format
c) an output filter to convert documents conforming to the DTD into the
   non-XML format
d) (possibly) an input filter to convert the non-XML format into XML

There are individual cases of this sort of thing[1] but I would like to see
some sort of co-operative effort to produce a large number of these things.
I'm not envisaging complex filters, just a simple XML representation of the
non-XML format so that purely XML tools like editors, query engines and XSL
engines can operate on non-XML formats. There are plenty of applications,
including generation of these files on the basis of other XML documents (I
need this for Makefiles on my websites) and literate programming.

I would personally find great value in this being done for Makefiles,
procmail files, simple shell scripts and PalmPilot databases. Others of
value I can think of include Windows INI files, Unix mailboxes, your
favourite programming language...

If there is enough interest I am more than willing to coordinate these
efforts. Just let me know.

James

[1] http://www.xmlsoftware.com/convert/

From heikki at citec.fi Tue Mar 30 08:03:12 1999
From: heikki at citec.fi (Heikki Toivonen)
Date: Mon Jun 7 17:10:49 2004
Subject: OFF: Attachments to list
Message-ID: <000c01be7a72$f13b2c40$2500a8c0@hto.citec.fi>

Instead of flaming people who send attachments to this list, why not use an
automated tool that refuses to send messages with attachments to this list?

For example, the Frame Users list (see http://www.FrameUsers.com) has a
system that checks whether messages contain attachments or other illegal
stuff. If the screening program thinks something is wrong, it replies to
the original sender with the message, explaining what was wrong.

One has to be careful with vCards, though, because many people do not
realize they are considered attachments (it happened even to me: I used a
vCard at one time as my .sig and I could not understand why the FrameUsers
bot was saying I had an attachment in my email :).

--
Heikki Toivonen
http://www.doczilla.com
http://www.citec.fi

From anderst at toolsmiths.se Tue Mar 30 09:45:27 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:10:49 2004
Subject: Is there anyone working on a binary version of XML?
References: <87256743.0062192B.00@d53mta03h.boulder.ibm.com> <3700177B.AC5BFE07@lig.net>
Message-ID: <370080F7.DBC8C96D@toolsmiths.se>

"Stephen D. Williams" wrote:

> roddey@us.ibm.com wrote:
>
> > Actually, to be fair, there would be a somewhat non-trivial amount of bit
> > fiddlin' to get it out of whatever canonical binary format you put it in,
> > into the local byte order, floating point representation, byte boundary
> > alignment, etc... Though hopefully that couldn't be any worse than parsing
> > :-)
>
> Not true, especially for Java....
>
> ...
> The byte order, etc. will be Java standard. Shouldn't be too tough for C/C++, etc.

It's also very easy for C/C++. There exist a number of standards and
Open Source packages which support them. Once this "bit fiddling" has
been taken care of, the road is clear of obstacles.

/anders

--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/ Financial Toolsmiths AB /
/ Anders W. Tell /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From chris at w3.org Tue Mar 30 10:03:46 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:10:50 2004
Subject: OFF: Attachments to list
References: <000c01be7a72$f13b2c40$2500a8c0@hto.citec.fi>
Message-ID: <3700847B.84F584E4@w3.org>

Heikki Toivonen wrote:
>
> Instead of flaming people who send attachments to this list, why not use an
> automated tool that refuses to send messages with attachments to this list?

That needs a better definition of "attachment". Is a text/plain MIME
bodypart an attachment or not? What if a message has two of them?
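
The ambiguity raised here can be made concrete with a small sketch. It assumes Python and its standard `email` package; the helper names `count_plain_parts` and `non_plain_parts`, the sample message and its boundary string are hypothetical illustrations, not anything the list software actually ran. It shows that a message can carry two text/plain bodyparts (body plus signature) as well as a vCard, so a filter has to state which of these it counts as an "attachment".

```python
from email import message_from_string

# Hypothetical sample: a body, a separate plain-text signature part, and a vCard.
SAMPLE = """\
From: someone@example.org
Subject: test
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=XX

--XX
Content-Type: text/plain

Message body.
--XX
Content-Type: text/plain

My signature
--XX
Content-Type: text/x-vcard

BEGIN:VCARD
END:VCARD
--XX--
"""


def leaf_types(raw_message):
    """Return the content type of every non-container bodypart, in order."""
    msg = message_from_string(raw_message)
    # multipart/* containers only group other parts, so they are skipped.
    return [p.get_content_type() for p in msg.walk() if not p.is_multipart()]


def count_plain_parts(raw_message):
    """How many text/plain bodyparts does the message carry?"""
    return leaf_types(raw_message).count("text/plain")


def non_plain_parts(raw_message):
    """Content types a 'text/plain only' rule would object to."""
    return [t for t in leaf_types(raw_message) if t != "text/plain"]


if __name__ == "__main__":
    print(count_plain_parts(SAMPLE), non_plain_parts(SAMPLE))
```

Under a "reject anything that is not text/plain" rule, only the vCard part would be flagged; whether the second text/plain part counts as an attachment is exactly the policy question left open.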

> One has to be careful with vCards, though, because many people do not
> realize they are considered attachments (happened even to me, I used vCard
> at one time as my .sig and I could not understand why FrameUsers bot was
> saying I had an attachment in my email:).

Similarly, many people do not realise that they are using HTML mail or that
their (plain text) signature file is being included as a separate
bodypart.

--
Chris

From om at lgsi.co.in Tue Mar 30 10:34:07 1999
From: om at lgsi.co.in (Om Band)
Date: Mon Jun 7 17:10:50 2004
Subject: HELP Me : Throwing XML page through Servlet
Message-ID: <37007F73.59ADFCB9@lgsi.co.in>

Hi,

Can you help me, please?

I am developing a search engine which will have an XML search form linked
to a Servlet. The Servlet should take input from the text field of the XML
form, scan the database for matches and dynamically generate an XML page
with the results found. Ideally it should not write that XML to a file but
should send it directly to the client machine. In this case I am not able
to link it with the existing XSL stylesheet, which will already be on the
server (static).

Even when I write a separate XML file, I am not able to display the XML
page through the Servlet, although the page displays correctly when the
same address is typed directly into the Address field of the browser.

The code I am using for the Servlet is............
(This makes a separate file)

    doPost(HttpServletRequest, HttpServletResponse response)
    {-------
    --------
    String file = "c:\\xml\\file.xml";
    fw = new FileWriter(file);
    pw = new PrintWriter(fw);
    pw.println("

> I'm still shying away from reporting element-type
> declarations, at least until someone shows me an easy and concise way
> of doing it

I would think the best way is to present it in its XSchema form, to some
kind of secondary DocumentHandler.

Mike Kay
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990330/b0770eee/attachment.htm

From Michael.Kay at icl.com Tue Mar 30 11:20:21 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:10:50 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <93CB64052F94D211BC5D0010A80013310EB3C7@WWMESS3.172.19.125.2>

> I have come to feel however that there is room for a
> "works-as-if" binary analogue to text based XML.

I did various experiments with this a while back. I tried a serialised SAX
event stream, a simple canonicalisation and transcoding in which the special
characters like "<" were replaced with octet values

AbiWord is another (the third?) open source word processor that will use
an XML document type for its native format. One interesting thing about
this one is that it is intended to be portable between Unix and Windows.
That means that it is at least theoretically possible that a large,
heterogeneous corporation could standardize on it.

http://www.abisource.com

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From oren at capella.co.il Tue Mar 30 15:02:17 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:10:50 2004
Subject: Fw: XML query language
Message-ID: <009901be7aac$e9ad60d0$5402a8c0@oren.capella.co.il>

Paul Janssens wrote:

>I think (iii) (results should be XML)
>should not be a requirement of an XML query language. The
>result of a query could be a vector of tuples of pointers to the
>individual matches. Whatever needs to be done with that output can be
>done in a layer above that.

I fail to see the benefit in inventing a new format for query results. First,
a set of tuples with pointers, or whatever else, can be easily expressed in
XML. Second, if one wants to obtain 'pointers to the output', then it should
be a simple matter of constructing in the result a pointer to the matched
tree ( or something) instead of the matched tree itself.

AFAIK all XML QL proposals produce XML as output.

>Just because SQL mixes content with style
>doesn't mean an XML query language should.

You lost me here; this is the first time I've heard that SQL has anything to
do with style. The result of an SQL query is a table and is typically
accessed via some programming API which has nothing to do with presentation.
I agree that an XML query should do the same thing - that is, create an XML
tree as a result without worrying about presentation. The fact that I think
that _the transformational part_ of XSL should do this is perfectly
consistent, since I see this part as being a general independent mechanism
and not just a "style" language.

Share & Enjoy,

    Oren Ben-Kiki

From Mark.Birbeck at iedigital.net Tue Mar 30 15:52:05 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:10:50 2004
Subject: XML query language
Message-ID:

Has anyone looked at the fragment proposals for this? I think they are ideal
because they allow you to return the context of the nodes that make up your
results, as well as the results themselves. If you add functionality for
multiple fragments (which the initial suggestions don't have) then it works
very well, allowing nodes to be returned that come from very different
contexts.

Regards,

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

> -----Original Message-----
> From: Oren Ben-Kiki
> Sent: 30 March 1999 13:58
> To: XML List
> Subject: Fw: XML query language
>
>
> Paul Janssens wrote:
> >I think (iii) (results should be XML)
> >should not be a requirement of an XML query language. The
> >result of a query could be a vector of tuples of pointers to the
> >individual matches. Whatever needs to be done with that output can be
> >done in a layer above that.
>
> I fail to see the benefit in inventing a new format for query
> results. First, a set of tuples with pointers, or whatever else, can be
> easily expressed in XML. Second, if one wants to obtain 'pointers to the
> output', then it should be a simple matter of constructing in the result
> a pointer to the matched tree ( or something) instead of the matched
> tree itself.
>
> AFAIK all XML QL proposals produce XML as output.
>
> >Just because SQL mixes content with style
> >doesn't mean an XML query language should.
>
> You lost me here; this is the first time I've heard that SQL
> has anything to do with style. The result of an SQL query is a table and
> is typically accessed via some programming API which has nothing to do
> with presentation. I agree that an XML query should do the same thing -
> that is, create an XML tree as a result without worrying about
> presentation. The fact that I think that _the transformational part_ of
> XSL should do this is perfectly consistent, since I see this part as
> being a general independent mechanism and not just a "style" language.
>
> Share & Enjoy,
>
> Oren Ben-Kiki

From andrew at squiz.co.nz Tue Mar 30 16:06:38 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun 7 17:10:50 2004
Subject: XML to Text questions
In-Reply-To: Your message of "Mon, 29 Mar 1999 09:48:00 PST." <5BF896CAFE8DD111812400805F1991F708AAF212@RED-MSG-08>
Message-ID: <199903301403.CAA03415@aniwa.sky>

> Q: "What tools are available for translation of XML into text?"
>
> A: Take a look at XSL.
> Information on this and other XML-related
> activities can be found at http://www.w3.org/XML/Activity.html.

This is potentially misleading. XSL produces an XML document as output; it
cannot be made to produce text which is not well-formed XML. Specific XSL
processors may provide extensions to handle this. I believe SAXON does.

Other alternatives worth considering might include DSSSL and perl.

Andrew McNaughton

--
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/

From david at megginson.com Tue Mar 30 16:11:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:50 2004
Subject: OFF: Attachments to list
In-Reply-To: <3700847B.84F584E4@w3.org>
References: <000c01be7a72$f13b2c40$2500a8c0@hto.citec.fi> <3700847B.84F584E4@w3.org>
Message-ID: <14080.45092.316461.553925@localhost.localdomain>

Chris Lilley writes:

 > Similarly, many people do not realise that they are using HTML mail or
 > that their (plain text) signature file is being included as a separate
 > bodypart.

Bounce every posting containing a bodypart with a MIME type other than
text/plain. People will figure out about vcards and HTML mail quite
fast that way, especially if it's possible to give a reasonably
informative message (with a note about vcards).

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From paul.janssens at skynet.be Tue Mar 30 16:29:30 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun 7 17:10:50 2004
Subject: XML query language
References: <009901be7aac$e9ad60d0$5402a8c0@oren.capella.co.il>
Message-ID: <3700DF27.302D@skynet.be>

Oren Ben-Kiki wrote:
>
> Paul Janssens wrote:
> >I think (iii) (results should be XML)
> >should not be a requirement of an XML query language. The
> >result of a query could be a vector of tuples of pointers to the
> >individual matches. Whatever needs to be done with that output can be
> >done in a layer above that.
>
> I fail to see the benefit in inventing a new format for query results. First,
> a set of tuples with pointers, or whatever else, can be easily expressed in
> XML

No problem there; my point was that ONLY this information should be the
output of a query, preferably in an XML format :-)

> Second, if one wants to obtain 'pointers to the output', then it should
> be a simple matter of constructing in the result a pointer to the matched
> tree ( or something) instead of the matched tree itself.
>
> AFAIK all XML QL proposals produce XML as output.
>
> >Just because SQL mixes content with style
> >doesn't mean an XML query language should.
>
> You lost me here; this is the first time I've heard that SQL has anything to
> do with style. The result of an SQL query is a table and is typically
> accessed via some programming API which has nothing to do with presentation.
> I agree that an XML query should do the same thing - that is, create an XML
> tree as a result without worrying about presentation.
> The fact that I think
> that _the transformational part_ of XSL should do this is perfectly
> consistent, since I see this part as being a general independent mechanism
> and not just a "style" language.

OK, SQL ALLOWS you to mix style (or semantics) with content, as in

    SELECT ''||col2||'' FROM table1

For the same reason, if an XML query language allows you to construct
arbitrary result trees, lazy users will abuse that feature to put style or
semantics in the output so that they do not have to postprocess it with XSL.
If, on the other hand, the query language returns only pointers to the
resulting matches, anyone who wants formatted output is FORCED to use XSL.

In my opinion, an XML query language should only describe a set of
equations, an XML query language implementation should only solve these
equations, and whatever is done with the result is NO business of the
query language.

From david at megginson.com Tue Mar 30 16:59:13 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:50 2004
Subject: Ampersand connector in XML
In-Reply-To: <37003AE2.7B13728D@research.canon.com.au>
References: <3.0.32.19990329183236.00c2bb80@pop.intergate.bc.ca> <37003AE2.7B13728D@research.canon.com.au>
Message-ID: <14080.44628.857723.958297@localhost.localdomain>

Alison Lennon writes:

[on '&' in content models]

 > Is it likely to be included in later versions of XML? In other
 > words, what are the options for applications which need to use
 > unordered lists - SGML?

Actually, the options are somewhat broader than that.
There are two reasons that people have traditionally wanted to use '&'
in content models:

1. to help with legacy data conversion, where the elements may be out
   of order during an intermediate stage; or

2. because there is no obvious reason to order the content.

You don't need (1), because XML allows you simply to process the
document without a DTD until it's cleaned up. Through more than a
decade of industry experience, nearly everyone in the SGML world ended
up agreeing that (2) was a lousy idea -- the '&' connector makes it very
difficult for authors to create documents in SGML editing tools, and
as Tim Bray pointed out, the tools often got it wrong anyway.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Tue Mar 30 16:59:38 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:50 2004
Subject: XHTML and character entities
In-Reply-To: <001901be7a4f$50461d90$1bf96d8c@NT.JELLIFFE.COM.AU>
References: <001901be7a4f$50461d90$1bf96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <14080.44181.839780.143963@localhost.localdomain>

Rick Jelliffe writes:

 > Certainly it is the expectation of some people that the entities
 > for special characters will disappear with XML, that people will
 > use NCRs. I am not sure about it.

I think that Rick makes a good point here (we touched on this point
earlier in a different context). There are two problems:

1.
   some XML documents will *always* need characters not available
   through Unicode either directly or through composition, no matter
   how large Unicode grows; and

2. representing new characters through numeric references in the
   private-use area is unintuitive.

Internal SDATA entities were (and are) the bane of people trying to
write generic SGML processing software, but they were very useful for
small utilities tied closely to a specific SGML application (such as
an academic project for transcribing manuscripts, where you knew in
advance what SDATA entities you were going to see).

On the other hand, there were actually proposals back in th'old days
to use Unicode values for SDATA strings rather than the (in)famous
"[eacute]" type strings.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From sdw at lig.net Tue Mar 30 17:04:59 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:10:50 2004
Subject: XML <-> non-XML filter project
References: <00cd01be7a6c$0d879ac0$0300000a@cygnus.uwa.edu.au>
Message-ID: <3700F006.45A98468@lig.net>

I like this idea, and a few weeks ago I was evangelizing a similar one.
Bear with me, there is a good XML tie-in at the end...

I was considering what was wrong with the way that OS and application
configuration is typically handled. Of course NT can be a nightmare
because of registry problems and centrality. Unix/Linux is somewhat
easier to manage but still needs changes distributed throughout a
filesystem tree, often with minor variations between Unix vendors,
Linux, BSD, etc.

Furthermore, there are a few obvious goals that come to mind in designing
a perfect system administration environment:

Applications and OS modules should be 'object oriented' in the sense that,
as much as possible, all programs, data, files, temp space, logging, and
especially configuration are stored in an area partitioned from everything
else in a predictable way. This could mean that you have one directory that
contains everything and is referenced one way or another from all of the
appropriate subsystems.

Configuration should be straightforward, portable (between OSes, hardware,
etc.) and easily editable in most circumstances.

Upgrading the OS to a new version, distribution, etc. should be trivial and
not require reinstalling all applications. Conversely, it should be trivial
to copy an application to another system with the same OS. This includes
easy backups and restores.

I can see two major ways to solve these problems:

Modification to standard subsystems and/or OS initialization sequences to
expect modular installations of applications. For instance, Oracle requires
userids in /etc/passwd, OS parameter changes, daemon startup in
/etc/rc.d/init.d, environment variables for all or most users, etc. etc.
Normally you change all kinds of things: add it to your path, add the
libraries to your path, include directories, add the Java library to your
CLASSPATH, etc. All of this (everything except possibly allocation of data
space, although that's feasible also in simple or default cases) should go
into /opt/oracle (for instance) in ways that are automatically picked up by
boot-up and/or user login actions. For instance, I typically modify
/etc/profile once to add /opt/*/bin to PATH and /opt/*/bin/lib/*.jar to
CLASSPATH, etc. It would be fairly easy to add users virtually to
/etc/passwd with PAM modifications. System parameters could be computed by
the max of all mentions in /opt/*/config/osparam. Environments could have
all /opt/*/config/profile contents 'sourced'. Etc.
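
The "max of all mentions" idea can be sketched briefly. This is only an illustration in Python under loud assumptions: the message does not say what an osparam file looks like, so the `name = integer` line format, the `#` comment syntax, the parameter names and the helper names `parse_osparam`, `merge_max` and `load_all` are all hypothetical.

```python
import glob


def parse_osparam(text):
    """Parse 'name = integer' lines (assumed format); '#' starts a comment."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue  # blank lines and anything unrecognised are ignored
        name, _, value = line.partition("=")
        params[name.strip()] = int(value)
    return params


def merge_max(param_sets):
    """For each parameter, keep the largest value any application asked for."""
    merged = {}
    for params in param_sets:
        for name, value in params.items():
            merged[name] = max(value, merged.get(name, value))
    return merged


def load_all(pattern="/opt/*/config/osparam"):
    """Read every matching file and merge the requests (empty dict if none)."""
    sets = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            sets.append(parse_osparam(handle.read()))
    return merge_max(sets)


if __name__ == "__main__":
    database = "shmmax = 4294967296\nsemmns = 256\n"
    webserver = "# tuning\nshmmax = 1073741824\nnofile = 8192\n"
    print(merge_max([parse_osparam(database), parse_osparam(webserver)]))
```

Taking the maximum means no application's stated minimum is undercut; it does not model parameters where requests should add up rather than compete, which a real scheme would have to distinguish.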
In fact, the base OS installation (say a Linux distribution) should be
read-only and all changes made in a logical 'overlay' tree.

Because all of this requires cooperation between the people defining the
'correct' way to do things and those putting together distributions or OS
versions, I came up with another way that is almost equivalent:

For many things, especially standard OS parameters, configuration can be
indicated in a nearly generic way by creating logical XML files in, say,
/config. These files could easily handle most common configuration and be
operated on by installers which share a standard feature set but are built
specifically for the operating system they run on.

As an example, /config/network.xml could contain system name, domain name,
network IP addresses, masks, routes, etc. Services to start at boot and/or
login could be listed and controlled. Users to add to the box could be
configured. Filesystems to export, etc.

These files could be used on any OS, and a local installer would know how to
install the equivalent configuration into native config files, along with
restarting daemons or reloading configuration.

This would completely eliminate, for many users and purposes, any problems
with variations in how a particular Unix stores system name (which varies)
or network configuration (which varies), etc. I have worked with something
like 10-15 different Unix OSes, which vary more from a system administration
standpoint than in any other respect.

Oddly enough, this would work just as well for Win98 or WinNT since the
installer could update the registry appropriately.

Is there some reason we haven't done this already?

sdw

James Tauber wrote:

> Earlier this month, I posted the following to XSL-LIST. With apologies to
> those who received it there, I'm posting it (modified) here to see if anyone
> is interested in some co-operative effort in this area.
>
> What I would like to see is people taking existing non-XML formats and
> developing:
>
> a) a URI for the non-XML format (for notations and for the namespace of
>    the XML format)
> b) a DTD representing the existing non-XML format
> c) an output filter to convert documents conforming to the DTD into the
>    non-XML format
> d) (possibly) an input filter to convert the non-XML format into XML
>
> There are individual cases of this sort of thing[1] but I would like to see
> some sort of co-operative effort to produce a large number of these things.
> I'm not envisaging complex filters, just a simple XML representation of the
> non-XML format so that purely XML tools like editors, query engines, XSL
> engines can operate on non-XML formats. There are plenty of applications
> including generation of these files on the basis of other XML documents (I
> need this for Makefiles on my websites) and literate programming.
>
> I would personally find great value in this being done for Makefiles,
> procmail files, simple shell scripts and PalmPilot databases. Others of
> value I can think of include Windows INI files, Unix mailboxes, your
> favourite programming language...
>
> If there is enough interest I am more than willing to coordinate these
> efforts. Just let me know.
>
> James
>
> [1] http://www.xmlsoftware.com/convert/

--
OptimaLogic - Finding Optimal Solutions  Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net  Stephen D. Williams  Senior Consultant/Architect  http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999

From oren at capella.co.il Tue Mar 30 17:18:45 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:10:51 2004
Subject: Fw: XML query language
Message-ID: <00e101be7abf$f9bb5aa0$5402a8c0@oren.capella.co.il>

Paul Janssens wrote:

>In my opinion, an xml query language should only describe a set of
>equations, an xml query language implementation should only solve these
>equations, and whatever is done with the result is NO business of the
>query language.

Just to make sure I follow: you'd prefer that there be a standard DTD, so
that results would always be created in an XML format containing
references to the matched XML elements (XLink/XPointer?). The user would
then filter this through XSL or whatever to display the results.

Nice separation of concerns, but I see several objections:

- Efficiency. Suppose I'm querying a very large DB, and I'm getting a list
  of matches scattered all over the place. In the current approach, the DB
  would both resolve the matches and extract the necessary data, potentially
  in the same pass, using a lot of locality-of-reference optimizations. In
  your method a second tool would re-fetch the references in a second phase,
  which would probably double the cost of doing the query.

- Power.
  Assume that I hypnotize all the W3C members to adopt the XSL
  transformational part as XQL version 1.0 :-) This is more powerful than
  current ?QL proposals because it allows for an to call - that is, to
  perform nested queries (and therefore, BTW, offers a natural way to do
  joins without variables, and solves other ?QL problems). All this works
  because XSL has a rich language for constructing the results. In your
  approach, you won't be able to do a lot of that; you'd end up adding
  special constructs for them, duplicating XSL's capabilities in an
  incompatible language. Of course you'd be in good company - that is what
  all the other ?QL language proposals do :-)

- Convenience. It is easier to specify a query as just "one thing" instead
  of two.

Note that even if ?QL == XSL transformation, it still makes a lot of sense
to filter its results through another XSL stylesheet for presentation in
most cases. Even lazy users will do so - if, for example, they already had
XSL sheets available for displaying certain types of results.

So all in all I prefer my approach: XQL = XSL - FO.

Share & Enjoy,

    Oren Ben-Kiki

From paul at prescod.net Tue Mar 30 18:14:58 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:10:51 2004
Subject: Fw: XML query language
References: <009901be7aac$e9ad60d0$5402a8c0@oren.capella.co.il>
Message-ID: <3700F24E.3296C03F@prescod.net>

Oren Ben-Kiki wrote:
>
> Paul Janssens wrote:
> >I think (iii) (results should be XML)
> >should not be a requirement of an XML query language.
The > >result of a query could be a vector of tuples of pointers to the > >individual matches. Whatever needs to be done with that output can be > >done in a layer above that. > > I fail to see the benefit in inventing a new format for query results. It isn't about a format. Query languages do not typically work on formats. They have an input data model (e.g. a relational database) and they have an output model (e.g. a set of records). An XML Query Language should also work in terms of the XML data model (the information set). > First, > a set of tuples with pointers, or whatever else, can be easily expressed in > XML. Second, if one wants to obtain 'pointers to the output', then it should > be a simple matter of constructing in the result a pointer to the matched > tree ( or something) instead of the matched tree itself. The IDL for an XML QL should be something like: NodeList XMLQuery( DOC doc, String query ) Your alternative is: String XMLQuery( String inputdoc, String query ) or DOM XMLQuery( DOM inputdoc, String query ) That's just forcing the query engine to do more work -- much of it unnecessary in most cases. Let's put it this way: you are saying that the query engine should build a list of pointers, build a tree, and generate XPointer attributes just so that an application can get back the original list of pointers! If the application wants to turn the list of pointers into a tree, it can do so. That's what XSL does. > AFAIK all XML QL proposals produce XML as output. No, XQL goes out of its way to NOT require that the output be XML. "The specification does not indicate the output format. The result of a query could be a node, a list of nodes, an XML document, an array, or some other structure. That is, XQL does not dictate the binary format of the returns, but rather the logical returns." The same is true of XSL patterns.
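To make the NodeList-style interface above concrete, here is a minimal sketch in Python. It is only an illustration: xml_query is a hypothetical name, the sample document is invented, and the path syntax of Python's xml.etree.ElementTree stands in for a real query language (it is not XQL). The point it tries to show is that the query hands back the matched nodes themselves, and the caller decides whether to read, change, or re-serialize them.

```python
import xml.etree.ElementTree as ET


def xml_query(doc, query):
    """Hypothetical stand-in for NodeList XMLQuery(DOC doc, String query).

    Returns the matching nodes themselves (references into the original
    tree), not a newly built result document.
    """
    return doc.findall(query)


# Invented sample data, used only for this illustration.
doc = ET.fromstring(
    "<library>"
    "<book year='1998'><title>XML Basics</title></book>"
    "<book year='1999'><title>SAX in Practice</title></book>"
    "</library>"
)

matches = xml_query(doc, "book[@year='1999']")

# Because the results are references into the tree, the application can
# modify them in place without any intermediate XML being generated.
for node in matches:
    node.set("checked", "yes")

print([n.find("title").text for n in matches])
print(ET.tostring(doc, encoding="unicode"))
```

If the application does want a tree or a serialized report, it can build one from the returned list afterwards; nothing in the query step forces that work.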
-- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Tue Mar 30 18:18:06 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:51 2004 Subject: SAX2: DTDDeclHandler (minimalist position) Message-ID: <003001be7ac9$7561a020$46026982@thing1> From: David Megginson >Yes, but it's also no good having a named constant that you cannot use >in a switch statement. Unfortunately, Java is broken here, and you >have to choose one side or another Using objects for constants can also cause problems with persistent data, if you were depending on a singularity and testing with ==. Bill xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.Kay at icl.com Tue Mar 30 18:25:15 1999 From: Michael.Kay at icl.com (Kay Michael) Date: Mon Jun 7 17:10:51 2004 Subject: XML to Text questions Message-ID: <93CB64052F94D211BC5D0010A80013310EB3CB@WWMESS3.172.19.125.2> > What tools are available for translation of XML into text? You could try SAXON, if its XSL can't produce your "physics syntax", you can augment it with a few Java element handlers. Mike Kay -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990330/f7fb59d5/attachment.htm From paul.janssens at skynet.be Tue Mar 30 18:40:12 1999 From: paul.janssens at skynet.be (Paul Janssens) Date: Mon Jun 7 17:10:51 2004 Subject: XML query language References: <00e101be7abf$f9bb5aa0$5402a8c0@oren.capella.co.il> Message-ID: <3700FDB6.626E@skynet.be> Oren Ben-Kiki wrote: > > Paul Janssens wrote: > >In my opinion, an xml query language should only describe a set of > >equations, an xml query language implementation should only solve these > >equations, and whatever is done with the result is NO business of the > >query language. > > Just to make sure I follow: you'd prefer that there would be a standard > DTD, so that results would always be created in an XML format > containing references to the matched XML elements (XLink/XPointer?). The > user would then filter this through XSL or whatever to display the results. correct > Nice separation of concerns, but I see several objections: > > - Efficiency. Suppose I'm querying a very large DB, and I'm getting a list > of matches scattered all over the place. 
In the current approach, the DB > would both resolve the matches and extract the necessary data, potentially > at the same pass using a lot of locality-of-reference optimizations. In your > method a second tool would re-fetch the references in a second phase, which > would probably double the cost of doing the query. That's an implementation issue. You can build a tool that takes both the query and the style description as input and optimizes the DB access. In other words, xml report syntax = xml query syntax + xml style syntax does NOT imply xml report implementation = xml query implementation + xml style implementation > - Power. Assume that I hypnotize all the W3C members to adopt the XSL > transformational part as XQL version 1.0 :-) This is more powerful than > current ?QL proposals because it allows for an to call > - that is, to perform nested queries (and therefore, > BTW, offers a natural way to do joins without variables, and solves other > ?QL problems). All this works because XSL has a rich language for > constructing the results. In your approach, you won't be able to do a lot of > that; you'd end up adding special constructs for them, duplicating XSL's > capabilities in an incompatible language. Of course you'd be in good > company - that is what all the other ?QL language proposals do :-) I have no problem with recycling some XSL syntax into ?QL where applicable; in fact, it would be a good idea, just as you could recycle XPointer syntax where applicable. > - Convenience. It is easier to specify a query as just "one thing" instead > of two. Note that even if ?QL == XSL transformation, it still makes a lot of > sense to filter its results through another XSL stylesheet for presentation > in most cases. Even lazy users will do so - if, for example, they had > already available XSL sheets for displaying certain types of results. The report syntax will allow you to either link to a query and style, or describe them inline, e.g. ...
Paul Janssens - paul.janssens@skynet.be From oren at capella.co.il Tue Mar 30 18:45:47 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:10:51 2004 Subject: XML query language Message-ID: <00fe01be7acc$22045960$5402a8c0@oren.capella.co.il> Paul Prescod wrote: >Oren Ben-Kiki wrote: >> I fail to see the benefit in inventing a new format for query results. > >It isn't about a format. Query languages do not typically work on formats. >They have an input data model (i.e. a relational data base) and they have >an output model (i.e. a set of records). An XML Query Language should also >work in terms of the XML data model (the information set). Agreed. >Let's put it this way: you are saying that the query engine should build a >list of pointers, build a tree, generate XPointer attributes just so that >an application can get back the original list of pointers! I don't follow. You yourself have said: >The IDL for an XML QL should be something like: > >NodeList XMLQuery( DOC doc, String query ) Well, then, what other way is there to return a list of XPointers than to store each in an "element"? This is assuming that you prefer the query engine to return a list of pointers as a result, which I don't. The one nice thing about this scheme is that you can add extra data per XPointer - a relevancy score, for example. I did mistakenly say: >> AFAIK all XML QL proposals produce XML as output. > >No, XQL goes out of its way to NOT require that the output be XML. "The >specification does not indicate the output format.
The result of a query >could be a node, a list of nodes, an XML document, an array, or some other >structure. That is, XQL does not dictate the binary format of the returns, >but rather the logical returns." The same is true of XSL patterns. I stand corrected. I think you've phrased it perfectly above - the output should be defined in the terms of the XML data model. Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From shyutz at ms1.hinet.net Tue Mar 30 19:38:05 1999 From: shyutz at ms1.hinet.net (Kevin Hsu) Date: Mon Jun 7 17:10:51 2004 Subject: SGML and XML Message-ID: Hi, I know the XML is the subset of SGML , and SGML is more complex and detail , but I must write a paper to tell the difference, who can tell me the major difference between the SGML and XML, or where can I find information , thanks in advance! Kevin xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Tue Mar 30 19:38:41 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:10:51 2004 Subject: Fw: XML query language Message-ID: <010f01be7ad3$85632cf0$5402a8c0@oren.capella.co.il> Paul Janssens wrote: >I wrote: >> Nice separation of concerns, but I see several objections: >> >> - Efficiency. 
Suppose I'm querying a very large DB, and I'm getting a list >> of matches scattered all over the place. In the current approach, the DB >> would both resolve the matches and extract the necessary data, potentially >> at the same pass using a lot of locality-of-reference optimizations. In your >> method a second tool would re-fetch the references in a second phase, which >> would probably double the cost of doing the query. > >That's an implementation issue. You can build a tool that has an input >of both the query and the style description, and optimizes the DB acces. >In other words > >xml report syntax = xml query syntax + xml style syntax > >does NOT imply > >xml report implementation = xml query implementation + xml style >implementation We are not talking about a "report tool". I think it would be a very rare application which would do an XML query and would only be interested in pointers to the result, without requiring any data from that pointer. If what you call a "report tool" is integrated into the query tool, always, it hardly makes sense to make the distinction; if it isn't, then "non-report" application will get the performance penalty hit. >> - Power. Assume that I hypnotize all the W3C members to adopt the XSL >> transformational part as XQL version 1.0 :-) This is more powerful then >> current ?QL proposals because it allows for an to call >> - that is, to perform nested queries (and therefore, >> BTW, offers a natural way to do joins without variables, and solves other >> ?QL problems). All this works because XSL has a rich language for >> constructing the results. In your approach, you won't be able to do a lot of >> that; you'd end up adding special constructs for them, duplicating XSL's >> capabilities in an incompatible language. Of course you'd be in good >> company - that is what all the other ?QL language proposals do :-) > >I have no problem with recycling some XSL syntax into ?QL where >applicable, in fact it would be a good idea. 
Just as you could recycle >XPointer syntax where applicable. If we agree that an XQL match pattern should be used to select elements in the DB and that XSL syntax should be used to specify what the XML result data should be, don't we end up with XSL? Think of it another way. Suppose we agree to use: Other tags for constructing the results... Then what is the difference between and and and ? Why bother having both? Maybe it would be clearer if we thought about it this way: what feature of XQL isn't useful in the transformational part of XSL, or vice versa? I can't think of any. IMVHO both are _applications_ of the general XML -> XML conversion problem, and any feature relevant for this problem will be relevant for both. >> - Convenience. It is easier to specify a query as just "one thing" instead >> of two. Note that even if ?QL == XSL transformation, it still makes a lot of >> sense to filter its results through another XSL stylesheet for presentation >> in most cases. Even lazy users will do so - if, for example, they had >> already available XSL sheets for displaying certain types of results. > >The report syntax will allow you to either link to a query and style, or >describe them inline, e.g. > > > ... > > > Not nearly as convenient. In the query part you'd specify match patterns for the DB, which automatically generate a list of pointers. You'd then specify in the style section match patterns for entries in this list, which somehow dereference them, and then proceed to match on the resulting trees to generate FO objects (or whatever). There's both extra complexity for the query writer and for the implementation which needs to figure out how to do this in one pass for efficiency. Does this really have any benefit over matching elements in the DB and directly specifying which "near-by" elements are of interest using normal XSL syntax? 
You would have the option of integrating the transformation to FOs (or CSS) into this XSL (useful for ad-hoc queries and specialized applications) or feeding the results to another XSL stylesheet for display (probably one independent of the query, and fitting a particular display media or format). Share & Enjoy, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Tue Mar 30 20:22:55 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:51 2004 Subject: Fw: XML query language and another OS/XML suggestion References: <00e101be7abf$f9bb5aa0$5402a8c0@oren.capella.co.il> Message-ID: <37011E5F.A22DF23B@lig.net> I don't have a strong opinion yet on xml query languages and results, however: Maybe the results could be the XLink/XPointer AND the contents that it points to. That way you have a canonical reference but also get the contents for efficiency. Offhand, for many situations, especially database queries, I think that it will be difficult to Always generate a reasonable XLink/XPointer. With SQL for instance, it is quite common to create result strings from multiple fields and transformations. Additionally, results will often be ephemeral snapshots of something (stats, processes, connections from /dev/proc/xml for instance) that have no future reference. Maybe this can be solved by just having a 'dead-end' link value to communicate these situations as meta-data. A note on the /dev/proc/xml mention: I've been thinking for a while that EVERY data/meta-data interface to a typical OS (such as Linux/Unix) should have an XML form. 
Maybe add or override -X or --XML to all commands where it could possibly make sense. ps, netstat, lsof, ifconfig, df, egrep, ls, etc. are all good candidates. Add simple tree/value extraction to bash and you'd have more portability for a lot of things. sdw Oren Ben-Kiki wrote: > Paul Janssens wrote: > >In my opinion, an xml query language should only describe a set of > >equations, an xml query language implementation should only solve these > >equations, and whatever is done with the result is NO business of the > >query language. > > Just to make sure I follow: you'd prefer that there would be a standard > DTD, so that results would always be created in an XML format > containing references to the matched XML elements (XLink/XPointer?). The > user would then filter this through XSL or whatever to display the results. > > Nice separation of concerns, but I see several objections: > > - Efficiency. Suppose I'm querying a very large DB, and I'm getting a list > of matches scattered all over the place. In the current approach, the DB > would both resolve the matches and extract the necessary data, potentially > at the same pass using a lot of locality-of-reference optimizations. In your > method a second tool would re-fetch the references in a second phase, which > would probably double the cost of doing the query. > > - Power. Assume that I hypnotize all the W3C members to adopt the XSL > transformational part as XQL version 1.0 :-) This is more powerful then > current ?QL proposals because it allows for an to call > - that is, to perform nested queries (and therefore, > BTW, offers a natural way to do joins without variables, and solves other > ?QL problems). All this works because XSL has a rich language for > constructing the results. In your approach, you won't be able to do a lot of > that; you'd end up adding special constructs for them, duplicating XSL's > capabilities in an incompatible language. 
Of course you'd be in good > company - that is what all the other ?QL language proposals do :-) > > - Convenience. It is easier to specify a query as just "one thing" instead > of two. Note that even if ?QL == XSL transformation, it still makes a lot of > sense to filter its results through another XSL stylesheet for presentation > in most cases. Even lazy users will do so - if, for example, they had > already available XSL sheets for displaying certain types of results. > > So all in all I prefer my approach: XQL = XSL - FO. > > Share & Enjoy, > > Oren Ben-Kiki > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) -- OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st 43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Tue Mar 30 20:29:02 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:10:51 2004 Subject: XML query language and another OS/XML suggestion Message-ID: <012801be7ada$8574dc00$5402a8c0@oren.capella.co.il> Stephen D. 
Williams wrote: >A note on the /dev/proc/xml mention: I've been thinking for a while that EVERY data/meta-data >interface to a typical OS (such as Linux/Unix) should have an XML form. Maybe add or override >-X or --XML to all commands where it could possibly make sense. ps, netstat, lsof, ifconfig, >df, egrep, ls, etc. are all good candidates. Add simple tree/value extraction to bash and >you'd have more portability for a lot of things. Wouldn't that be great? The UNIX pipe model has suffered from not having a standard structured format, as has the /proc file system. Not to mention what this could do to an OS like Plan9 where "everything is a file" and textual formats abound... However this would be a major undertaking. Maybe someone in the GNU project would consider it, though. Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Tue Mar 30 21:00:46 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:10:51 2004 Subject: SGML and XML References: Message-ID: <37011E5D.6CE993A1@manhattanproject.com> Kevin Hsu wrote: > > Hi, > > I know the XML is the subset of SGML , and SGML is more complex and detail , > but I must write a paper to tell the difference, > who can tell me the major difference between the SGML and XML, or where can > I find information , thanks in advance! 
I hope this will help: -------- Original Message -------- Subject: Advantages of XML and SGML Date: Fri, 12 Feb 1999 20:28:33 +0000 From: Clark Evans Reply-To: Clark Evans To: xml-dev@ic.ac.uk Cc: Susan Barron Susan Barron wrote: > > We have been using SGML for several years and are closely watching the > trend towards XML. Could someone please give me some examples of why > you would use XML over SGML. I know that XML is a subset of SGML. I > believe there must be some things that can be done in SGML that are not > possible in XML. Conversely, there must be some things that XML does > better than SGML. Thank you. Since minimization is allowed in SGML, this creates situations where a document can have multiple syntactic interpretations. For instance: Can have two syntactic interpretations: OR The DTD is required for the parser to figure out which one is the correct interpretation of the input. As such, an SGML document must have one and only one DTD to resolve these syntactic ambiguities. XML restricts the syntax by eliminating these minimizations. Thus, all documents have one and only one syntactic interpretation. This dramatically reduces the complexity of the parser. Thus, a parser can be simpler to implement, and a DTD is _not_ required for parsing. This lets the DTD be used for a 100% semantic role, which is much more interesting for describing data! This is great because it allows a document to conform to more than one DTD at the same time, _without_ requiring a "mother" DTD that merges all of the DTDs together. This is called "Architectures". It allows multiple meanings for the same document, depending upon the observer, without requiring all of the possible observers to get together and specify a "united" DTD. However, this added flexibility comes at a price: the syntax becomes much more restrictive. Therefore, for computer program <=> computer program communication, XML is the ideal structure to use.
It allows multiple subscribers to have their own interpretation of a data stream without changing the publishers. For human => computer communication, SGML will probably remain the preferred structure. The minimization features are very valuable when a human is the author of the document. Also, there is nothing saying you can't use both! If a human is going to write it by hand, perhaps SGML is better; then you can have JClark's SP use the DTD to resolve the ambiguities and produce the XML document that can be introduced into the corporate "xml bus". Hope this helps! Clark Evans From sdw at lig.net Tue Mar 30 21:01:52 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:10:51 2004 Subject: XML query language and another OS/XML suggestion References: <012801be7ada$8574dc00$5402a8c0@oren.capella.co.il> Message-ID: <37012794.E1134B00@lig.net> Yes, the pipe mechanism takes on a whole new meaning with XML. It wouldn't be all that large a job to do a lot of it. Obviously adding --XML capabilities wouldn't be that tough, since it's simply adding labeling tags to output that is already formatted. Even /proc/xml wouldn't be that hard for output; input is a little tougher, but allowed input would be so restrictive that a simple regex parser would suffice for most things. Hacking bash in an appropriate way is more difficult; however, there is already an sgrep (SGML grep), and external tools such as Perl, Java, and Tcl/Tk can handle all of this.
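As a rough illustration of the "adding labeling tags to output that is already formatted" idea, here is a small Python sketch. Everything in it is an assumption made for the example: table_to_xml is a hypothetical helper, the sample text only imitates df-style columnar output, and the element names are simply the lower-cased column headers rather than any agreed DTD.

```python
import xml.etree.ElementTree as ET


def table_to_xml(text, root_tag="table", row_tag="row"):
    """Wrap whitespace-separated columnar output in labeling tags.

    Hypothetical helper: the first non-blank line is treated as the
    header, and each header word becomes an element name.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    headers = [h.lower() for h in lines[0].split()]
    root = ET.Element(root_tag)
    for line in lines[1:]:
        # Limiting the number of splits keeps a last column that
        # contains spaces in one piece.
        fields = line.split(None, len(headers) - 1)
        row = ET.SubElement(root, row_tag)
        for name, value in zip(headers, fields):
            ET.SubElement(row, name).text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample resembling df output; not captured from a real system.
sample = """\
Filesystem Size Used Avail
/dev/sda1 20G 12G 8G
tmpfs 2G 0 2G
"""

print(table_to_xml(sample, root_tag="df", row_tag="fs"))
```

A downstream tool in the pipe could then pick out values by element name instead of by column position, which is where most of the portability gain would come from.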
I suppose what we need is a group to start standardizing a DTD that settles what to call everything in a system (ports, network address, process/thread, user, etc.). That's probably the biggest job. sdw Oren Ben-Kiki wrote: > Stephen D. Williams wrote: > >A note on the /dev/proc/xml mention: I've been thinking for a while that > EVERY data/meta-data > >interface to a typical OS (such as Linux/Unix) should have an XML form. > Maybe add or override > >-X or --XML to all commands where it could possibly make sense. ps, > netstat, lsof, ifconfig, > >df, egrep, ls, etc. are all good candidates. Add simple tree/value > extraction to bash and > >you'd have more portability for a lot of things. > > Wouldn't that be great? The UNIX pipe model has suffered from not having a > standard structured format, as has the /proc file system. Not to mention > what this could do to an OS like Plan9 where "everything is a file" and > textual formats abound... > > However this would be a major undertaking. Maybe someone in the GNU project > would consider it, though. > > Have fun, > > Oren Ben-Kiki > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) -- OptimaLogic - Finding Optimal Solutions Web/Crypto/OO/Unix/Comm/Video/DBMS sdw@lig.net Stephen D. Williams Senior Consultant/Architect http://sdw.st 43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Tue Mar 30 21:03:01 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:52 2004 Subject: XML query language References: <00fe01be7acc$22045960$5402a8c0@oren.capella.co.il> Message-ID: <370114EA.2525C481@prescod.net> Oren Ben-Kiki wrote: > > I don't follow. You yourself have said: > > >The IDL for an XML QL should be something like: > > > >NodeList XMLQuery( DOC doc, String query ) > > Well, them, what other way is to return a list of XPointers then to store > each in an "element"? You don't need an element. You just need a nodelist. Look at the DOM's brutally named "getElementsByTagName" method. Also consider the XSL specification: > A select pattern must match the production for SelectExpr; it returns > the list of nodes that results from evaluating the SelectExpr with the > current node as context; the nodes are in the list are in document order. XPointer is interesting because it doesn't support either interpretation: "The result of a spanning selection cannot generally be expressed as a well-formed XML document, nor as a node or list of nodes from an element tree." -- If you are asking me what is the syntax for a nodelist then I'll say it has no syntax. It is an abstraction like the record set returned by a database. If you have to move the query result between machines then you can choose an encoding (quite likely XML) but that's outside of the realm of the query language itself -- it is akin to report writing. If you aren't moving data between processes then you shouldn't be forced to encode it in XML (even a DOM). 
This is just a general principle that applies here. > This is assuming that you prefer the query engine to > return a list of pointers as a result, which I don't. The one nice thing > about this scheme is that you can add extra data per XPointer - a relevancy > score, for example. I'm not convinced that this is the domain of the query language, but even if it is then you are asking for an annotated nodelist, not a DOM. If we do decide to go ahead with annotated nodelists then we would have to add that to the XML data model. That still doesn't have anything to do with generating XML elements: *unless the application wants to do so*. > I stand corrected. I think you've phrased it perfectly above - the output > should be defined in the terms of the XML data model. And that model has a concept of nodelist -- this is the most appropriate return value for query results. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Tue Mar 30 21:15:37 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:52 2004 Subject: Fw: XML query language References: <010f01be7ad3$85632cf0$5402a8c0@oren.capella.co.il> Message-ID: <370117AE.B5E009AE@prescod.net> Oren Ben-Kiki wrote: > > We are not talking about a "report tool". 
I think it would be a very rare > application which would do an XML query and would only be interested in > pointers to the result, without requiring any data from that pointer. How about deletion? How about changes to the nodes? How about reports of nodes that have changed? > If > what you call a "report tool" is integrated into the query tool, always, it > hardly makes sense to make the distinction; if it isn't, then "non-report" > application will get the performance penalty hit. You are conflating implementation with language specification. XPointer is a query language that can be used separately from XLink. Does that mean that XLink implementations have taken a performance hit? No, because you can choose to integrate XPointer and XLink in a loose way (xptr_filter | xlink_filter ) or you can choose to implement them tightly. Your choice. > If we agree that an XQL match pattern should be used to select elements in > the DB and that XSL syntax should be used to specify what the XML result > data should be, don't we end up with XSL? When you combine the query language with the report generation language you end up with something very like XSL, yes. But you could use the two separately. You could embed another query language into XSL (in a perfect world) and use the query language in another style language or non-style application. > Think of it another way. Suppose > we agree to use: > > > Other tags for constructing the results... Right. That's why XQL doesn't have tags for constructing the results. It leaves that up to XSL, or Python or whatever it is embedded in. > Maybe it would be clearer if we thought about it this way: what feature of > XQL isn't useful in the transformational part of XSL, or vice versa? I can't > think of any. IMVHO both are _applications_ of the general XML -> XML > conversion problem, and any feature relevant for this problem will be > relevant for both. No, XQL has nothing to do with conversion. 
If I use it to locate nodes in the tree before deleting them, where is the conversion? Imagine a command line: XQL_locate database '/foo/bar["baz"]' | Node_Delete The language passed between those two commands might be XML. It also might not. Maybe it is just a list of UUIDs. Maybe it is the offset of the node into the database store. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Tue Mar 30 23:34:09 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:10:52 2004 Subject: IE5.0 does not conform to RFC2376 References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org> Message-ID: <3701411C.784A1D4A@eng.sun.com> One lesson: most web servers should default to using the "application/xml" MIME content type, not "text/xml"! Chris Lilley wrote: > > What this RFC appears to do is remove author control over correctly > labelling the encoding, and ensure that most if not all XML documents > get incorrectly labelled as US-ASCII. Not at all. The best default MIME content type for all web servers is "application/xml". Without a "charset=Big5" or similar declaration, then the XML processor's autodetection kicks in ... 
minimally handling UTF-8 and UTF-16, and quite commonly handling a variety of additional encodings. For example, Sun's XML processor handles about 140 encodings at last count ... and _does_ conform to RFC 2376. > So, this RFC removes at a stroke the possibility of authors correctly > labelling the encoding of their XML documents and takes us back to that > dark time (the present) when the majority of, say, Japanese Web content > was mis-labelled. And it seems to have done this simply to save a very > small part of coding effort for people writing transcoders. Again, no it doesn't. The idea is to get the web server to attach the correct MIME content type, which is NOT "text/xml" in many/most cases. Authors must rely on the administrator not breaking their content, and this is part of it. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From leventer at uol.com.br Wed Mar 31 03:34:39 1999 From: leventer at uol.com.br (=?iso-8859-1?Q?Maur=EDcio_Leventer?=) Date: Mon Jun 7 17:10:52 2004 Subject: unsubscribe leventer@uol.com.br Message-ID: <001701be6d48$fe3658c0$b299d3c8@leventeruol.com.br> unsubscribe leventer@uol.com.br xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Wed Mar 31 05:07:58 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:52 2004 Subject: XML <-> non-XML filter project Message-ID: <002a01be7b1b$977e1ab0$44f96d8c@NT.JELLIFFE.COM.AU> From: James Tauber >What I would like to see is people taking existing non-XML formats and >developing: > > a) a URI for the non-XML format (for notations and for the namespace of >the XML format) > b) a DTD representing the existing non-XML format > c) an output filter to convert documents conforming to the DTD into the >non-XML format > d) (possibly) an input filter to convert the non-XML format into XML There is a project somewhat like this through FSF: the "GNU Filters". They have an Excel to XML filter now, from memory. I think this is a good project to support. http://www.fsf.org/ Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murata at apsdc.ksp.fujixerox.co.jp Wed Mar 31 06:54:49 1999 From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto) Date: Mon Jun 7 17:10:52 2004 Subject: IE5.0 does not conform to RFC2376 In-Reply-To: <3701411C.784A1D4A@eng.sun.com> Message-ID: <199903310453.AA00111@archlute.apsdc.ksp.fujixerox.co.jp> David Brownell wrote: > Again, no it doesn't. 
The idea is to get the web server to > attach the correct MIME content type, which is NOT "text/xml" > in many/most cases. Authors must rely on the administrator > not breaking their content, and this is part of it. "application/xml" is appropriate for some XML data. On the other hand, if you do not want to lose the fallback to text/plain, "text/xml" is the right choice. Cheers, Makoto Fuji Xerox Information Systems Tel: +81-44-812-7230 Fax: +81-44-812-7231 E-mail: murata@apsdc.ksp.fujixerox.co.jp From jtauber at jtauber.com Wed Mar 31 10:17:57 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:10:52 2004 Subject: XML query language and another OS/XML suggestion References: <012801be7ada$8574dc00$5402a8c0@oren.capella.co.il> Message-ID: <00fe01be7b4e$a509c8e0$0300000a@cygnus.uwa.edu.au> > Wouldn't that be great? The UNIX pipe model has suffered from not having a > standard structured format, as has the /proc file system. Not to mention > what this could do to an OS like Plan9 where "everything is a file" and > textual formats abound... This is pretty much what I was suggesting a little while ago on this list (in the same breath as Überdocument). I'm still trying to find the time to work a bit more on it. I certainly have a lot of ideas about it so if others are interested in helping with implementation, I'd love them to drop me an email. My idea involves a layer on top of the operating system that treats the operating system as one big XML document (hence the phrase "Überdocument shell" which I used at the time).
I'm thinking of calling it "Plan X" which both includes the mandatory "X" for association with XML and suggests, via roman numeral, a continuation of the thinking of Plan 9. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From branjan at wipinfo.soft.net Wed Mar 31 10:32:31 1999 From: branjan at wipinfo.soft.net (Balaji Ranjan) Date: Mon Jun 7 17:10:52 2004 Subject: snmp in XML Message-ID: hi, is there a snmp representation in XML not using the CIM standard but representing mib in a XML way regards Balaji Ranjan xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Mar 31 10:40:54 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:10:52 2004 Subject: XML query language Message-ID: Paul Prescod wrote: > And that model has a concept of nodelist -- this is the most > appropriate return value for query results. What do you mean by nodelist? Does it take into account that result nodes may be returned from different parts of the tree, or even at different depths? It would be quite inefficient to encode the entire path of each node and just list each result. We use a variation on the fragment spec that allows both of these conditions to be met, for example:
Mark ...
Mark ...
[Note that the ID/IDREF part is not in the fragment spec. Only one fragbody/page pair is allowed.] I think the useful things about the fragment spec are: - the initial query is encoded in the container of the results (fragbodyref) - you get the context of your results set. An application could now modify these results - say add a paragraph of text - and have enough info to do the work - nodes could be returned from anywhere in the hierarchy - a remote application could keep its own DOM model of the hierarchy, and only request nodes it needs as and when it needs them In fact, we love it so much that we use it for everything that is returned from our server! Even one article is returned as a fragment. Interested to know what people think of this approach. Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Wed Mar 31 12:17:16 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:10:52 2004 Subject: Fw: XML query language Message-ID: <017501be7b5f$0835dd90$5402a8c0@oren.capella.co.il> Paul Prescod wrote: >I wrote: >> Well, them, what other way is to return a list of XPointers then to store >> each in an "element"? > >You don't need an element. You just need a nodelist. Look at the DOM's >brutally named "getElementsByTagName" method. You mean the NodeList contains the matched nodes directly, and not XPointers which point to them. 
Presumably these nodes can be used either to access the tree in the vicinity of the match, or to obtain other data regarding the node (such as a fast ID for direct access to a DB record or whatever). That does make more sense than Paul Janssens' proposal (returning XPointers). >If you are asking me what is the syntax for a nodelist then I'll say it >has no syntax. It is an abstraction like the record set returned by a >database. If you have to move the query result between machines then you >can choose an encoding (quite likely XML) but that's outside of the realm >of the query language itself -- it is akin to report writing. No standard way to represent a query result as text? I find this strange. But if the result is a node list, wouldn't fragments somehow resolve this? After all each node is a fragment... >If you aren't moving data between processes then you shouldn't be forced >to encode it in XML (even a DOM). This is just a general principle that >applies here. The output of an XSL processor is also not forced to be encoded in XML (it might be a DOM or even a display on the screen), but it is very helpful to have a standard XML encoding for it (witness the current XSL implementations). Shouldn't the same hold for XQL? And in a separate message: >> Think of it another way. Suppose >> we agree to use: >> >> >> Other tags for constructing the results... > >Right. That's why XQL doesn't have tags for constructing the results. It >leaves that up to XSL, or Python or whatever it is embedded in. Both XML-QL and XQL have ways to construct results (CONSTRUCT and ). I feel that _if_ XML is to be constructed as a result of an XML query then XSL is the language to do so; there's no need to invent a new construction language. Can we agree on this? >No, XQL has nothing to do with conversion. If I use it to locate nodes in >the tree before deleting them, where is the conversion?
Imagine a command
>line:
>
>XQL_locate database '/foo/bar["baz"]' | Node_Delete
>
>The language passed between those two commands might be XML. It also might
>not. Maybe it is just a list of UUIDs. Maybe it is the offset of the node
>into the database store.

OK, if what you are saying is:

- We have two languages:
  (i) matching of XML elements, which we'll call XQL for the moment, and is basically the XSL match pattern language;
  (ii) constructing XML trees from other XML trees which we'll call XTL for the moment and is basically the tags.
- XSL is the combination of both (plus FO objects).
- XQL is usable in other contexts than XTL.
- There's no other standard XML construction syntax other than XTL.

Then we agree. I'd also add:

- We should have separate specs for XQL, XTL, and FOs. The XTL spec should simply reference the XQL spec. The FO spec should be independent.
- XQL should be used wherever a set of XML elements needs to be selected from an XML tree.
- So therefore CSS should allow using XQL in its selectors. For that matter, CSS should allow an XML syntax :-)
- And also XPointers?

Actually, what is the difference between XPointer syntax and XQL (as defined above)? Both allow matching elements according to the structure of the XML tree and/or the value of attributes. The syntax is different and the set of capabilities doesn't exactly match. Is it just due to historical reasons that XSL isn't using (possibly enhanced) XPointers in its match patterns?

Share & Enjoy,

Oren Ben-Kiki

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From macherius at darmstadt.gmd.de Wed Mar 31 13:01:03 1999 From: macherius at darmstadt.gmd.de (Ingo Macherius) Date: Mon Jun 7 17:10:52 2004 Subject: Fw: XML query language In-Reply-To: <017501be7b5f$0835dd90$5402a8c0@oren.capella.co.il> Message-ID: <199903311059.MAA19602@sonne.darmstadt.gmd.de>

Oren Ben-Kiki wrote at 31 Mar 99, 12:12:
> Paul Prescod wrote:
> >I wrote:
> >> Well, them, what other way is to return a list of XPointers then to store
> >> each in an "element"?
> >
> >You don't need an element. You just need a nodelist. Look at the DOM's
> >brutally named "getElementsByTagName" method.

What an XML query should return depends on what the results are needed for. There is no such thing as "the right way" to use an XML query language. Look at who was at the W3C-QL workshop '98 and what they asked for:

1. Information Retrieval
   XML seen as: Collection of text documents
   Formalisms offered: Z39.50, RDF, WebSQL, PAT, ...
   Query result needed: References to relevant documents

2. WWW information systems
   XML seen as: Abstraction of heterogeneous data sources and services
   Formalisms offered: HTTP, CGI, URI
   Query result: Integrated data sources and services

3. Database community (both relational and OO)
   XML seen as: Set of structured facts (order doesn't matter)
   Formalisms offered: SQL, OQL
   Query result: Set of (re)structured facts (order doesn't matter)

4. Document processors
   XML seen as: Structured text (order matters)
   Formalisms offered: XSL selectors,
   Query result: Pointers to selected text fragments (order matters) for further processing (e.g. by XSL templates or programming languages)

5. Document transformation
   XML seen as: Syntax tree
   Formalisms offered: hedge automata
   Query result: Transformed syntax tree

6. Hypertext community
   XML seen as: Graph of structured nodes connected by Hyperlinks
   Formalisms offered: XLink, XPointer
   Query result: Locations within a structured node

All of those need a QL. But all have different constraints (e.g. Hypertext needs a QL to fit in a URL) and want different results (pointers to documents vs. documents vs. restructured documents).

David Maier identified five fundamental operations in XML queries:

1. Selection of elements depending on content, structure or attributes
2. Extraction of elements
3. Reduction of elements
4. Restructuring of documents
5. Combination of elements

Looking at the user groups: neither Hypertext nor information retrieval, for example, will need restructuring or combination. Document processing will need all 5 operations. Right now XQL offers operations 1-3, XSL offers operations 1-4 and XML-QL offers operations 1-5 (at the cost of losing order).

You suggest using XPointers as the result of XML queries. XPointers from my point of view are queries by themselves. Being from the database community, I want restructured XML as a result. Who is right? No one. It just depends on the way you look at it.

++im
--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul.janssens at skynet.be Wed Mar 31 13:45:21 1999 From: paul.janssens at skynet.be (Paul Janssens) Date: Mon Jun 7 17:10:52 2004 Subject: Fw: XML query language References: <017501be7b5f$0835dd90$5402a8c0@oren.capella.co.il> Message-ID: <37020A2D.47DF@skynet.be> Oren Ben-Kiki wrote: >... > > You mean the NodeList contains the matched nodes directly, and not XPointers > which point to them. Presumably these nodes can be used to either access to > tree in the vicinity of the match, or to obtain other data regarding the > node (such as a fast ID for direct access to a DB record or whatever). That > does makes more sense then Paul Janssens' proposal (returning XPointers). The Xpointer proposal was one for a textbased result of the query (a standard DTD if you like), but at an API level, you just need references to the live node(s), that's obvious. Paul Janssens - paul.janssens@skynet.be xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From branjan at wipinfo.soft.net Wed Mar 31 16:30:00 1999 From: branjan at wipinfo.soft.net (Balaji Ranjan) Date: Mon Jun 7 17:10:52 2004 Subject: hi Message-ID: hi all, has anybody got a consolidated archive of xml examples in the list or outside.kindly pass it on to me,so that i can learn more abt. 
using xml thanks and regards Balaji Ranjan Wipro infotech B'lore Web Biz Card: http://eCode.com/?brn xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Wed Mar 31 16:31:56 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:10:52 2004 Subject: SAX2: DTDDeclHandler (minimalist position) In-Reply-To: <003001be7ac9$7561a020$46026982@thing1> Message-ID: At 11:22 AM -0500 3/30/99, Bill la Forge wrote: > > >Using objects for constants can also cause problems with persistent >data, if you were depending on a singularity and testing with ==. > This isn't a problem with the syntax I've described because there is only a fixed set of objects in which identity comparisons are the same as equality comparisons. The issue of switch statements is a little more serious. However, you can always use if-else. +-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Wed Mar 31 17:52:18 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:10:52 2004 Subject: SAX2: DTDDeclHandler (minimalist position) Message-ID: <000b01be7b8f$22edc460$c8a8a8c0@thing1> From: Elliotte Rusty Harold >>Using objects for constants can also cause problems with persistent >>data, if you were depending on a singularity and testing with ==. >> > >This isn't a problem with the syntax I've described because there is only a >fixed set of objects in which identity comparisons are the same as equality >comparisons. How do you maintain singularities when deserializing a JavaBean which contains a reference to one of these objects? That is to say, you have a constant which references an object. No problem. Now you have a bean with a variable which has been assigned the constant value. No problem. Now you save the bean. No problem. Now you deserialize the bean. No problem. Now you test the value of the variable in the bean with ==. Whoops. The test always returns false. Conclusion: using objects for constants is great unless you are using Java Serialization or almost any other kind of persistence. Bill xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mik at owl.co.uk Wed Mar 31 18:19:22 1999 From: mik at owl.co.uk (Michael Ewins) Date: Mon Jun 7 17:10:52 2004 Subject: Attribute-Value Normalisation Message-ID: <041201be7b91$7fd0b050$2096c9c2@mik-ppro.owl.co.uk> Can someone help me or point me toward an appropriate FAQ? In the XML specification it says attribute values will be normalised. However, the explanation isn't clear to me so I'll try to clear up my understanding through an example. I have an element MYDOC and this has an attribute that references a filename which is CDATA. An example document might read Will this attribute be normalised if it contains any whitespace? For example, "space  morespace.txt" is a valid filename on Windows but I need to know if XML will attempt to normalise the multiple whitespace to "space morespace.txt" or something equally wrong. Essentially my question is, if I have a valid filename (on Windows or Mac) that I use as an attribute value in XML can I be sure the parser will pass this on unchanged? thanks in advance for any help... -- Michael Ewins Panasonic OWL -- mik@owl.co.uk -- http://www.owl.co.uk home -- michael_ewins@hotmail.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Wed Mar 31 19:45:41 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:10:53 2004 Subject: Attribute-Value Normalisation References: <041201be7b91$7fd0b050$2096c9c2@mik-ppro.owl.co.uk> Message-ID: <00d101be7b9d$e63ea660$0300000a@cygnus.uwa.edu.au> > [...] > > > Will this attribute be normalised if it contains any whitespace? If it is declared CDATA, the only whitespace normalization is that carriage-returns, line-feeds and tabs are normalized to spaces (with a CR+LF being normalized to only a single space) Only if it were *not* CDATA would multiple spaces be normalized to one. [...] > Essentially my question is, if I have a valid filename (on Windows or Mac) that > I use as an attribute value in XML can I be sure the parser will pass this on > unchanged? As long as the filename didn't contain a < or & that you include literally. James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Full-day XML Tutorial @ WWW8 : http://www8.org/ Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Wed Mar 31 20:07:19 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:53 2004 Subject: XML query language References: Message-ID: <37025CF2.D186E248@prescod.net> Mark Birbeck wrote: > > Paul Prescod wrote: > > And that model has a concept of nodelist -- this is the most > > appropriate return value for query results. > > What do you mean by nodelist? Does it take into account that result > nodes may be returned from different parts of the tree, or even at > different depths? Sure. A node list is a list of nodes. No more, no less. > It would be quite inefficient to encode the entire > path of each node and just list each result. Query languages have nothing to do with encodings. That's the point I'm trying to make. If you want to make a "query results encoding language" -- great. Ideally it would work with the results returned by *any query language*. But you *must* be able to use the query language without the query encoding language -- i.e. in the middle of a Python or Java program, in a stylesheet, in a GUI. > In fact, we love it so much that we use it for everything that is > returned from our server! Even one article is returned as a fragment. > > Interested to know what people think of this approach. It looks good for the special case where the query results must be communicated between processes. It isn't useful for the other cases. In the middle of my Python or Java program I'm certainly not going to do a query and then re-parse the results. The results should be returned as a list of PyObject or java.lang.Object references.
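A minimal sketch of that in-process case, assuming Python's standard xml.etree.ElementTree as the tree and its limited path syntax as a stand-in for a query language (it is not XQL, and the sample document is made up for illustration). The query hands back ordinary object references, so the program can change or delete the matched nodes without parsing anything again:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, parsed once.
doc = ET.fromstring(
    "<foo><bar kind='baz'>one</bar><qux><bar>two</bar></qux>"
    "<bar kind='baz'>three</bar></foo>"
)

# The "query" returns a plain Python list of references to nodes inside doc.
hits = doc.findall(".//bar[@kind='baz']")

# These are the live nodes, so changing one changes the tree itself.
hits[0].text = "ONE"

# Deleting needs each node's parent; ElementTree keeps no parent pointers,
# so build a child -> parent map first.
parent_of = {child: parent for parent in doc.iter() for child in parent}
for node in hits[1:]:
    parent_of[node].remove(node)

print(ET.tostring(doc, encoding="unicode"))
```

Nothing here is serialized until the final print, and that step is only for display; the query result itself never had a textual form.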
Summary: If the query language is going to have maximum usefulness it must not specify that the results must be encoded in any special syntax or that they must be encoded at all. Encoding results is another important but separate issue (just as it is in SQL). -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html From paul at prescod.net Wed Mar 31 20:35:20 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:53 2004 Subject: XQL and XPointer Message-ID: <37026166.77965C31@prescod.net> Oren Ben-Kiki asks: > Actually, what is the difference between XPointer syntax and XQL (as defined > above)? Both allow matching elements according to the structure of the XML > tree and/or the value of attributes. The syntax is different and the set of > capabilities doesn't exactly match. Is it just due to historical reasons > that XSL isn't using (possibly enhanced) XPointers in its match patterns? Yes, the reasons are mostly historical, but there are technical issues too. The XPointer "model" is to select a contiguous, perhaps non-well-formed range of data. The XSL model is to select a list of well-formed, perhaps non-contiguous nodes.
I don't think that there is anything wrong from a hypertext-theoretic point of view with having pointers return non-contiguous nodes. And you can simulate non-well-formedness: This is text and this is some more text. ^ ^ In this case we could simulate a link from the first occurrence of "is" to the second by selecting the nodes "i","s"," ","t","e","x","t"," ",..., etc. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html From paul at prescod.net Wed Mar 31 20:46:38 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:53 2004 Subject: Fw: XML query language References: <017501be7b5f$0835dd90$5402a8c0@oren.capella.co.il> Message-ID: <3702600B.75477042@prescod.net> Oren Ben-Kiki wrote: > > >You don't need an element. You just need a nodelist. Look at the DOM's > >brutally named "getElementsByTagName" method. > > You mean the NodeList contains the matched nodes directly, and not XPointers > which point to them. Right. Pointers to, not copies of, the nodes. And the pointers should be in the most efficient "syntax" allowed by the system. In a Python program it is a PyObject reference. In C++ it is a DOMNode *. In a process-portable XML encoding it is an XPointer. Everybody is focused on this last case but it is only a special case.
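To make "pointers to, not copies of" concrete, here is a small Python sketch. It assumes the standard library's xml.dom.minidom as the DOM implementation and an invented sample document; it illustrates the idea only and is not taken from either poster's software.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
dom = minidom.parseString("<doc><p>one</p><div><p>two</p></div></doc>")

# getElementsByTagName returns a NodeList of the matching elements themselves.
nodes = dom.getElementsByTagName("p")

# Each entry is the very object that sits in the tree, not a copy ...
first_p = dom.documentElement.firstChild
same_object = nodes[0] is first_p

# ... so a change made through the list is visible in the document.
nodes[1].setAttribute("seen", "yes")
serialized = dom.documentElement.toxml()
```

Only the final `toxml()` call involves any encoding; up to that point the program works with in-memory references.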
> >If you are asking me what is the syntax for a nodelist then I'll say it > >has no syntax. It is an abstraction like the record set returned by a > >database. If you have to move the query result between machines then you > >can choose an encoding (quite likely XML) but that's outside of the realm > >of the query language itself -- it is akin to report writing. > > No standard way to represent a query result as text? I find this strange. I didn't say that there should be no standard way. I said that the standard way is not something that the query language should specify. If there are 6 query languages (some standardized and some proprietary) and 6 result encoding syntaxes (some standardized and some proprietary) then you should be able to use any query language with any encoding syntax. > Both XML-QL and XQL have ways to construct results (CONSTRUCT and > ). There is no such element type described in http://www.w3.org/TandS/QL/QL98/pp/xql.html > OK, if what you are saying is: > > - We have two languages: > (i) matching of XML elements, which we'll call XQL for the moment, and is > basically the XSL match pattern language; > (ii) constructing XML trees from other XML trees which we'll call XTL for > the moment and is basically the tags. > - XSL is the combination of both (plus FO objects). > - XQL is usable in other contexts than XTL. > - There's no other standard XML construction syntax other than XTL. > > Then we agree. Yes! > I'd also add: > > - We should have separate specs for XQL, XTL, and FOs. The XTL spec should > simply reference the XQL spec. The FO spec should be independent. Technically a good idea, but I think that it is politically impossible to separate XSL and its matching language at this point. Maybe XSL 2.0 will depend on whatever XML QL is eventually standardized. > - XQL should be used wherever a set of XML elements needs to be selected > from an XML tree. > - So therefore CSS should allow using XQL in its selectors.
For that matter, > CSS should allow an XML syntax :-) > - And also XPointers? I agree with all of this, but changes to CSS are unlikely in the short to medium term. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html From rhavaldar at str.com Wed Mar 31 22:37:40 1999 From: rhavaldar at str.com (Raghunandan Havaldar) Date: Mon Jun 7 17:10:53 2004 Subject: XML-to-Java, and Java-to-XML Message-ID: <002b01be7bb6$47a03aa0$612a96d0@raghu.STR_MILW> Hi, I am experimenting with mapping an XML document to a Java object model, and vice versa. I have a couple of things in mind: using Java Beans, the Reflection mechanism, and mapper and lookup classes. I was wondering if somebody out there has worked on and developed some kind of model to do this transformation (I am sure someone has given it a try). If not, does anybody have ideas on how to go about doing it? Currently, the XML documents are purely in a flat file format. If the 'mapper', 'lookup' and related utility classes can be defined, the XML documents could possibly be stored in databases instead. Definitions: 'mapper' - maps an XML model to a Java object graph (uses Java Beans patterns and Reflection to achieve this). It should also be able to map a Java object model back to an XML model.
'lookup' - provides lookup of XML nodes in a DOM-based tree. I have just started scratching the surface today. Any ideas, suggestions or comments are welcome. thanks, raghu Raghu Havaldar Consultant rhavaldar@str.com From macherius at darmstadt.gmd.de Wed Mar 31 22:52:05 1999 From: macherius at darmstadt.gmd.de (Ingo Macherius) Date: Mon Jun 7 17:10:53 2004 Subject: XML query language In-Reply-To: <37025CF2.D186E248@prescod.net> Message-ID: <199903312050.WAA14507@sonne.darmstadt.gmd.de> Paul Prescod wrote at 31 Mar 99, 11:35: > Mark Birbeck wrote: > > > > Paul Prescod wrote: > > > And that model has a concept of nodelist -- this is the most > > > appropriate return value for query results. > > > > What do you mean by nodelist? Does it take into account that result > > nodes may be returned from different parts of the tree, or even at > > different depths? > > Sure. A node list is a list of nodes. No more, no less. An XQL query may return numbers, strings, Date objects or even user defined data types, which are not nodes in the DOM sense, but objects. If you wrap the results in tags like , ... you will get problems with the user defined types and lose type information. To me the return values of a query are just a vector of objects in document order, of which some happen to be DOM nodes.
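One way to picture such a mixed result, as a sketch only: the helper `evaluate` below is hypothetical and is not XQL, and the sample document is invented. It walks a tree in document order using Python's xml.dom.minidom and returns a plain list in which some entries are DOM nodes and others are computed values that keep their own types.

```python
from xml.dom import minidom

def evaluate(root):
    """Hypothetical query: each <item> node, followed by its price as a float."""
    results = []
    for item in root.getElementsByTagName("item"):  # visited in document order
        results.append(item)                               # a DOM node
        results.append(float(item.getAttribute("price")))  # a typed value, not a node
    return results

# Hypothetical sample document, used only for illustration.
dom = minidom.parseString("<order><item price='1.50'/><item price='2.25'/></order>")
results = evaluate(dom.documentElement)
kinds = [type(r).__name__ for r in results]
```

Wrapping these results in markup would turn the floats back into text, which is the loss of type information described above.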
++im -- Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882 GMD-IPSI German National Research Center for Information Technology mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/ Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa) From simonstl at simonstl.com Wed Mar 31 23:18:47 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:53 2004 Subject: XML-to-Java, and Java-to-XML In-Reply-To: <002b01be7bb6$47a03aa0$612a96d0@raghu.STR_MILW> Message-ID: <199903312118.QAA31121@hesketh.net> At 02:37 PM 3/31/99 -0600, Raghunandan Havaldar wrote: >Hi, > >I am experimenting with mapping an XML document to a >Java object model, and vice versa. I have a couple of >things in mind: using Java Beans, the Reflection mechanism, >and mapper and lookup classes. > >I was wondering if somebody out there has worked on and >developed some kind of model to do this transformation >(I am sure someone has given it a try). Take a look at MDSAX and Coins on the JXML.com site - www.jxml.com. It sounds like it's pretty much exactly what you're looking for. You specify the mapping from elements to classes in a ContextML document, itself XML, and it builds a processing structure into which you can feed your documents to build your classes. The best part is that MDSAX/Coins handles all the weird work for you, including reflection. You can even feed documents with different vocabularies into the same structure by specifying a different ContextML document.
With MDSAX it's just document->Beans, while with Coins you can go back from Beans->document. Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com