From marcelo at mds.rmit.edu.au Mon Mar 1 00:49:03 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:32 2004 Subject: Streaming XML and SAX In-Reply-To: <36D82244.DB014ECE@thinlink.com>; from Tom Harding on Sat, Feb 27, 1999 at 08:50:12AM -0800 References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> Message-ID: <19990301114841.B4466@io.mds.rmit.edu.au> On Sat, Feb 27, 1999 at 08:50:12AM -0800, Tom Harding wrote: > David Megginson wrote: > > > No, it still looks like a messy architecture to me, because the > > transport layer has to know about the packets -- it has to parse > > the XML about to get information about what it's looking at, and > > that adds complexity and inefficiency. A clean architecture > > should separate the layers completely, and use XML only where it > > has an obvious advantage over other approaches. > > It's amazing how two people can see things so differently. I think > it's supremely elegant that only the XML processor needs to look at > data coming off the wire. It's also as efficient as it gets. Of > course the software architecture that handles the documents emitted > must be modular and extensible, but the task of parsing is done. It has already been pointed out in this discussion that some environments try to increase the throughput by dispatching documents off to different threads. A system with 50 CPU's is going to be operating as low as 2% capacity if it is forced to pipe the entire parsing load through a single thread. I don't see how you can argue that this is efficient. Nor do I agree that concentrating the workload at a single conceptual point is elegant. It is much more aesthetically pleasing to let the protocol break up packets and let the XML parser parse XML. 
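A minimal sketch of that division of labour, assuming a hypothetical length-prefixed framing (a 4-byte big-endian length before each document; nothing in this thread specifies one). The protocol layer cuts the stream into documents without looking inside them, and a pool of workers does the XML parsing. The names `frame`, `split_frames`, `root_tag` and `parse_all` are illustrative only.

```python
import struct
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

def frame(doc: bytes) -> bytes:
    """Prefix one document with its length (hypothetical 4-byte big-endian header)."""
    return struct.pack(">I", len(doc)) + doc

def split_frames(stream: bytes):
    """Yield each document's bytes without inspecting the XML inside."""
    pos = 0
    while pos < len(stream):
        (length,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        yield stream[pos:pos + length]
        pos += length

def root_tag(doc: bytes) -> str:
    """Stand-in for real processing: parse one document, return its root element name."""
    return ET.fromstring(doc).tag

def parse_all(stream: bytes, workers: int = 4):
    # The protocol layer only splits; parsing is handed off to the pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(root_tag, split_frames(stream)))

wire = frame(b"<order id='1'/>") + frame(b"<invoice><total>9</total></invoice>")
print(parse_all(wire))  # ['order', 'invoice']
```

In CPython a thread pool mainly illustrates the structure; spreading the parsing load over many CPUs would more likely need a process pool or a runtime without a global interpreter lock.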
Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Mon Mar 1 02:13:44 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:32 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: <36D6C618.D44846B6@thinlink.com>; from Tom Harding on Fri, Feb 26, 1999 at 08:04:40AM -0800 References: <36D46640.94081620@thinlink.com> <14036.28838.719355.44002@localhost.localdomain> <36D479F1.28D796D9@thinlink.com> <14037.20555.720649.689770@localhost.localdomain> <36D59762.370372DB@thinlink.com> <14038.35650.792155.191827@localhost.localdomain> <36D6C618.D44846B6@thinlink.com> Message-ID: <19990301131329.B6351@io.mds.rmit.edu.au> On Fri, Feb 26, 1999 at 08:04:40AM -0800, Tom Harding wrote: > David Megginson wrote: > > > -- a general-purpose DOM would be *extremely* inefficient for > > handling things like vector graphics or 3D worlds (to name only > > two), though it is always possible to expose their optimised > > object models through a DOM interface later if necessary. > > In lots of applications, the data can't stay in an XML > representation for very long anyway, because of what you're > integrating it with/displaying it on/routing it through/converting > it to/storing it in/etc... I view the DOM as a standard, OO way of > manipulating the contents of a document. It lets applications get > work done, even without taking an end-to-end OO approach. Perhaps > I'm showing my bias here ;D It's the translation process that hits hardest, however. 
C and FORTRAN compilers rarely build parse trees, because it is much more efficient to generate code directly from token streams. What you seem to be suggesting is that a parser should pump an event stream straight into DOM and then into another domain-specific structure. This is just adding an often gratuitous layer that can incur a massive performance penalty for large documents (a 3D model of a refinery, say). In such circumstances I would much rather build the domain-specific structure straight from the event stream. (In fact, I have serious reservations about using XML at all for 3D model transmission and storage -- the markup tends to grossly outweigh the content, which consists primarily of numbers. Compression during transport _and_ storage would be a must). Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Mon Mar 1 03:14:12 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:09:32 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <000a01be6391$ddcd2750$14f96d8c@NT.JELLIFFE.COM.AU> From: Baden Hughes >I know that XML 1.0 allows you to use 'special' characters as included in >the Unicode 2.0 specification. With the upcoming release of Unicode 3.0 how >will we be able to refer to characters in 3.0 which were not in 2.0 ? The >same way (meaning the actual version of Unicode spec is irrelevant as long >as the method used is included in XML) or some new way ? > >For instance, the Sinhala character set was not in Unicode 2.0 but will be >in 3.0. 
How do I get one of those characters in an XML document ? Or is that >inconsequential to the document per se as it is simply a reference and its >really up to the application to render it correctly ? The document character set of XML is ISO 10646, as used by the Unicode Consortium's character set Unicode. I think most people's strong expectation is that XML will track ISO 10646, just as Unicode tracks it. In fact, I think it is essential that XML automatically tracks ISO 10646: people will always try to do strange and interesting things with characters and codes, and XML should try to allow as much freedom for them to do this as possible. Developers should be very wary of putting type-checking into their systems which will cause future legitimate ISO 10646 characters to fail. For example, when a new character is invented, like the Euro, the only difficulty it should cause is if the font is not upgraded or if the sort/type system doesn't allow new character registration. We certainly need to abandon the expectation that the number of characters is fixed or knowable, which is how some might interpret material from the Unicode Consortium: a character set standard tries to put in what is generally useful against some criteria--if your criteria do not match, then you may quite legitimately decide that your character is not found in the set: is Apple's "apple" character a real character? are variant kanji characters real characters? are roman, fraktur, italic and uncial "a" characters different? Is English "W" a different character (i.e., "UU") from German "W" (i.e. "VV"), when using historical material? In my book I use a dinosaur glyph as a word and would have liked to put it in the index too: why is it not a character? Such questions can never be resolved, but a character set must make a decision based on some selection criteria; and those criteria will not be appropriate in every situation.
The nice thing about markup is that it lets us simulate the existence of a character missing from a character set; however, we have no markup conventions yet to do this systematically. There are no standard methods for saying "when you find 'a' in this context, collate it differently", for example (apart from, perhaps, language-tagged elements). Rick Jelliffe From tbray at textuality.com Mon Mar 1 05:22:48 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:33 2004 Subject: Comments on WD-html-in-xml-19990224 Message-ID: <3.0.32.19990226132215.00ba4b00@pop.intergate.bc.ca> At 03:07 PM 2/26/99 -0500, John Cowan wrote: >1) I believe that the introduction of a media type "text/xhtml" is >a mistake. I can see this point of view. >Instead, it would be better to attach a media-type >attribute specifying the formal public identifier of the DTD. ?!? find me somewhere in a W3C or IETF document where the FPI has any standing. Standards-anality aside, this is a real problem, because there is *no interoperable resolution mechanism*. Surely you can't be serious. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomh at thinlink.com Mon Mar 1 05:41:36 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:09:33 2004 Subject: Streaming XML and SAX References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <19990301114841.B4466@io.mds.rmit.edu.au> Message-ID: <36DA2858.43F3EA7A@thinlink.com> Marcelo Cantos wrote: > It has already been pointed out in this discussion that some > environments try to increase the throughput by dispatching documents > off to different threads. A system with 50 CPU's is going to be > operating as low as 2% capacity if it is forced to pipe the entire > parsing load through a single thread. I don't see how you can argue > that this is efficient. Even if you believe that parsing to convert markup into memory structures is slower than back-end processing, if parsing is faster than the stream itself there is no difference in the two approaches. Anyway, in the general case the question is moot because there may be inter-document dependencies, so you have to look inside the document before trying to parallelize. The whole point of this discussion was whether the document terminator ought to be XML or non-XML. Aside from the fact that I haven't yet seen a workable suggestion for a non-XML terminator, it isn't necessary to completely examine a document or convert it to a tree just to find an XML terminator. 
As Nathan pointed out, you could write a semi-parser to find terminators and then actually parse documents in parallel, but you'd need to suggest a way of dealing with inter-document dependencies. From martind at netfolder.com Mon Mar 1 15:35:46 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:09:33 2004 Subject: Streaming XML and SAX In-Reply-To: <004901be6348$9c8af4a0$c9a8a8c0@thing2> Message-ID: Hi Nathan, It seems like something is backwards here! If an application is processing a series of documents, once it has a universal type name for that document (root element name + namespace), it knows how it wants to process the document and doesn't need a Pi. (What's a Gi? Is that XML?) Yes, obviously a document (or name it the way you want - I don't want to argue about streams vs documents :-) may not have any PI and may not have any namespace reference. Thus, only the GI is then used for the pattern match in this case. Sorry, I forgot to spell out the complete resolution mechanism, which is based on pattern matching. The router uses this pattern match to dispatch to the right interpreter. The elements matched are: a) PI b) namespace definition c) root GI. Any of these elements could be used as a pattern match. Yes, a GI (generic identifier) is part of SGML and therefore part of XML. It is simply the element type name. In your example it could be something like "vendor-id". So, because the interpreter is based on a pattern match mechanism, everything that could be used for a pattern match can work. Actually, we use the three elements mentioned above. 
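The pattern-match dispatch described above can be sketched roughly as follows, assuming a hypothetical routing table keyed on PI target, namespace URI, or root GI. The table entries and interpreter names are invented for illustration and are not taken from the actual router. A single SAX parser is used throughout; the router is simply the first handler, and it records which interpreter the first matching event selects.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

# Hypothetical routing table: (kind, value) -> interpreter name.
ROUTES = {
    ("pi", "invoice-app"): "invoice-interpreter",
    ("ns", "http://example.com/orders"): "order-interpreter",
    ("gi", "vendor-id"): "vendor-interpreter",
}

class Router(ContentHandler):
    """Picks an interpreter from the first PI, namespace, or root GI that matches."""

    def __init__(self):
        super().__init__()
        self.choice = None
        self.seen_root = False

    def processingInstruction(self, target, data):
        # Only PIs that appear before the root element take part in routing.
        if self.choice is None and not self.seen_root:
            self.choice = ROUTES.get(("pi", target))

    def startElementNS(self, name, qname, attrs):
        if self.seen_root:
            return  # only the root element is inspected
        self.seen_root = True
        uri, local = name
        if self.choice is None:
            self.choice = ROUTES.get(("ns", uri)) or ROUTES.get(("gi", local))

def route(doc: bytes):
    """Run the same parser for every document; return the selected interpreter name."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    router = Router()
    parser.setContentHandler(router)
    parser.parse(io.BytesIO(doc))
    return router.choice

print(route(b"<vendor-id>42</vendor-id>"))
print(route(b"<?invoice-app v1?><doc/>"))
print(route(b'<o xmlns="http://example.com/orders"/>'))
```

This sketch covers only the selection step. A fuller version would swap the chosen interpreter in as the content handler once the match is made and restore the router at the end of the document.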
Also, you should be able to use the same parser for all document types and then do the routing on the parse events, saving you from having to do a "pre-parse" to determine the universal type name. Glad to see we both agree on the same mechanism. This is exactly what we do. The router mechanism is just a temporary interpreter included in the parser to load/unload the interpreters. To be precise, the mechanism is: a) run the router as a special kind of interpreter; b) parse the document (always); c) determine which interpreter to load, then load it and let it run; d) the interpreter runs until the end of the document; e) at the end of the document, the router/interpreter is loaded and run again until a new interpreter is recognized; f) go to a). The parser is always the same; only the interpreters are loaded/run, and the router is just a special kind of interpreter. Do you have a more efficient mechanism to suggest? Regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com From boblyons at unidex.com Mon Mar 1 16:46:26 1999 From: boblyons at unidex.com (Robert C. Lyons) Date: Mon Jun 7 17:09:33 2004 Subject: Need help getting IE 5.0 Message-ID: <01BE63D8.2D163B30@cc398234-a.etntwn1.nj.home.com> IE 5.0 is no longer available on the Microsoft web site. It will be available on March 18, but I can't wait that long. I downloaded a copy of ie5setup.exe from www.download.com. When I ran ie5setup.exe, I got the following error message: "Setup was unable to download information about installation sites." 
(Note that the ie5setup.exe program is small, and it needs to pull many IE 5.0 components from the Microsoft web site.) Any ideas on how I can install IE 5.0 on my computer (before March 18)? Thanks. Bob ------ Bob Lyons EC Consultant Unidex Inc. 1-732-975-9877 Fax: 1-732-975-9866 boblyons(at)unidex.com From Livinsb at rbos.co.uk Mon Mar 1 16:59:02 1999 From: Livinsb at rbos.co.uk (Livingstone, Stephen) Date: Mon Jun 7 17:09:33 2004 Subject: Need help getting IE 5.0 Message-ID: <217258E84FF7CF11B4630001FA44B2D502CF055A@REFROWTECX1> I have 24MB of IE 5.0 files here as separate CAB files. I could mail them to you tomorrow if you want. steven Steven Livingstone BSc MSc GradInstP Corporate Systems Development (TCN) Royal Bank Of Scotland. mailto:livinsb@rbos.co.uk +44 0131 523 4354 [x24354] Networking Technical Associates, Glasgow, Scotland. mailto:ntw_uk@hotmail.com +44 07771-957-280 > -----Original Message----- > From: Robert C. Lyons [SMTP:boblyons@unidex.com] > Sent: Monday, March 01, 1999 4:40 PM > To: xml-dev@ic.ac.uk > Subject: Need help getting IE 5.0 > > > *** Warning : this message originates from the Internet **** > > IE 5.0 is no longer available on the Microsoft web site. > It will be available on March 18, but I can't wait that long. > > I downloaded a copy of ie5setup.exe from www.download.com. > When I ran ie5setup.exe, I got the following error message: > "Setup was unable to download information about installation sites." > > (Note that the ie5setup.exe program is small, and it needs to pull > many IE 5.0 components from the Microsoft web site.) 
> > Any ideas on how I can install IE 5.0 on my computer (before March > 18)? > > Thanks. > > Bob > > ------ > Bob Lyons > EC Consultant > Unidex Inc. > 1-732-975-9877 > Fax: 1-732-975-9866 > boblyons(at)unidex.com > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer.. 'Internet e-mails are not necessarily secure. The Royal Bank of Scotland plc does not accept responsibility for changes made to this message after it was sent.' xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Mar 1 17:59:53 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? References: <000301be6361$272d2480$5ffa6ccb@baden> Message-ID: <36DAD563.5222F16A@locke.ccil.org> Baden Hughes wrote: > For instance, the Sinhala character set was not in Unicode 2.0 but will be > in 3.0. How do I get one of those characters in an XML document ? 
Or is that > inconsequential to the document per se as it is simply a reference and its > really up to the application to render it correctly ? There is a discrepancy between the prose, which says "legal Unicode/10646 characters" and references old versions of these standards, and the BNF, which says the Char production handles everything except known control characters (and even some of those). Don't worry. The problem will be resolved. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Mar 1 18:24:13 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <3.0.32.19990301102354.00b09cf0@pop.intergate.bc.ca> At 12:58 PM 3/1/99 -0500, John Cowan wrote: >> For instance, the Sinhala character set was not in Unicode 2.0 but will be >> in 3.0. How do I get one of those characters in an XML document ? > >There is a discrepancy between the prose, which says "legal Unicode/10646 >characters" and references old versions of these standards, and >the BNF, which says the Char production handles everything except >known control characters (and even some of those). John's right. 
And it's not the Sinhala that first brought it home, but the Euro character, which is clearly OK per production [2] but isn't a "legal yadda yadda yadda" per the particular amendment of 10646/Unicode that the XML spec references. The W3C has some I18n heavies trying to figure out what to do - life is made more complicated by the fact that the Unicode people and the IETF i18n people don't always point in the same direction, sigh; did you know the BOM was legal in UTF-8? And of course by the fact that Unicode/10646 is a moving target. But the bottom line is (see the public errata to the XML spec) that production [2] is normative; both in theory and in practice, XML processors pass through everything in that range. In practice, I've never actually seen anything outside of the BMP, but the experts agree they're showing up real soon now. How to get it in? Something like &#x10333; I expect. As a programmer, it'll show up either as two UTF-16 surrogates or a 4+-byte UTF-8 string, neither of which will look in the slightest like hex 10333. -Tim From cowan at locke.ccil.org Mon Mar 1 18:36:42 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:33 2004 Subject: Comments on WD-html-in-xml-19990224 References: <3.0.32.19990226132215.00ba4b00@pop.intergate.bc.ca> Message-ID: <36DADDF2.298060AF@locke.ccil.org> Tim Bray wrote: > ?!? find me somewhere in a W3C or IETF document where the FPI has > any standing. Standards-anality aside, this is a real problem, > because there is *no interoperable resolution mechanism*. Surely > you can't be serious. Sure I'm serious. 
The XHTML document (clause 3.1) gives three standard FPIs for XHTML Strict, XHTML Transitional, and XHTML Frameset, and *requires* that every strictly conforming XHTML document have a DOCTYPE that refers to one of them. The associated URL (systemid) is allowed to vary, but not the FPI. This is modeled on HTML 4.0, of course; clause 7.2 of that standard mandates the appearance of one of three FPIs as well. Similarly, HTML 3.2 (third clause) documents mandate the appearance of a single FPI, and HTML 2.0 (RFC 1866, clause 3.3) mandates the appearance of one of five FPIs. Resolution is irrelevant; it's the FPI itself that says what kind of (X)HTML you have. Table of FPIs: -//W3C//DTD XHTML 1.0 Strict//EN -//W3C//DTD XHTML 1.0 Transitional//EN -//W3C//DTD XHTML 1.0 Frameset//EN -//W3C//DTD HTML 4.0//EN -//W3C//DTD HTML 4.0 Transitional//EN -//W3C//DTD HTML 4.0 Frameset//EN -//W3C//DTD HTML 3.2 Final//EN -//IETF//DTD HTML 2.0//EN -//IETF//DTD HTML 2.0 Level 2//EN -//IETF//DTD HTML 2.0 Level 1//EN -//IETF//DTD HTML 2.0 Strict//EN -//IETF//DTD HTML 2.0 Strict Level 1//EN -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Mar 1 19:10:38 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? 
References: <3.0.32.19990301102354.00b09cf0@pop.intergate.bc.ca> Message-ID: <36DAE5FA.5BA2D70E@locke.ccil.org> Timothaeus Bray scripsit: > [D]id you know the BOM was legal in UTF-8? The BOM isn't just a BOM, it's also the ZWNBSP (zero-width non-breaking space; no, I do not know how to pronounce that acronym) character, and is interpreted as a BOM only at the beginning of UCS-2 or UTF-16 documents. Not to worry; the character is as near to a no-op as Unicode allows for. > And of course by the fact that Unicode/10646 is a moving target. Only sort of. 8859-1 is theoretically a moving target too, except that all the slots are full; CP 1252 is a moving target that has just moved (by adding the euro at 0x80). In all these cases, characters can be added (in principle) but not moved or deleted (any more). > In practice, > I've never actually seen anything outside of the BMP, but the > experts agree they're showing up real soon now. Not until Unicode 4.0, unless someone wants to use the private-use planes 15 and 16. > How to get it in? Something like &#x10333; I expect. Exactly so. Or the decimal NCR equivalent. Two NCRs representing the surrogates separately would be erroneous by both Unicode/10646 definitions and XML definitions. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) 
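A minimal sketch of the point about NCRs and surrogates, assuming Python's standard `xml.etree.ElementTree` (backed by expat) as the XML processor; the helper name `parses` is illustrative. A single NCR for U+10333 parses to one character, which then shows up as a surrogate pair in UTF-16 and as four bytes in UTF-8, while NCRs naming the two surrogates separately are rejected.

```python
import xml.etree.ElementTree as ET

# One NCR for the whole code point U+10333.
text = ET.fromstring("<c>&#x10333;</c>").text
print(len(text), hex(ord(text)))        # 1 0x10333
print(text.encode("utf-16-be").hex())   # d800df33 (surrogate pair D800 DF33)
print(text.encode("utf-8").hex())       # f0908cb3 (four bytes)

# The decimal NCR names the same character.
print(ET.fromstring("<c>&#66355;</c>").text == text)  # True

def parses(doc: str) -> bool:
    """Return True if the processor accepts the document as well-formed."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

# NCRs for the two surrogates separately are not accepted.
print(parses("<c>&#xD800;&#xDF33;</c>"))  # False
```

Neither encoded form resembles the digits 10333, which is why the code point is easiest to recognise at the NCR level.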
From tbray at textuality.com Mon Mar 1 19:26:18 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <3.0.32.19990301112529.00c0a5e0@pop.intergate.bc.ca> At 02:09 PM 3/1/99 -0500, John Cowan wrote: >Timothaeus Bray scripsit: > >> [D]id you know the BOM was legal in UTF-8? > >The BOM isn't just a BOM, it's also the ZWNBSP (zero-width >non-breaking space; no, I do not know how to pronounce that >acronym) character, and is interpreted as a BOM only at the >beginning of UCS-2 or UTF-16 documents. Not to worry; the character is >as near to a no-op as Unicode allows for. I think there is reason for worry. In a UTF-16 document, you can have a BOM and then the <?xml ...?>, and that PI will still be recognized as the XML declaration. The spec is, I think, pretty clear that a ZWNBSP or any other *data* character before the XML declaration is verboten. So... it seems that in UTF8, a ZWNBSP as first character in the file isn't a data character. Blecch. -Tim From cowan at locke.ccil.org Mon Mar 1 19:43:07 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? 
References: <3.0.32.19990301112529.00c0a5e0@pop.intergate.bc.ca> Message-ID: <36DAED75.86978455@locke.ccil.org> Tim Bray wrote: > So... it seems that in UTF8, > a ZWNBSP as first character in the file isn't a data character. Can you quote chapter and verse for this, either Unicode or 10646? The latter spec tells you that the sequence EF BB BF may be used as a *signature* at the beginning of UTF-8 data (since it is unlikely to occur in any other kind), but does not IMHO imply that the sequence is removable or doesn't represent a real ZWNBSP. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Mar 1 19:57:25 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:33 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <3.0.32.19990301115652.00c2d770@pop.intergate.bc.ca> At 02:41 PM 3/1/99 -0500, John Cowan wrote: >Tim Bray wrote: > >> So... it seems that in UTF8, >> a ZWNBSP as first character in the file isn't a data character. > >Can you quote chapter and verse for this, either Unicode or 10646? That is *exactly* the question that's now being pursued, and is I gather is in play right now in the IETF (or was that Unicode, I forget which). -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From russell at latticesemi.com Mon Mar 1 20:06:20 1999 From: russell at latticesemi.com (Jerry Russell) Date: Mon Jun 7 17:09:33 2004 Subject: Announce: XML directory/search engine Message-ID: There is a new site devoted to sites and documents created in XML. You can now begin submitting your sites. The new site is at: worldwideweave.com -------------------------------------------- Jerry Russell Product Engineer Lattice Semiconductor 408-428-6400 x. 274 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From daniela at cnet.com Mon Mar 1 20:09:54 1999 From: daniela at cnet.com (Daniel Austin) Date: Mon Jun 7 17:09:33 2004 Subject: Content-Document-Type: was (Re: MIME types vs. DOCTYPE) Message-ID: <77A952A6B467D211855D00805F9521F11492E9@cnet10.cnet.com> Greetings, Here I am speaking for myself, not the HTML Working Group or CNET: > -----Original Message----- > From: Walter Underwood [mailto:wunder@infoseek.com] > Sent: Friday, February 26, 1999 9:42 AM > To: xml-dev@ic.ac.uk > Cc: www-html-editor@w3.org > Subject: Re: Content-Document-Type: was (Re: MIME types vs. DOCTYPE) > The objection about thin clients or palmtops not wanting to download > large files doesn't really hold water. XML will generally be the > smallest files. 
Mine are almost always smaller than the corresponding > HTML. Powerpoint, PDF, JPEG -- those are big files. This is simply incorrect. The limited capabilities of thin clients and the expense of transmission of the information require capabilities-based analysis and profiling of documents on a per-client basis. As an example, consider a web-enabled cellphone such as this one: http://www.attws.com/business/pocketnet/index.html. The transmission costs to this device vary greatly worldwide, from ~$1/minute in the US to ~$22/min in Nairobi (actually you can only get basic cell phone service via satellite in Nairobi, but let's pretend.) If I send a 1/2 megabyte XHTML file to this device, including its 100K CSS stylesheet, the user is entirely justified in bringing legal action against me. The page would cost many tens or hundreds of dollars to send, and of course could not be displayed. In fact the client phone would necessarily display an HTTP error message (or its equivalent) on the tiny screen. Not to mention the costs of transmitting the inevitable ~12k banner ad, which again cannot be displayed. (Information may want to be free but information providers want to get paid.) At this point in time, no method other than MIME types exists for informing the client of the type of content arriving, without first downloading the entire file and then checking it, an obvious absurdity. Doctypes, FPIs, etc. have all been suggested, but none of these solutions provides the necessary level of transaction control required to identify the content prior to content reception. Given the massive costs involved, the client must always be allowed to reject content prior to downloading the entire file. > Adding an XML-specific HTTP header line makes HTTP 1.1 more complex > (shudder), and imposes an extra coding and testing burden on HTTP > implementations. Also, it does nothing for XHTML over other > transports, > like SMTP or FTP. 
It is also introducing a new set of dependencies for all XML documents. Not feasible. > Essentially, this is document information, not protocol information. > It belongs in the document. To describe the document out-of-line, > use RDF, not HTTP headers. Thin clients will almost necessarily reject all RDF documents (and most XML documents in general). RDF is complex and experimental; I am unconvinced that a cell phone should have to deal with it. > Pragmatically, HTTP Content-type isn't even reliable. Somebody will > decide that Excel and XML are the same thing, and start serving > spreadsheets as text/xml. Cell phones have to deal with that world, > and adding things to the HTTP spec doesn't fix ignorant sysadmins. True; unfortunate; costly for the victims; possibly legally actionable. > XHTML Spec comment: the spec doesn't mention application/xml. > It should. > If application/xml is never appropriate for XHTML (say, the UTF-16 > encoding is forbidden), then say so. The XHTML spec is very clear on this, explicitly stating the MIME types that can be used. No other MIME types are *ever* appropriate. With MIME types being used for document type identification, sending a document with the wrong MIME type guarantees an error. > > XHTML Spec comment: Are the Strict, Transitional, and Frameset DTDs > subsets or extensions? Or neither? Is one a subset of another? These > intentions should be spelled out in the spec so that future versions > won't break them. > The 3 XHTML DTDs are neither subsets nor extensions in a literal sense. They correspond as closely as possible to the HTML 4.0 DTDs of the same names. While to some extent the 'strict' DTD is a subset of the other two, it also uses different content models for elements with the same name. One could not, for practical purposes, use it as an external subset and include the frameset DTD as an internal DTD subset without conflict between their content models.
I will not attempt to justify the division of HTML into these 3 groupings - this was decided by the HTML 4.0 committee and is loosely justified by the HTML 4.0 specification. Current attempts are designed to follow this existing prior art to the greatest extent possible. Regards, D- From mda at discerning.com Mon Mar 1 21:04:56 1999 From: mda at discerning.com (Mark D. Anderson) Date: Mon Jun 7 17:09:34 2004 Subject: xml style questions Message-ID: <072201be6426$a82897c0$0200a8c0@mdaxke.mediacity.com> before you scream, this isn't about style sheets, and it isn't about attributes vs. elements. rather, this is more how to structure your document/data. any words of wisdom regarding: 1) having an extra collection layer in the xml tree, like vs. > 2) having PCDATA vs. having a distinct "comment" or "description" element child: this is the description of this thing vs. this is the description of this thing -mda xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Mon Mar 1 21:26:47 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:09:34 2004 Subject: xml style questions References: <072201be6426$a82897c0$0200a8c0@mdaxke.mediacity.com> Message-ID: <36DB05E6.B4CD1C4D@allette.com.au> Mark D. Anderson wrote: > any words of wisdom regarding: > > 1) having an extra collection layer in the xml tree, like > > vs. > > Would you consider and to be siblings? If so, I wouldn't compartmentalise. Alternatively, if can appear after but these have different significance, I would compartmentalise. > 2) having PCDATA vs. having a distinct "comment" or "description" element child: > > this is the description of this thing > > > vs. > > this is the description of this thing > > If you are going to have a need to deal with in some way and it could get mixed up with other #PCDATA, I'd create an element. My instinct would be to mark it up as an element unless the overhead was excessive, but I think that sort of thing is driven by (a) immediate or foreseeable requirements, followed by (b) personal taste. -- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Mon Mar 1 22:20:52 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:34 2004 Subject: Yet another niggling XML syntax question References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com> Message-ID: <36DB1161.893184EE@eng.sun.com> roddey@us.ibm.com wrote: > > Does the following violate the 'partial markup in entity' rule of XML? > > > "> > > %Whole; I'll assume you intended to work with parameter entities; then as Richard pointed out this can be legal ... if the three syntax errors are corrected (" So I'm assuming > that this is ok, that the prohibition against partial markup refers to the > eventual use of the entity, not to the definition thereof? Right -- this would violate _validity_ constraints (but a nonvalidating parser should accept it just fine): "> %Part1;%Part2; ]> Another way to make an error out of your declarations is to make the PEs be external, not internal -- then they'd not match full grammatical productions. - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Mon Mar 1 22:32:12 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:34 2004 Subject: Streaming XML and SAX In-Reply-To: <36DA2858.43F3EA7A@thinlink.com>; from Tom Harding on Sun, Feb 28, 1999 at 09:40:40PM -0800 References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <19990301114841.B4466@io.mds.rmit.edu.au> <36DA2858.43F3EA7A@thinlink.com> Message-ID: <19990302093128.A19583@io.mds.rmit.edu.au> On Sun, Feb 28, 1999 at 09:40:40PM -0800, Tom Harding wrote: > Marcelo Cantos wrote: > > > It has already been pointed out in this discussion that some > > environments try to increase the throughput by dispatching > > documents off to different threads. A system with 50 CPU's is > > going to be operating as low as 2% capacity if it is forced to > > pipe the entire parsing load through a single thread. I don't see > > how you can argue that this is efficient. > > Even if you believe that parsing to convert markup into memory > structures is slower than back-end processing, if parsing is faster > than the stream itself there is no difference in the two approaches. That is an awfully big _if_ to enshrine in a standard (if that's where all this broo-ha-ha ultimately ends up). What if client and server are on the same machine? 
> Anyway, in the general case the question is moot because there may > be inter-document dependencies, so you have to look inside the > document before trying to parallelize. The question is far from moot since an enormous class of very interesting problems does not fall into this category. There are myriad applications for self-contained XML packets. Furthermore, inter-document dependencies are not a fundamental problem for parallelisation. Threads can talk to each other and block waiting for other threads to finish parsing, while allowing other threads to continue independent tasks. You are suggesting that because in some cases it isn't trivial to parallelise we should therefore never even allow the possibility of such a thing to occur. > The whole point of this discussion was whether the document > terminator ought to be XML or non-XML. Aside from the fact that I > haven't yet seen a workable suggestion for a non-XML terminator, I am frankly incredulous that there are no systems, protocols or standards available today that adequately address the need to stream multiple logical units of information. This is not a new problem. Let me suggest one off the top of my head: send a null terminated decimal length, followed by a document. This is sufficient to dispatch data to multiple threads and raise concurrency levels. Any further processing can be done inside the parsers. > it > isn't necessary to completely examine a document or convert it to a > tree just to find an XML terminator. You can do better than a well-formedness parser? What are you going to do, grep for ? > As Nathan pointed out, you > could write a semi-parser to find terminators and then actually > parse documents in parallel, but you'd need to suggest a way for > dealing with inter-document dependencies. You get the threads to talk. Inter-document dependencies are not and need not be a protocol issue.
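A minimal sketch of the framing suggested above (a NUL-terminated decimal length followed by the document), offered as an illustration only and not as anything proposed on the list in this form. The names `write_framed`, `read_framed` and `parse_all` are hypothetical, and the sketch assumes the length counts bytes, which the message does not specify. The reader needs no XML knowledge to split the stream; each payload is then handed to a separate worker for parsing.

```python
import io
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def write_framed(stream, document: bytes) -> None:
    """Write one document as: ASCII decimal byte length, NUL, payload."""
    stream.write(str(len(document)).encode("ascii") + b"\x00" + document)


def read_framed(stream):
    """Yield each framed payload from a binary stream until clean EOF."""
    while True:
        digits = bytearray()
        while True:
            byte = stream.read(1)
            if not byte:
                if digits:
                    raise ValueError("stream ended inside a length prefix")
                return  # clean end of stream between frames
            if byte == b"\x00":
                break
            if not byte.isdigit():
                raise ValueError("non-digit byte in length prefix")
            digits += byte
        if not digits:
            raise ValueError("empty length prefix")
        length = int(digits)
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("stream ended inside a document")
        yield payload


def parse_all(stream, workers: int = 4):
    """Split the stream without parsing it, then parse each document in a worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ET.fromstring, read_framed(stream)))
```

The splitter touches only the length prefix, so the transport never has to understand XML. In CPython the worker threads will not run pure-Python parsing on several CPUs at once; the thread pool here only illustrates the dispatch pattern, and a process pool or another runtime would be needed for real parallel speedup.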
At the end of the day, the problem of streaming documents is not a difficult one to solve at the protocol level (HTTP-NG will have it built in, AFAIK). Why do you want to complicate life by overloading the parser's job? Actually, my real question is, what on earth do you hope to gain? Or is this just a philosophical preference thing? Cheers, Marcelo -- http://www.simdb.com/~marcelo/ From MarkM at SapphireGroup.com Mon Mar 1 23:02:22 1999 From: MarkM at SapphireGroup.com (Mark Murphy) Date: Mon Jun 7 17:09:34 2004 Subject: Looking for XML Filtering Projects Message-ID: <000201be6437$8833fd40$4f9646d1@opal.sapphiregroup.com> At XTech '99, I am delivering a presentation on information filtering applied to XML -- given a source of new/changed XML-encoded data, determining which of a set of people are interested in that XML based on filter criteria. I want to make sure I mention any relevant work in this area, besides my own and other projects I'm already aware of (e.g., XTenit.com, XML-enabled search tools like sgrep). If you are working on information filtering applied to XML, and you would like your project mentioned at XTech '99, please send me an e-mail (MarkM@SapphireGroup.com) with relevant details, and I'll be sure to include you in my presentation! Mark L. Murphy The Sapphire Group, Inc. MarkM@SapphireGroup.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomh at thinlink.com Mon Mar 1 23:33:36 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:09:34 2004 Subject: Streaming XML and SAX References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <19990301114841.B4466@io.mds.rmit.edu.au> <36DA2858.43F3EA7A@thinlink.com> <19990302093128.A19583@io.mds.rmit.edu.au> Message-ID: <36DB2399.BC94B7E4@thinlink.com> Marcelo Cantos wrote: > Furthermore, inter-document dependencies are not a fundamental problem > for parallelisation. Threads can talk to each other and block waiting > for other threads to finish parsing, while allowing other threads to > continue independent tasks. You are suggesting that because in some > cases it isn't trivial to parallelise we should therefore never even > allow the possibility of such a thing to occur. I was not suggesting that. I merely said that in the general case, knowing how to parallelize requires looking at the data in the stream. I propose that this data, like everything else, be stored in XML and that before doing anything else, the endpoint ought to parse it. I'm sorry if I gave the impression that I think XP is the solution to everything. I merely think it would be useful for a lot of things. If you're judging it on the criteria of being able to accomplish something that was impossible before, I'm not surprised you're disappointed. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Mon Mar 1 23:39:37 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:34 2004 Subject: Looking for XML Filtering Projects In-Reply-To: <000201be6437$8833fd40$4f9646d1@opal.sapphiregroup.com>; from Mark Murphy on Mon, Mar 01, 1999 at 06:02:08PM -0500 References: <000201be6437$8833fd40$4f9646d1@opal.sapphiregroup.com> Message-ID: <19990302103859.B19583@io.mds.rmit.edu.au> On Mon, Mar 01, 1999 at 06:02:08PM -0500, Mark Murphy wrote: > At XTech '99, I am delivering a presentation on information filtering > applied to XML -- given a source of new/changed XML-encoded data, > determining which of a set of people are interested in that XML based on > filter criteria. > > I want to make sure I mention any relevant work in this area, besides my own > and other projects I'm already aware of (e.g., XTenit.com, XML-enabled > search tools like sgrep). Our database server (SIM) has a facility for querying a database at regular intervals. The results are masked with a last-modified filter, which is updated each time the query is issued. This means that users can run a session, build up queries (either by creating new ones, or merging prior result sets with boolean operators) and then save them. They can then have those saved queries executed regularly on any new or changed data and a notification sent to them in an appropriate manner (e.g. an emailed page of abstracts and accompanying links). The beauty of this approach is that it conflates the concept of filter and query.
Hence, users wishing to filter documents for items of interest have the full expressive querying power of the database with which to define their peculiar interests. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dalapeyre at mulberrytech.com Mon Mar 1 23:47:28 1999 From: dalapeyre at mulberrytech.com (Deborah Aleyne Lapeyre) Date: Mon Jun 7 17:09:34 2004 Subject: xml style questions In-Reply-To: <072201be6426$a82897c0$0200a8c0@mdaxke.mediacity.com> Message-ID: Mark Anderson wrote: >any words of wisdom regarding: >1) having an extra collection layer in the xml tree, like > >vs. >> If you have ANY reason to think you may need the collection layer, put it in. Reasons you might want it include things like: a) Reuse - s are frequently used together and you want electronic cut-and-paste and/or even a really stupid parsing algorithm to be able to find them all easily. The converse is the same, if you want to ignore all s, group them. b) You need some sort of behavior or formatting at the collection level. This could be as simple as wanting a new indent level in the generated toc. This is the most common reason in practice. c) For correct hierarchical layering, s just aren't as big and important as s so they don't belong at the same level. etc. Yes, much of this could also be done by asking if you are the first among your siblings, etc. But sometimes event-driven processing is easier or faster than tree walking, and a containing element gives you your event. >2) having PCDATA vs. 
having a distinct "comment" or "description" element >child: >this is the description of this thing >> >vs. >this is the description of this thing > As a style issue, I favor the explicit description. Makes programming life easier all around, costs next to nothing. Programs can easily find the two equivalent, but, in my experience, people don't. --Debbie ====================================================================== Deborah Aleyne Lapeyre mailto:dalapeyre@mulberrytech.com Mulberry Technologies, Inc. http://www.mulberrytech.com 17 West Jefferson Street Direct Phone: 301/315-9633 Suite 207 Phone: 301/315-9631 Rockville, MD 20850 Fax: 301/315-8285 ---------------------------------------------------------------------- Mulberry Technologies: A Consultancy Specializing in SGML and XML ====================================================================== From ralph at fsc.fujitsu.com Tue Mar 2 00:58:04 1999 From: ralph at fsc.fujitsu.com (Ralph Ferris) Date: Mon Jun 7 17:09:34 2004 Subject: HyBrick Support for XPointer Message-ID: <3.0.5.32.19990301165634.00a7f3a0@pophost.fsc.fujitsu.com> Previous announcements of HyBrick's support for XPointer have not detailed which features are supported. One reason of course is that the discussion of XPointer continues within the W3C WG. With the announcement of the most recent version of HyBrick resulting in a significant number of downloads, it looks like a good time to state which features are available.
Based on the March, 1998 XPointer draft, HyBrick users can test: - All absolute loc terms: root(), html(), id(), origin() - All relative loc terms: child(), ancestor(), descendant(), following() preceding(), fsibling(), psibling() - The attr() loc term Quick Intro: psibling Example Here's a quick introduction to using these features: - Go to the Samples\XLink-sample directory - Open the readme.xml file - Inside the first xlink element, under the first locator element: insert: - In the first p element start tag after Overview, add the attribute/value pair id="p6". - Go to the dtd directory and open the sample.dtd file. - Add after the > Call for Participation to one of the workshops of WET ICE > > IEEE 8th International Workshops on Enabling Technologies: > Infrastructure for Collaborative Enterprises. > > 16-18 June 1999 > Stanford University, California USA > > For more information: http://www.ida.liu.se/conferences/WETICE/ > ______________________________________________________ > > WET ICE Workshop on Integrating XML and Distributed Object Technologies > > For more information: http://www.cerc.wvu.edu/workshop2/xmlobjects.html > > Call for Papers and Workshop Description > > The Internet world is being transformed before our eyes as open standards > such as > XML are being rapidly adopted. The XML technologies are being seen as > harbinger of various new functionality in numerous domains ranging from > electronic commerce to electronic publishing to healthcare delivery to > manufacturing to > insurance. Various object-oriented technologies and standards such as Java, > CORBA and DCOM have also progressed rapidly in the past few years. At this > time, > the industry and academia are seriously looking at the intersection of these > technologies and what it means to the future of the object-web paradigm. 
> This > workshop aims to bring together participants who are seriously investigating > the combined use of these technologies to support practical application > needs > in a variety of domains. The goal of this workshop is to investigate how XML > and Distributed Object technologies such as Java, CORBA and DCOM can be > integrated leveraging the strengths each have to offer. > > Integrating XML and Distributed Object technologies > Advances in XML: DOM, SAX, XSL, Schemas, XLink as it relates to Objects > Advances in CORBA 3.0, Java, DCOM as it relates to XML > Tools and utilities that facilitate integration of XML and > object-technologies > Application of XML and Object technologies in E-commerce, Finance, > Healthcare, > Publishing, Insurance and Manufacturing and System Integration. The > purpose of these examples should be to show specific successful integration > approaches of XML and objects. > > Workshop Chairs: > > V. "Juggy" Jagannathan > Concurrent Engineering Research Center > West Virginia University > P.O. Box 6506 > Morgantown, WV, USA 26506-6506 > Email: juggy@cerc.wvu.edu > > Matthew Fuchs > Veo Systems, Inc. > Email: matt@veosystems.com > > ____________________________________________________________________________ > ______ > > About WET ICE > > WET ICE is an annual, international forum for state-of-the-art research in > enabling > technologies for collaboration. > > WET ICE '99 will consist of parallel, three-day workshops on different > topics related > to collaboration technology. Each workshop will include paper presentations > and working > group discussions, with additional joint keynote sessions and a final joint > session > to summarize each groups' findings. > > What sets WET ICE apart from larger conferences is that the workshops are > kept > small enough to promote fruitful discussions on the > latest technology developments, directions, problems, and requirements. 
Each > group > will produce a summary report which will appear in the post-proceedings to > be published > by IEEE Computer Society Press. > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murata at apsdc.ksp.fujixerox.co.jp Tue Mar 2 02:31:41 1999 From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto) Date: Mon Jun 7 17:09:34 2004 Subject: XML and special Characters : unicode v3.0 ? Message-ID: <199903020231.AA03678@murata.apsdc.ksp.fujixerox.co.jp> John Cowan writes: >Tim Bray writes: >> In practice, >> I've never actually seen anything outside of the BMP, but the >> experts agree they're showing up real soon now. > >Not until Unicode 4.0, unless someone wants to use the private-use >planes 15 and 16. It is my understanding that Unicode 3.0 will have many ideographic characters which are outside of the BMP. >John Cowan writes: >Tim Bray writes: >> So... it seems that in UTF8, >> a ZWNBSP as first character in the file isn't a data character. > >Can you quote chapter and verse for this, either Unicode or 10646? >The latter spec tells you that the sequence EF BB BF may be used as >a *signature* at the beginning of UTF-8 data (since it is unlikely >to occur in any other kind), but does not IMHO imply that the >sequence is removable or doesn't represent a real ZWNBSP. Attached is quoted from A2 of N1396 ISO/IEC 10646-1 Corrigendum no. 2 (First draft - revised to 30 April 1996), which was (is?) 
available at http://osiris.dkuug.dk/JTC1/SC2/WG2/docs/N1396.doc The para most relevant to your question is: >An application receiving data may either use these signatures to >identify the coded representation form, or may ignore them and treat >FEFF as the ZERO WIDTH NO-BREAK SPACE character. How do you interpret this "or"? One could argue that when EF BB BF is recognized as a signature, it is not treated as the ZWNS. Unfortunately, every description about the BOM (even for UCS-2 or UTF-16) is unclear and subject to different interpretations, as I see it. Cheers, Makoto Fuji Xerox Information Systems Tel: +81-44-812-7230 Fax: +81-44-812-7231 E-mail: murata@apsdc.ksp.fujixerox.co.jp --------------------------------------------------------- Annex F (informative) The use of "signatures" to identify UCS This annex describes a convention for the identification of features of the UCS, by the use of "signatures" within data streams of coded characters. The convention makes use of the character ZERO WIDTH NO-BREAK SPACE, and is applied by a certain class of applications. When this convention is used, a signature at the beginning of a stream of coded characters indicates that the characters following are encoded in the UCS-2 or UCS-4 coded representation, and indicates the ordering of the octets within the coded representation of each character (see 6.3). It is typical of the class of applications mentioned above, that some make use of the signatures when receiving data, while others do not. 
The signatures are therefore designed in a way that makes it easy to ignore them. In this convention, the ZERO WIDTH NO-BREAK SPACE character has the following significance when it is present at the beginning of a stream of coded characters: UCS-2 signature: FEFF UCS-4 signature: 0000 FEFF UTF-8 signature: EF BB BF UTF-16 signature: FEFF An application receiving data may either use these signatures to identify the coded representation form, or may ignore them and treat FEFF as the ZERO WIDTH NO-BREAK SPACE character. If an application which uses one of these signatures recognises its coded representation in reverse sequence (e.g. hexadecimal FFFE), the application can identify that the coded representations of the following characters use the opposite octet sequence to the sequence expected, and may take the necessary action to recognise the characters correctly. NOTE - The hexadecimal value FFFE does not correspond to any coded character within ISO/IEC 10646. Makoto Fuji Xerox Information Systems Tel: +81-44-812-7230 Fax: +81-44-812-7231 E-mail: murata@apsdc.ksp.fujixerox.co.jp From jborden at mediaone.net Tue Mar 2 03:29:42 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:09:35 2004 Subject: Content-Document-Type: was (Re: MIME types vs.
DOCTYPE) In-Reply-To: <77A952A6B467D211855D00805F9521F11492E9@cnet10.cnet.com> Message-ID: <001601be645c$24395ae0$d3228018@jabr.ne.mediaone.net> Daniel Austin wrote: > > At this point in time, no method other than MIME types exists for > informing the client of the type of content > arriving, without first downloading the entire file and then > checking it, an > obvious absurdity. Doctypes, FPIs, > etc. have all been suggested, but none of these solutions provides the > necessary level of transaction control required to identify the content > prior to content reception. Given the massive costs involved, the client > must always be allowed to reject content prior to downloading the entire > file. Please explain what: Content-type: text/xhtml can possibly do for you that: Content-type: text/xml; doctype="http://www.w3.org/xhtml.dtd" cannot do. (Note: the use of doctype = dtd is an example; the doctype can point to any URI. Just like the XML namespace URI, the doctype URI serves as a unique identifier and implies no particular meaning.) > > > > > Adding an XML-specific HTTP header line makes HTTP 1.1 more complex > > (shudder), and imposes an extra coding and testing burden on HTTP > > implementations. Also, it does nothing for XHTML over other > > transports, > > like SMTP or FTP. > > > It is also introducing a new set of dependencies for all XML > documents. Not feasible. Huh!? Both these statements are patently false. As per the RFC 822 and following specs, inclusion of a new header does not in any way alter the syntax of HTTP or SMTP. It is specifically allowed. Both SMTP and HTTP can deal with headers, FTP of course could care less about text/xhtml or any other MIME header so this is moot. The point is to create a generalizable mechanism for content negotiation depending on an XML namespace or DTD or Schema. XHTML, like HTML 1.0 - HTML 4.0, is a soon-to-be historical oddity.
I have nothing against HTML, just why create a hack to solve a particular problem for XHTML version 1.0 e.g. text/xhtml, when a generalizable solution can be created for any XML document type e.g. text/xml; doctype=".../XHTML10.dtd". This gives the best of both worlds. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Tue Mar 2 08:49:06 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:35 2004 Subject: xml style questions Message-ID: <01BE6490.F0D17680@grappa.ito.tu-darmstadt.de> Mark D. Anderson wrote: > any words of wisdom regarding: > > 1) having an extra collection layer in the xml tree, like > > vs. > > Another reason for a collection layer is human readability. This is especially important if the document is normally edited/read by humans, less so if it is designed only to be written/read by machine. -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stefan at objectfarm.org Tue Mar 2 11:45:24 1999 From: stefan at objectfarm.org (Stefan Kreutter) Date: Mon Jun 7 17:09:35 2004 Subject: XPointer question Message-ID: Hello there! 
given the following XML-snippet:

Bart Simpson
Homer Simpson

can I use the following XPointer to get the customer ID of Bart Simpson:

root().child(all, customer).child(1,name).string(1, "Bart Simpson").ancestor(1, customer).attr(id)

I guess this should work since the XPointer grammar allows an OtherTerm to
be placed after a StringTerm, but I'm not sure if I understood the spec
completely. Since string() might return portions of multiple nodes (see 3.7
of WD-xptr-19980202), applying ancestor() seems a little problematic.

BTW is there a typo in the XPtr-spec? In grammar rule [2] it says:

[2] OtherTerms ::= OtherTerm | OtherTerm . OtherTerm

shouldn't that be:

[2] OtherTerms ::= OtherTerm | OtherTerm . OtherTerms

this would allow XPointers of any length, not just one or two OtherTerms.

-Stefan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/enriched
Size: 1026 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990302/b5d13faa/attachment.bin

From msabin at cromwellmedia.co.uk Tue Mar 2 12:07:28 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:09:35 2004
Subject: Encoding detection again ...
Message-ID:

I've been browsing through the archives for an answer to this question,
but I haven't been able to find anything that seems to give a completely
unambiguous answer ...

Appendix F of the spec says that given a document starting with the
4 octet sequence,

00 3C 00 3F

I'm to infer BOM-less big-endian UTF-16, and given a document starting
with,

3C 00 3F 00

I'm to infer BOM-less little-endian UTF-16.

What I want to know is: why could these sequences not equally represent
(respectively) big-endian UCS-2 or little-endian UCS-2?
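The inference described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the function name `sniff_encoding` and the `enum sniffed` values are hypothetical, only the two BOM-less octet patterns quoted in this message are checked, and the result deliberately says "16-bit units" rather than UTF-16 or UCS-2, since the four octets alone cannot tell those apart.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type, for illustration only. */
enum sniffed {
    SNIFF_UNKNOWN,
    SNIFF_16BIT_BE,  /* 00 3C 00 3F: "<?" as big-endian 16-bit units */
    SNIFF_16BIT_LE   /* 3C 00 3F 00: "<?" as little-endian 16-bit units */
};

/* Look at the first four octets of a document and report which of the
   two BOM-less 16-bit patterns (if either) they match. */
enum sniffed sniff_encoding(const unsigned char *buf, size_t len)
{
    static const unsigned char pat_be[4] = { 0x00, 0x3C, 0x00, 0x3F };
    static const unsigned char pat_le[4] = { 0x3C, 0x00, 0x3F, 0x00 };

    assert(buf != NULL || len == 0);
    if (len < 4)
        return SNIFF_UNKNOWN;   /* not enough octets to decide */
    if (memcmp(buf, pat_be, 4) == 0)
        return SNIFF_16BIT_BE;
    if (memcmp(buf, pat_le, 4) == 0)
        return SNIFF_16BIT_LE;
    return SNIFF_UNKNOWN;
}
```

A caller would treat the result as a first guess for reading the encoding declaration, not as the final answer; whether the 16-bit units are UTF-16 or UCS-2 still has to come from somewhere else.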
In other words, surely these octet sequences are ambiguous, and hence the
encoding should be resolved definitively with either,

or,

or an appropriate MIME header, i.e.,

Content-type: text/xml; charset="utf-16"

or,

Content-type: text/xml; charset="ISO-10646-UCS-2"

Just so there's no confusion ... I'm assuming:

1. Unicode == UTF-16
2. UCS-2 != UTF-16 (because UCS-2 lacks UTF-16's
   support for characters outside the BMP).

--
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)181 410 2230               London, W6 0LJ
msabin@cromwellmedia.co.uk        England

From Michael.Kay at icl.com Tue Mar 2 13:41:05 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:09:35 2004
Subject: xml style questions
Message-ID: <93CB64052F94D211BC5D0010A80013310EB351@wwmessd3.bra01.icl.co.uk>

> > any words of wisdom regarding:
> > 1) having an extra collection layer in the xml tree,
> 2) having PCDATA vs. having a distinct "comment" or
> "description" element child:

Firstly, the extra markup can be used to impose extra validity
constraints, which means your application has to do less checking.

Secondly, the extra markup can make XSL stylesheets a lot easier to write.
(In fact, without it they can be impossible...)

So if you're auto-generating the XML and if space isn't at a premium I
would include the extra tags. If it's manually edited it's a different
story...

Mike Kay

From cowan at locke.ccil.org Tue Mar 2 14:39:40 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:35 2004
Subject: Yet another niggling XML syntax question
References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com> <36DB1161.893184EE@eng.sun.com>
Message-ID: <36DBF7DE.27FAFCD0@locke.ccil.org>

David Brownell wrote:

> Right -- this would violate _validity_ constraints (but a nonvalidating
> parser should accept it just fine):
>
> ">
>
> %Part1;%Part2;
> ]>
>

It's not 100% clear to me whether the reference to Part2 violates the WFC
"PEs in Internal Subset", which states (inter alia) that "parameter-entity
references can occur only where markup declarations can occur". After
"%Part1;" which resolves to "

Message-ID: <36DC012D.FAA63A78@locke.ccil.org>

Jonathan Borden wrote:

> Please explain what:
>
> Content-type: text/xhtml
>
> can possibly do for you that:
>
> Content-type: text/xml; doctype="http://www.w3.org/xhtml.dtd"
>
> cannot do. (Note: the use of doctype = dtd is an example, the doctype can
> point to any URI. Just like the XML namespace URI, the doctype URI serves as
> a unique identifier and implies no particular meaning.

I agree, except that I would prefer to see an FPI rather than (or in
addition to) a URI. That would be extensible to HTML as well as XHTML, and
therefore to the text/html media type as well as the text/xml media type.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute....
(Finnegans Wake 16.5)

From cowan at locke.ccil.org Tue Mar 2 15:40:27 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:35 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <199903020231.AA03678@murata.apsdc.ksp.fujixerox.co.jp>
Message-ID: <36DC062C.73214454@locke.ccil.org>

MURATA Makoto wrote:

> It is my understanding that Unicode 3.0 will have many ideographic
> characters which are outside of the BMP.

The Unicode Consortium has indicated on its mailing list that no non-BMP
characters will appear in Unicode 3.0. (Unless Vertical Extension A is
being put in Plane 2 after all?)

> >An application receiving data may either use these signatures to
> >identify the coded representation form, or may ignore them and treat
> >FEFF as the ZERO WIDTH NO-BREAK SPACE character.
> How do you interpret this "or"?

I interpret it as "inclusive or", "and/or", "vel".

> One could argue that when EF BB BF
> is recognized as a signature, it is not treated as the ZWNS.

I think that it may or may not be treated as the ZWNBSP. In any event, the
whole annex is informative, and describes "a convention [...] applied by a
certain class of applications". It is reasonable to suppose that XML is
not in that class of applications, at least so far as UTF-8 recognition is
concerned.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From ajd100 at NAmerica.mot.com Tue Mar 2 16:54:50 1999
From: ajd100 at NAmerica.mot.com (Dutra Juliana-AJD100)
Date: Mon Jun 7 17:09:35 2004
Subject: FW: Voice XML
Message-ID: <11EF19296147D211A7C100805F312AE7C027A0@s-il06ar.corp.mot.com>

fyi...

> Chiming in on voice standards: AT&T, Lucent Technologies and Motorola will
> announce today joint cooperation on a software language that allows users
> to access the Internet by voice. The companies are hoping that the
> language, called VXML, which stands for voice extensible markup language,
> will become a standard for voice commands to the Internet.
>
> http://www.msnbc.com/news/245787.asp
>
> Juliana Dutra - E-Business Strategies
> =====================================
> Motorola, Communications Enterprise, MMS
> Loc: IL06, Phone = 847-538-3101 Fax = 847-538-7791
> Intranet = http://mms.mot.com/ebusiness/
> =====================================

From kyu-hwang.yeon at bauer-partner.de Tue Mar 2 18:08:26 1999
From: kyu-hwang.yeon at bauer-partner.de (Kyu Hwang Yeon)
Date: Mon Jun 7 17:09:35 2004
Subject: I wonder ...
Message-ID: <99Mar2.210244gmt+0100.27779@gatekeeper.bauer-partner.de>

Hi

I am looking for a way to reuse *.dtd files.
For example, I have book.dtd and library.dtd. Then, I'd like to reuse
book.dtd inside library.dtd without rewriting whole library.dtd.
(Maybe it is too silly question for people who subscribe this new group)
I wonder it is possible? Otherwise, should certain conditions be satisfied
for that reuse?

Best regards,
Kyu Hwang

From nikita.ogievetsky at csfb.com Tue Mar 2 19:18:38 1999
From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita)
Date: Mon Jun 7 17:09:35 2004
Subject: XML behind XMLBars
Message-ID: <9C998CDFE027D211B61300A0C9CF9AB442470A@SNYC11309>

Hi everybody,

Let me present to the community XMLBars: XML driven menu bars.
I intended it to serve as a simple and visually perceivable example of
using XML to facilitate web design issues. It seems that it turned into a
nice web GUI tool. I would highly appreciate your judgment and critique.
Your contribution is very welcome.

Here is my sin:
Namespaces are used to point to a collection of document fragments (rather
than element definitions). Why not?
It is more convenient for me to say

than to use XPointer or Entity:

It is easier for people to read (not only parsers matter).
By changing the namespace ( URN ), all the references defined with its
alias will change automatically. It is also great for internationalization.
And, of course, I can define multiple namespaces of fragments.
The fact that a URN doesn't have to be a real URL makes the possibilities
even greater.

-----------------
XML behind XMLBars

Menu Markup Language, if I may :)

Menu bar rendering and formatting information is stored in XML and cached
in DOM by a parser.
Submenus are rendered only when parent menu is activated.
Action to be fired on a menu click event is also stored in XML.
Action can be a Link to a web page or a chunk of JavaScript code.
It can also be a Sub-Action. In this case child menu inherits parents
action. Action can be parameterized.

For example in the following fragment

xml-dev archive
http://www.lists.ic.ac.uk/hypermail/xml-dev//index.html
1999 99
January 01< /SUB>
February 02

two leaf submenus when clicked will point to:
http://www.lists.ic.ac.uk/hypermail/xml-dev/9901/index.html
and
http://www.lists.ic.ac.uk/hypermail/xml-dev/9902/index.html

Most of magazines and monthly publications have similar structure.
Reusable group of 12 submenu -months will help.
The 3 years of XML-DEV archive will be as short as:

xml-dev archive
http://www.lists.ic.ac.uk/hypermail/xml-dev//index.html
1999 99
1998 98
1997 97

The second optional attribute xql:select will filter first 4 months for
current year and months starting with February for the year 1997.

XMLBars implemented using IE5beta parser can be found at
http://www.cogx.com/XMLBar.
(Sorry, still working on cross-browser implementation).

Nikita Ogievetsky
Cogitech Inc.
http://www.cogx.com

From DuCharmR at moodys.com Tue Mar 2 19:39:08 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:09:35 2004
Subject: I wonder ...
Message-ID: <49092BAEAC84D2119B0600805FD40F9F120DBD@MDYNYCMSX1>

>I'd like to reuse book.dtd inside library.dtd without rewriting
>whole library.dtd.

(First, this is really a question for the xml-l list or comp.text.xml.
xml-dev is for people developing XML software.)

This is what external parameter entities are for. Parameter entities store
pieces of a DTD, and "external" means "stored in a separate file" (or the
equivalent construct in your operating system). For example, if book.dtd is
the following:

your library.dtd file could look like this:

%bookdtd;

Bob DuCharme www.snee.com/bob
see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from
Prentice Hall.

From jborden at mediaone.net Tue Mar 2 20:12:31 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:09:36 2004
Subject: Content-Document-Type: was (Re: MIME types vs. DOCTYPE)
Message-ID: <008201be64e8$0fc8ca00$0b2e249b@fileroom.Synapse>

John Cowan wrote:

>Jonathan Borden wrote:
>
>> Please explain what:
>>
>> Content-type: text/xhtml
>>
>> can possibly do for you that:
>>
>> Content-type: text/xml; doctype="http://www.w3.org/xhtml.dtd"
>>
>> cannot do. (Note: the use of doctype = dtd is an example, the doctype can
>> point to any URI. Just like the XML namespace URI, the doctype URI serves as
>> a unique identifier and implies no particular meaning.
>
>I agree, except that I would prefer to see an FPI rather than (or
>in addition to) a URI. That would be extensible to HTML as well as
>XHTML, and therefore to the text/html media type as well as the
>text/xml media type.
>

This is a good idea.
A general way to employ the Content-type header to specify a document
type is:

Content-type: text/xml; element="html"; fpi="-//W3C//DTD XHTML 1.0 Strict//EN"; uri="http://www.w3.org/XHTML.DTD"

This should apply to text/html, text/xml, text/sgml, application/xml etc.

deja vu all over again :-)

Jonathan Borden
http://jabr.ne.mediaone.net

From richard at goon.stg.brown.edu Tue Mar 2 20:22:01 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:09:36 2004
Subject: Yet another niggling XML syntax question
References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com> <36DB1161.893184EE@eng.sun.com> <36DBF7DE.27FAFCD0@locke.ccil.org>
Message-ID: <36DC4828.C133D8F2@goon.stg.brown.edu>

John Cowan wrote:

> > Right -- this would violate _validity_ constraints (but a nonvalidating
> > parser should accept it just fine):
> >
> > ">
> >
> > %Part1;%Part2;
> > ]>
> >
>
> It's not 100% clear to me whether the reference to Part2 violates
> the WFC "PEs in Internal Subset"

To restate your message slightly: The problem with %Part2; is that the
markup unit starts with %Part1; and ends with %Part2;, which is something
parsed entities aren't supposed to do.

Note: The only reason you can get away with is that section 4.3.2 of the
XML 1.0 standard says that all internal parsed entities are by definition
well formed. It's apparently an exception to the "proper nesting" rule,
meant specifically to allow cutting and pasting of parameter entities.
This is also the motivation for suppressing the addition of spaces before
and after the entities inside the quotation marks above.

Would anyone agree that the standard is not altogether clear on this point?

(Tim, if my comments are correct, it might make sense to edit them, in
some form, into your annotated version of the spec.)

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From slotter at maya.com Tue Mar 2 20:40:55 1999
From: slotter at maya.com (Dave Slotter)
Date: Mon Jun 7 17:09:36 2004
Subject: Expat API
In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120DBD@MDYNYCMSX1>
Message-ID:

Hi. I'm new to this list (just subscribed today) and searched the archives
on expat, but it failed to answer my question.

My question is: where is the documentation on how to use the expat API? I
downloaded version 1.0.2 and ported the code to run the sample program on
my Macintosh, but I'm pretty much dead in the water. I tried sending email
to the author (James Clark) twice in the last few days, but I have so far
failed to receive a response. The comments in the header files do not seem
to be sufficient.

What I am trying to do is parse some well-formed XML such as the following
example so that I can get the tags (which the example shows me how to do)
and then obtain the text.
-----
cat
gray
-----

For example, I would like to be able to obtain the TAG as well as the FOO
ID (12345678), then the tag along with the enclosed text (cat) then the tag
along with its enclosed text (gray). However, the sample program only shows
how to retrieve the tags.

If anyone has some example code, I would be grateful. If someone has
documentation, that would be appreciated as well.

-Dave Slotter

From david at megginson.com Tue Mar 2 21:10:33 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:36 2004
Subject: I wonder ...
In-Reply-To: <99Mar2.210244gmt+0100.27779@gatekeeper.bauer-partner.de>
References: <99Mar2.210244gmt+0100.27779@gatekeeper.bauer-partner.de>
Message-ID: <14044.21265.752204.753493@localhost.localdomain>

Kyu Hwang Yeon writes:

> I am looking for a way to reuse *.dtd files. For example, I have book.dtd
> and library.dtd. Then, I'd like to reuse book.dtd inside library.dtd
> without rewriting whole library.dtd. (Maybe it is too silly question for
> people who subscribe this new group) I wonder it is possible? Otherwise,
> should certain conditions be satisfied for that reuse?

Try this:

%book;

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From db at Eng.Sun.COM Tue Mar 2 21:22:34 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:09:36 2004
Subject: Encoding detection again ...
References:
Message-ID: <36DC55FF.64C4408D@Eng.Sun.COM>

Miles Sabin wrote:
>
> Appendix F of the spec says that given a document
> starting with the 4 octet sequence,
>
> 00 3C 00 3F
>
> I'm to infer BOM-less big-endian UTF-16, and
> given a document starting with,
>
> 3C 00 3F 00
>
> I'm to infer BOM-less little-endian UTF-16.

That is, the appendix _suggests_ (in a non-normative fashion) that's the
way to go.

> What I want to know is: why could these
> sequences not equally represent (respectively)
> big-endian UCS-2 or little-endian UCS-2?

They could ...

>
> 1. Unicode == UTF-16
> 2. UCS-2 != UTF-16 (because UCS-2 lacks UTF-16's
> support for characters outside the BMP).

Put it this way: if you assume UTF-16, you're safe either way because
UTF-16 is a superset.

It'd be reasonable for an autodetecting algorithm to support "downgrading"
its guess from UTF-16 to UCS-2, and it should probably do so if it's
reporting encoding mismatches as fatal errors.

- Dave

From jes at kuantech.com Tue Mar 2 21:48:18 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:09:36 2004
Subject: I wonder ...
In-Reply-To: <14044.21265.752204.753493@localhost.localdomain>
Message-ID: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com>

This works fine, but (at least in IE 5) only for a single level. That is,
you can't have another entity reference inside "book.dtd". To me, this
significantly limits its usefulness (imagine not allowing a #include inside
a file that was #included).

Jeff

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of David Megginson
Sent: Tuesday, March 02, 1999 1:09 PM
To: XML Development
Subject: re: I wonder ...

Kyu Hwang Yeon writes:

> I am looking for a way to reuse *.dtd files. For example, I have book.dtd
> and library.dtd. Then, I'd like to reuse book.dtd inside library.dtd
> without rewriting whole library.dtd. (Maybe it is too silly question for
> people who subscribe this new group) I wonder it is possible? Otherwise,
> should certain conditions be satisfied for that reuse?

Try this:

%book;

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Tue Mar 2 21:51:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:36 2004
Subject: I wonder ...
In-Reply-To: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com>
References: <14044.21265.752204.753493@localhost.localdomain> <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com>
Message-ID: <14044.23685.951735.30695@localhost.localdomain>

Jeffrey E. Sussna writes:

> This works fine, but (at least in IE 5) only for a single
> level. That is, you can't have another entity reference inside
> "book.dtd". To me, this significantly limits its usefulness
> (imagine not allowing a #include inside a file that was #included).

If IE 5 behaves this way, it is because of a bug, not because of a
limitation in the XML spec -- since XML support in IE is in early days, I
expect that Microsoft will fix this problem before the official release.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From Clark.Cooper at corporate.ge.com Tue Mar 2 23:04:11 1999
From: Clark.Cooper at corporate.ge.com (Cooper, Clark (CORP, Consultant))
Date: Mon Jun 7 17:09:36 2004
Subject: Expat API
Message-ID: <014CB98EB81ED011B3E900805FE2D47A04F74B42@X01SCHCORPGE>

Dave Slotter wrote:

> My question is: where is the documentation on how to use the expat
> API? I downloaded version 1.0.2 and ported the code to run the sample
> program on my Macintosh, but I'm pretty much dead in the water

As far as I know the include file is the documentation.

Expat is used by the perl module XML::Parser, which I maintain, but if
you're having trouble with just the include file, you'd be absolutely lost
looking at Expat.xs (I get lost looking at it sometimes).

If you can use perl, I'd like to suggest XML::Parser as a kinder, gentler
interface to expat.
If you're not a perl kinda fella, here's a small example of using expat:

#include "xmlparse.h"
#include <stdio.h>
#include <stdlib.h>

#define MAXLEV 512
#define BUFSIZE 4096

/* Current indentation: two spaces per open element. */
char indent[(MAXLEV + 1) * 2];
int level = 0;

/* Start-tag handler: print the element name and its attributes, then
   indent one level deeper. atts is a NULL-terminated list of
   name/value pairs. */
void
start(void *data, const XML_Char *name, const XML_Char **atts)
{
  int offset;

  printf("\n%s> %s", indent, name);
  while (*atts) {
    printf(" %s='%s'", atts[0], atts[1]);
    atts += 2;
  }

  if (level >= MAXLEV) {
    fprintf(stderr, "Exceeded max level\n");
    exit(-1);
  }

  offset = level * 2;
  indent[offset] = ' ';
  indent[offset + 1] = ' ';
  indent[offset + 2] = '\0';
  level++;
}  /* End start handler */

/* End-tag handler: drop back one indentation level. */
void
end(void *data, const XML_Char *name)
{
  level--;
  indent[level*2] = '\0';
  printf("\n%s< %s\n", indent, name);
}  /* End end handler */

/* Character data handler: txt is NOT NUL-terminated, so print exactly
   len characters. */
void
text(void *data, const XML_Char *txt, int len)
{
  int i;

  printf("%s- ", indent);
  for (i = 0; i < len; i++)
    putchar(txt[i]);
}  /* End text handler */

int
main(int argc, char **argv)
{
  XML_Parser prs;
  int stat;
  FILE *doc;

  if (argc < 2) {
    fprintf(stderr, "No filename supplied\n");
    exit(-1);
  }

  doc = fopen(argv[1], "r");
  if (! doc) {
    fprintf(stderr, "Couldn't open %s\n", argv[1]);
    exit(-1);
  }

  indent[0] = '\0';
  prs = XML_ParserCreate(NULL);
  XML_SetElementHandler(prs, start, end);
  XML_SetCharacterDataHandler(prs, text);

  /* Feed the document to the parser one buffer at a time. */
  while (! feof(doc)) {
    int cnt;
    void *buff = XML_GetBuffer(prs, BUFSIZE);

    if (! buff) {
      fprintf(stderr, "Ran out of memory\n");
      exit(-1);
    }

    cnt = fread(buff, 1, BUFSIZE, doc);
    stat = XML_ParseBuffer(prs, cnt, 0);
    if (! stat) {
      fprintf(stderr, "Parse error at line %d, column %d\n",
              XML_GetCurrentLineNumber(prs),
              XML_GetCurrentColumnNumber(prs));
      exit(-1);
    }
  }

  fclose(doc);

  /* A final zero-length call with isFinal set tells expat the document
     is complete. */
  stat = XML_ParseBuffer(prs, 0, 1);
  if (! stat) {
    fprintf(stderr, "Parse error at line %d, column %d\n",
            XML_GetCurrentLineNumber(prs),
            XML_GetCurrentColumnNumber(prs));
    exit(-1);
  }

  XML_ParserFree(prs);
  return 0;
}  /* End main */

--
Clark Cooper               Logic Technologies,Inc
cccooper@ltionline.com     (518) 388-7451
650 Franklin St., Suite 304
coopercc@netheaven.com     Schenectady, NY 12305

From bmhughes at ozemail.com.au Tue Mar 2 23:29:42 1999
From: bmhughes at ozemail.com.au (Baden Hughes)
Date: Mon Jun 7 17:09:36 2004
Subject: XML and special Characters : unicode v3.0 ?
In-Reply-To: <36DAE5FA.5BA2D70E@locke.ccil.org>
Message-ID: <000d01be64fc$1a3a09e0$0dce6ccb@baden>

Tim Bray writes:
> > In practice,
> > I've never actually seen anything outside of the BMP, but the
> > experts agree they're showing up real soon now.

John Cowan writes:
> Not until Unicode 4.0, unless someone wants to use the private-use
> planes 15 and 16.

Uh, that's gonna be a problem. How would you put in a PUA character in an
XML doc ? Still by the U+... ? (we have around 800 of them for the
languages we work with !!)

Baden

From falk at icon.at Tue Mar 2 23:34:08 1999
From: falk at icon.at (Falk, Alexander)
Date: Mon Jun 7 17:09:36 2004
Subject: Please send non-English XML example documents
Message-ID:

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Falk, Alexander.vcf
Type: application/octet-stream
Size: 1062 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990302/dff30928/FalkAlexander.obj

From msabin at cromwellmedia.co.uk Wed Mar 3 12:12:30 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:09:36 2004
Subject: Encoding detection again ...
Message-ID:

David Brownell wrote,
> Put it this way: if you assume UTF-16, you're
> safe either way because UTF-16 is a superset.

Err ... is that true?

Maybe I'm being a bit obsessive about my interpretation of the various
standards docs, but as far as I can see UCS-2 isn't a subset of UTF-16.
The BMP S-zone codes (D800-DFFF) are undefined but reserved in UCS-2, and
so should not occur in a purportedly UCS-2 stream. I would expect a
processor which encountered such codes to either,

1. Spit out an error and give up.

or,

2. Quietly ignore them and continue processing with
   the next 2 octets.

Obviously these codes are defined and legal in UTF-16, so an incorrect
assumption of UTF-16 when the stream was in fact broken UCS-2 would
produce unpredictably incorrect behaviour (i.e. the processor might
continue processing a broken doc in an indeterminate way).
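The S-zone test described above is easy to state in code. The following is a minimal sketch, assuming a big-endian stream of 16-bit units; the function name `has_surrogate_be` is hypothetical. It only reports whether any unit falls in D800-DFFF, which is the point at which a reader that had been treating the stream as UCS-2 would have to either give up or reinterpret it as UTF-16.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the big-endian 16-bit stream in buf contains at least
   one code unit in the range D800-DFFF, and 0 otherwise. A trailing
   odd octet is ignored. Hypothetical helper, for illustration only. */
int has_surrogate_be(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2) {
        /* Assemble one 16-bit unit, high octet first. */
        unsigned unit = ((unsigned)buf[i] << 8) | buf[i + 1];
        if (unit >= 0xD800 && unit <= 0xDFFF)
            return 1;
    }
    return 0;
}
```

A stream for which this returns 0 reads the same under either interpretation; only a non-zero result forces the choice between the two behaviours listed above.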
In any case, on a less finickety note, I'd quite like to be able to
compute string lengths UCS-2 style where that's appropriate, because
2*byte-length is a bit simpler than the UTF-16 equivalent ;-)

Anyway, here's a slightly updated version of a proposal I mailed to Tim
Bray yesterday ...

In the absence of an appropriate MIME header the octet sequences,

1. FE FF
2. FF FE
3. 00 3C 00 3F
4. 3C 00 3F 00

may be inferred to be,

1. big-endian indeterminately encoded 2 octet characters.
2. little-endian indeterminately encoded 2 octet characters.
3. BOM-less big-endian indeterminately encoded 2 octet characters.
4. BOM-less little-endian indeterminately encoded 2 octet characters.

If either of the following PIs are found,

or, in cases (1) and (2), if *no* PI is found, then encoding is resolved
to UTF-16. Otherwise if,

is found then encoding is resolved to UCS-2.

This very complicated and isn't a zillion miles away from the current
handling of UTF-8 vs. ISO 8859-x vs. US-ASCII.

Cheers,

Miles

--
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)181 410 2230               London, W6 0LJ
msabin@cromwellmedia.co.uk        England

From msabin at cromwellmedia.co.uk Wed Mar 3 12:45:45 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:09:36 2004
Subject: Encoding detection again ...
Message-ID:

Sorry to follow up my own posting, but one thing needs a bit of
clarification, and one typo needs correction.

I wrote,
> David Brownell wrote,
> > Put it this way: if you assume UTF-16, you're
> > safe either way because UTF-16 is a superset.
>
> Err ... is that true?
>
> Maybe I'm being a bit obsessive about my
> interpretation of the various standards docs, but
> as far as I can see UCS-2 isn't a subset of
> UTF-16.

The question of UCS-2 being, or not being a subset of UTF-16 is a bit of a red herring. It is undoubtedly true that the set of octet pairs which are legal UCS-2 characters is a subset of the set of octet pairs which are legal UTF-16 characters.

Appendix F suggests that octet sequences which could equally well be interpreted as UTF-16 or UCS-2 may be assumed to be UTF-16, and *doesn't* include a clause stating that this assumption should be revised in the light of an explicit XML encoding declaration. I think that clause should be added, in much the same way as it is for UTF-8 vs. 8859-X.

Now the typo ...

> This very complicated and isn't a zillion miles away
> from the current handling of UTF-8 vs. ISO 8859-x
> vs. US-ASCII.

Please insert the word 'isn't' in the obvious place ;-)

Cheers,

Miles

--
Miles Sabin                          Cromwell Media
Internet Systems Architect           5/6 Glenthorne Mews
+44 (0)181 410 2230                  London, W6 0LJ
msabin@cromwellmedia.co.uk           England

From MikeDacon at aol.com Wed Mar 3 13:30:59 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:36 2004
Subject: SAX and DTDHandler
Message-ID: <9f7499ae.36dd3931@aol.com>

Hi Everyone,

I've been playing around with SAX and several of the parser implementations (primarily Sun's and IBM's). The basics of DocumentHandler and ErrorHandler are straightforward and work well. The interfaces EntityResolver and DTDHandler are still fuzzy.
I've searched for documents on these but have not found anything of any depth.

My primary question is will SAX allow me to parse a DTD? It doesn't seem so. DTDHandler only handles unparsed Entity declarations (like binary data) and Notation declarations. If it is the case that SAX does not parse DTDs due to the fact that it does not want to perform validation then why bother with the above two cases?

I guess I don't understand the design philosophy in these respects. All help is appreciated.

Thanks,

 - Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com

From elharo at metalab.unc.edu Wed Mar 3 13:33:19 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:09:36 2004
Subject: DTD for Bibliographic Notation
In-Reply-To:
Message-ID:

Has anybody written a DTD for bibliographies? Are there any standards efforts in this area? To be usable, this DTD would have to be public domain or explicitly allow unrestricted reuse. I probably don't need to modify it, but at a minimum I need to be able to republish it.
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML: Extensible Markup Language (IDG Books 1998)          |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://sunsite.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/     |
+----------------------------------+---------------------------------+

From mintert at irb.informatik.uni-dortmund.de Wed Mar 3 14:00:44 1999
From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert)
Date: Mon Jun 7 17:09:37 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
In-Reply-To: Your message of Sun, 28 Feb 1999 22:02:37 +0100.
             <01BE6366.0E5EF230.jarle.stabell@dokpro.uio.no>
Message-ID: <199903031400.PAA23631@brown.informatik.uni-dortmund.de>

---------
> There's a very nice document at:
>
> http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm
>
> Cheers,
> Jarle Stabell

Thanks, Jarle!

Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd (copied from the above URL) with nsgmls. I'm using nsgmls 1.3 on SunOS 5.6 (Solaris 2). I already parsed xml instances without problems but in this case it doesn't work.
Following are the first lines of nsgmls output:

sm@brown(/tmp/sm){590}: /tmp/sm/sp-1.3/nsgmls/nsgmls -E 10 -w xml -s REC-xml-19980210.xml
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:17:W: named character reference
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:19:E: "X2014" is not a function name
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:61:17:W: named character reference
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:61:19:E: "X201C" is not a function name
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:62:17:W: named character reference
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:62:19:E: "X201D" is not a function name
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:101:9:E: document type does not allow element "ABSTRACT" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:142:8:E: document type does not allow element "PUBSTMT" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:146:11:E: document type does not allow element "SOURCEDESC" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:149:10:E: document type does not allow element "LANGUSAGE" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:153:13:E: document type does not allow element "REVISIONDESC" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:189:36:W: character "<" is the first character of a delimiter but occurred as data
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:371:8:E: end tag for "HEADER" which is not finished
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:787:7:W: character "<" is the first character of a delimiter but occurred as data

I enabled XML support as described on http://www.jclark.com/sp/xml.htm

   Set the SP_CHARSET_FIXED environment variable to YES.
   Set the SP_ENCODING environment variable to XML.
   Set the SGML_CATALOG_FILES environment variable to point to the
   file pubtext/xml.soc.
   Use the -wxml option.

setenv SP_CHARSET_FIXED YES

What's wrong? Any help is appreciated.

Thanks in advance.

Bye,
Stefan.
+-----------------------------------------------------------+
Stefan Mintert
  UniDo:   mintert@irb.informatik.uni-dortmund.de
  private: stefan@mintert.com
+-----------------------------------------------------------+
"let the music keep our spirits high..." (Jackson Browne)

From cowan at locke.ccil.org Wed Mar 3 15:10:42 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:37 2004
Subject: I wonder ...
References: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com>
Message-ID: <36DD50B5.5904B0A6@locke.ccil.org>

Jeffrey E. Sussna wrote:

> This works fine, but (at least in IE 5) only for a single level. That
> is, you can't have another entity reference inside "book.dtd". To me,
> this significantly limits its usefulness (imagine not allowing a
> #include inside a file that was #included).

If so, that is a dreadful bug. The XML specification has no such limitations, although one might suppose that an implementation might have a practical limit in the neighborhood of 50-100, because of operating system limits on open files.

--
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
    You tollerday donsk? N. You tolkatiff scowegian? Nn.
    You spigotty anglease? Nnn. You phonio saxo? Nnnn.
        Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
From cowan at locke.ccil.org Wed Mar 3 15:17:20 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:37 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <000d01be64fc$1a3a09e0$0dce6ccb@baden>
Message-ID: <36DD523B.F2EAFB7E@locke.ccil.org>

Baden Hughes wrote:

> Uh, that's gonna be a problem. How would you put in a PUA character in an
> XML doc ? Still by the U+... ? (we have around 800 of them for the languages
> we work with !!)

Well, first of all there are 6400 private-use characters on the BMP, so that gives you plenty of room to play with. You cannot use any kind of private-use character in element or attribute names, which is good for interoperability; to incorporate them in character data or attribute values, use a numeric character reference.

What will be more serious is that *normative* characters from the Astral Planes aren't usable in XML names either. Presumably, when they actually show up, XML will be modified, so that we can have element names in Egyptian hieroglyphics with attributes in Sindarin.

--
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
    You tollerday donsk? N. You tolkatiff scowegian? Nn.
    You spigotty anglease? Nnn. You phonio saxo? Nnnn.
        Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
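The private-use arithmetic and the character-reference route mentioned in the message above can be sketched as follows. This is an illustration only, in Python: the helper name `pua_char_ref` and the element name `word` are made up for the example, and U+E000 is simply the first code point of the BMP private-use area (U+E000 to U+F8FF), not a value taken from the thread.

```python
import xml.etree.ElementTree as ET

# The BMP private-use area runs from U+E000 to U+F8FF inclusive.
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF
PUA_COUNT = PUA_LAST - PUA_FIRST + 1


def pua_char_ref(code_point: int) -> str:
    """Build a hexadecimal character reference for a BMP private-use code point."""
    if not PUA_FIRST <= code_point <= PUA_LAST:
        raise ValueError("not a BMP private-use code point")
    return "&#x{:X};".format(code_point)


# Private-use characters are fine in character data, so this parses.
doc = "<word>{}</word>".format(pua_char_ref(0xE000))
text = ET.fromstring(doc).text
```

Here `PUA_COUNT` works out to the 6400 figure quoted above, and `text` holds the single character U+E000 after parsing.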
From DuCharmR at moodys.com Wed Mar 3 15:18:33 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:09:37 2004
Subject: DTD for Bibliographic Notation
Message-ID: <49092BAEAC84D2119B0600805FD40F9F120DC3@MDYNYCMSX1>

> Elliotte Rusty Harold writes:
>Has anybody written a DTD for bibliographies?

Have you looked at the bibliography module of DocBook?

  DocBook home page: http://www.oasis-open.org/docbook
  XML version of DocBook: http://www.nwalsh.com/docbook/xml
  file with XML's bibliography module:
  http://www.nwalsh.com/docbook/xml/1.3/dbhierx.mod

Bob DuCharme www.snee.com/bob
see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall.

From prb at uic.edu Wed Mar 3 15:52:40 1999
From: prb at uic.edu (Paul R. Brown)
Date: Mon Jun 7 17:09:37 2004
Subject: DTD for Bibliographic Notation
Message-ID: <003701be658b$81c84d80$e7b2c183@razzmatazz.math.uic.edu>

The folks who built bibtex have already spent some time on this, so you could use portions of their design.

- Paul

-----Original Message-----
From: Elliotte Rusty Harold
Date: Wednesday, March 03, 1999 9:25 AM
Subject: DTD for Bibliographic Notation

>Has anybody written a DTD for bibliographies?
Are there any standards
>efforts in this area? To be usable, this DTD would have to be public
>domain or explicitly allow unrestricted reuse. I probably don't need to
>modify it, but at a minimum I need to be able to republish it.

From elharo at metalab.unc.edu Wed Mar 3 16:19:39 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:09:37 2004
Subject: DTD for Bibliographic Notation
In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120DC3@MDYNYCMSX1>
Message-ID:

At 10:24 AM -0500 3/3/99, DuCharme, Robert wrote:
>> Elliotte Rusty Harold writes:
>>Has anybody written a DTD for bibliographies?
>
>Have you looked at the bibliography module of DocBook?
>

No, but I'll check it out. Thanks.

> DocBook home page: http://www.oasis-open.org/docbook
> XML version of DocBook: http://www.nwalsh.com/docbook/xml
> file with XML's bibliography module:
>http://www.nwalsh.com/docbook/xml/1.3/dbhierx.mod
>
>Bob DuCharme www.snee.com/bob see www.snee.com/bob/xmlann for "XML:
>The Annotated Specification" from Prentice Hall.
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML: Extensible Markup Language (IDG Books 1998)          |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://sunsite.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/     |
+----------------------------------+---------------------------------+

From david at megginson.com Wed Mar 3 16:20:22 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:37 2004
Subject: SAX and DTDHandler
In-Reply-To: <9f7499ae.36dd3931@aol.com>
References: <9f7499ae.36dd3931@aol.com>
Message-ID: <14045.24597.828439.227541@localhost.localdomain>

MikeDacon@aol.com writes:

 > My primary question is will SAX allow me to parse a DTD? It
 > doesn't seem so. DTDHandler only handles unparsed Entity
 > declarations (like binary data) and Notation declarations. If it
 > is the case that SAX does not parse DTDs due to the fact that it
 > does not want to perform validation then why bother with the above
 > two cases?

SAX doesn't parse anything -- it's just an interface. Some (most?) Java-based XML parsers that implement the SAX interface do happen to perform validation, but that's outside the scope of SAX 1.0 itself (we're talking about fixing that for ModSAX).
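To make the two callbacks in question concrete, here is a small sketch using the SAX binding in Python's standard library (`xml.sax`) rather than the Java interfaces this thread is about; the method names mirror the Java `DTDHandler`. The document, the `Recorder` class and the `collect_dtd_events` helper are invented for illustration, and the behaviour shown is that of the bundled expat-based reader, which is an assumption about one particular parser, not a statement about SAX itself.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

# Illustrative document: one notation and one unparsed entity in the
# internal DTD subset, plus element and attribute-list declarations.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ELEMENT doc EMPTY>
  <!ATTLIST doc pic ENTITY #IMPLIED>
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc pic="logo"/>
"""


class Recorder(DTDHandler):
    """Collects the only two kinds of DTD event the interface defines."""

    def __init__(self):
        self.events = []

    def notationDecl(self, name, publicId, systemId):
        self.events.append(("notation", name, systemId))

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.events.append(("unparsed-entity", name, systemId, ndata))


def collect_dtd_events(data: bytes) -> list:
    parser = xml.sax.make_parser()
    recorder = Recorder()
    parser.setContentHandler(ContentHandler())  # default no-op handler
    parser.setDTDHandler(recorder)
    parser.parse(io.BytesIO(data))
    return recorder.events


events = collect_dtd_events(DOC)
```

The element and attribute-list declarations in the DTD produce no callbacks at all; only the notation and the unparsed entity reach the handler.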
SAX 1.0 provides the DTDHandler interface because XML 1.0 requires processors to report notations and unparsed entities.

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

From cowan at locke.ccil.org Wed Mar 3 16:28:16 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:37 2004
Subject: SAX and DTDHandler
References: <9f7499ae.36dd3931@aol.com>
Message-ID: <36DD62E3.785602E1@locke.ccil.org>

MikeDacon@aol.com wrote:

> My primary question is will SAX allow me to parse a DTD?
> It doesn't seem so. DTDHandler only handles unparsed Entity declarations
> (like binary data) and Notation declarations. If it is the case that SAX does not
> parse DTDs due to the fact that it does not want to perform validation then
> why bother with the above two cases?

Remember that SAX is a front-end to various parsers with various philosophies, validating (XML4J), non-validating but external-entity-reading (Aelfred), non-validating and document-entity-only (XP).

SAX provides methods, for parsers that wish to do so, to report on declared notations and unparsed entities, since these features provide actual extensions to the basic element/attribute model. Element and attribute list declarations cannot be reported through SAX, since they are reckoned inessential.

--
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
    You tollerday donsk? N. You tolkatiff scowegian? Nn.
    You spigotty anglease? Nnn. You phonio saxo? Nnnn.
        Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
From cowan at locke.ccil.org Wed Mar 3 16:29:57 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:37 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
References: <199903031400.PAA23631@brown.informatik.uni-dortmund.de>
Message-ID: <36DD6335.4C40B6F5@locke.ccil.org>

Stefan Mintert wrote:

> Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd
> (copied from the above URL) with nsgmls.

As the documentation for XMLspec warns, the current version of the DTD is *not* the one used with the XML Recommendation, which used a much older version. So don't do that.

--
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
    You tollerday donsk? N. You tolkatiff scowegian? Nn.
    You spigotty anglease? Nnn. You phonio saxo? Nnnn.
        Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From mintert at irb.informatik.uni-dortmund.de Wed Mar 3 17:14:19 1999
From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert)
Date: Mon Jun 7 17:09:37 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
In-Reply-To: Your message of Wed, 03 Mar 1999 11:28:37 -0500.
             <36DD6335.4C40B6F5@locke.ccil.org>
Message-ID: <199903031713.SAA24548@brown.informatik.uni-dortmund.de>

> > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd
> > (copied from the above URL) with nsgmls.
>
> As the documentation for XMLspec warns, the current version of the
> DTD is *not* the one used with the XML Recommendation, which used
> a much older version. So don't do that.

ooops, sorry; but that doesn't explain the parsing errors concerning the DTD:

spec.dtd:60:17:W: named character reference
spec.dtd:60:19:E: "X2014" is not a function name
[...]

BTW: It would be nice to use the XML spec as a valid document, not just a well-formed document. Has anyone kept the old XMLspec DTD? (I guess it's Revision 1.0, 7 April 1998)

Bye,
Stefan.

+-----------------------------------------------------------+
Stefan Mintert
  UniDo:   mintert@irb.informatik.uni-dortmund.de
  private: stefan@mintert.com
+-----------------------------------------------------------+
"let the music keep our spirits high..." (Jackson Browne)

From jmcdonou at library.berkeley.edu Wed Mar 3 17:58:23 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun 7 17:09:37 2004
Subject: DTD for Bibliographic Notation
In-Reply-To:
References:
Message-ID: <3.0.5.32.19990303094553.0097e990@library.berkeley.edu>

At 08:26 AM 3/3/1999 -0500, Elliotte Rusty Harold wrote:
>Has anybody written a DTD for bibliographies? Are there any standards
>efforts in this area? To be usable, this DTD would have to be public
>domain or explicitly allow unrestricted reuse.
I probably don't need to
>modify it, but at a minimum I need to be able to republish it.
>

Mm, not to be Clinton-esque or anything, but it depends on what you mean by bibliographies. There are an awful lot of DTDs that include elements for bibliographic citation as part of a larger document structure. Some of the better known examples would include the bibliographic elements in the TEI DTD, the ETD-ML DTD (part of the Electronic Thesis and Dissertation project at Virginia Tech), the Encoded Archival Description DTD, and DocBook.

There are standalone DTDs for capturing bibliographic information, but they tend to be written by library geeks like me, and as a result, tend to be a bit more detailed and extensive (read arcane and opaque) than what most people would think of when designing a DTD for bibliographies. The most authoritative work in these lines would probably be the MARC DTDs provided by the Library of Congress (http://lcweb.loc.gov/marc/marcsgml.html), but understanding those without copies of both the USMARC standard and the Anglo-American Cataloguing Rules next to you is a non-trivial task. If you want to look over a simpler version of the MARC standard as an XML DTD, I revised an SGML DTD that I did for MARC which you can grab at http://sunsite.berkeley.edu/~jmcdonou/USMARC.XML.DTD; again, knowledge of the MARC standard is a big help on making heads or tails of the DTD, but a small number of its elements comprise most of what people think of as basic bibliographic information.

If you're thinking that having all these different ways of encoding bibliographic information is a headache waiting for those wanting to automate processing of bibliographic data from multiple sources, you're right. But I don't think there's any way out of that one.
The needs of those doing markup of bibliographic information vary quite a bit depending on whether we're talking scholars reporting on their research, librarians, publishers, students at various levels, etc. Mapping between multiple forms of marked up bibliographic data is something we're just going to have to live with. I try to think of it as yet another clause in the text-encoding programmers' full employment act.

Jerome McDonough -- jmcdonou@library.Berkeley.EDU  |   (......)
Library Systems Office, 386 Doe, U.C. Berkeley     |  \ *  * /
Berkeley, CA 94720-6000      (510) 642-5168        |  \  <>  /
"Well, it looks easy enough...."                   |   \ -- /   SGNORMPF!!!
 -- From the Famous Last Words file                |    ||||

From richard at cogsci.ed.ac.uk Wed Mar 3 18:07:59 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:09:37 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
In-Reply-To: Stefan Mintert's message of Wed, 03 Mar 1999 18:13:45 +0100
Message-ID: <199903031807.SAA03605@stevenson.cogsci.ed.ac.uk>

> spec.dtd:60:17:W: named character reference
> spec.dtd:60:19:E: "X2014" is not a function name

Looks like it's not recognising XML-style character references - presumably the line is

Are you using a version of nsgmls that knows about XML?

-- Richard
From pgrosso at arbortext.com Wed Mar 3 18:12:25 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun 7 17:09:37 2004
Subject: Publication of first WD of the W3C XML Fragment Interchange Rec
Message-ID: <3.0.32.19990303121005.00decde8@pophost.arbortext.com>

The W3C XML Fragment WG [1] has just published its first Working Draft of the XML Fragment Interchange Recommendation [2]. Its abstract reads:

   The XML standard supports logical documents composed of possibly
   several entities. It may be desirable to view or edit one or more
   of the entities or parts of entities while having no interest, need,
   or ability to view or edit the entire document. The problem, then,
   is how to provide to a recipient of such a fragment the appropriate
   information about the context that fragment had in the larger
   document that is not available to the recipient. The XML Fragment WG
   is chartered with defining a way to send fragments of an XML
   document--regardless of whether the fragments are predetermined
   entities or not--without having to send all of the containing
   document up to the part in question. This document defines
   Version 1.0 of the [eventual] W3C Recommendation that addresses
   this issue.

Interested parties are invited to review the specification and report implementation experience. As indicated in the document, comments should be sent to [3], (a publicly archived [4] list). Comments received by 1999 March 26 will be considered for a revision soon after.

All comments will be considered in light of the XML Fragment Requirements Document [5].
In particular, basic scope issues and design decisions will be reconsidered only when grave and previously unrecognized flaws are uncovered. Requests for enhancement will typically be deferred for later versions of the specification under development unless the enhancement is uncontroversial and its incorporation would not materially delay production of the specification.

Paul Grosso          XML Fragment WG Chair
Daniel Veillard      W3C Staff Contact

[1] http://www.w3.org/XML/Activity.html#fragment-wg
[2] http://www.w3.org/TR/WD-xml-fragment
[3] mailto:www-xml-fragment-comments@w3.org
[4] http://lists.w3.org/Archives/Public/www-xml-fragment-comments/
[5] http://www.w3.org/TR/NOTE-XML-FRAG-REQ

From pgrosso at arbortext.com Wed Mar 3 18:30:15 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun 7 17:09:38 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
Message-ID: <3.0.32.19990303122916.00d2c4a0@pophost.arbortext.com>

At 18:13 1999 03 03 +0100, Stefan Mintert wrote:
> > > http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm
> >
> > > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd
> > > (copied from the above URL) with nsgmls.
> >
> > As the documentation for XMLspec warns, the current version of the
> > DTD is *not* the one used with the XML Recommendation, which used
> > a much older version. So don't do that.
>
>ooops, sorry; but that doesn't explain the parsing errors concerning the DTD:
>
>spec.dtd:60:17:W: named character reference
>spec.dtd:60:19:E: "X2014" is not a function name
>[...]

1.
   nsgmls is not an XML parser. Those errors are probably because it's not recognizing &#x2014; (the hex version) as a numeric character reference. You might try converting X2014 to a decimal number and seeing what happens. Or, use an XML parser.

2. The URL quoted above is old. The latest are:

   DTD: http://www.w3.org/XML/1998/06/xmlspec-19990205.dtd
   Documentation: http://www.w3.org/XML/1998/06/xmlspec-report-19990205.htm

paul

From tgraham at mulberrytech.com Wed Mar 3 19:31:20 1999
From: tgraham at mulberrytech.com (Tony Graham)
Date: Mon Jun 7 17:09:38 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
In-Reply-To: <199903031400.PAA23631@brown.informatik.uni-dortmund.de>
References: <01BE6366.0E5EF230.jarle.stabell@dokpro.uio.no>
            <199903031400.PAA23631@brown.informatik.uni-dortmund.de>
Message-ID:

At 3 Mar 1999 15:00 +0100, Stefan Mintert wrote:
 > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd
 > (copied from the above URL) with nsgmls. I'm using nsgmls 1.3 on SunOS 5.6
 > (Solaris 2). I already parsed xml instances without problems but in this case
 > it doesn't work. Following are the first lines of nsgmls output:
 >
 > sm@brown(/tmp/sm){590}: /tmp/sm/sp-1.3/nsgmls/nsgmls -E 10 -w xml -s REC-xml-19980210.xml
 > /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:17:W: named character reference
 > /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:19:E: "X2014" is not a function name

Add -c/tmp/sm/sp-1.3/pubtext/xml.soc to the command line so nsgmls reads the xml.soc catalog that tells it to use the SGML Declaration for XML, xml.dcl.
That SGML Declaration tells nsgmls what hexadecimal character references look like. Without it, things like &#x2014; are being interpreted as per ISO 8879:1986, which isn't doing you or the parser any good.

Regards,

Tony Graham
======================================================================
Tony Graham                            mailto:tgraham@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

From db at Eng.Sun.COM Wed Mar 3 19:55:48 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:09:38 2004
Subject: Encoding detection again ...
References:
Message-ID: <36DD9263.F26D063C@eng.sun.com>

> > > Put it this way: if you assume UTF-16, you're
> > > safe either way because UTF-16 is a superset.
> >
> > Err ... is that true?
> >
> > Maybe I'm being a bit obsessive about my
> > interpretation of the various standards docs,

Given how many folk talk about UCS-2 lately (not many!) that could well be true ... ;-)

> > but
> > as far as I can see UCS-2 isn't a subset of
> > UTF-16.
>
> The question of UCS-2 being, or not being a subset of
> UTF-16 is a bit of a red herring. It is undoubtedly true
> that the set of octet pairs which are legal UCS-2
> characters is a subset of the set of octet pairs which
> are legal UTF-16 characters.

And more to the point, XML processors aren't required to report such low level character encoding errors ... this would be one.

> Appendix F suggests that octet sequences which could
> equally well be interpreted as UTF-16 or UCS-2 may be
> assumed to be UTF-16, and *doesn't* include a clause
> stating that this assumption should be revised in
> the light of an explicit XML encoding declaration. I
> think that clause should be added, in much the same
> way as it is for UTF-8 vs. 8859-X.

All of appendix F is non-normative; you're free to revise or not, as you see fit, and it won't affect conformance.

- Dave

> Now the typo ...
>
> > This very complicated and isn't a zillion miles away
> > from the current handling of UTF-8 vs. ISO 8859-x
> > vs. US-ASCII.
>
> Please insert the word 'isn't' in the obvious
> place ;-)
>
> Cheers,
>
> Miles
>
> --
> Miles Sabin                       Cromwell Media
> Internet Systems Architect        5/6 Glenthorne Mews
> +44 (0)181 410 2230               London, W6 0LJ
> msabin@cromwellmedia.co.uk        England

From richard at goon.stg.brown.edu Wed Mar 3 20:32:53 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:09:38 2004
Subject: Encoding detection again ...
References: <36DD9263.F26D063C@eng.sun.com>
Message-ID: <36DD9C3C.99DD529C@goon.stg.brown.edu>

David Brownell wrote:

> And more to the point, XML processors aren't required
> to report such low level character encoding errors ...
> this would be one.

On the face of things, this doesn't make sense.

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From cowan at locke.ccil.org Wed Mar 3 20:53:39 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:38 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
References: <3.0.32.19990303122916.00d2c4a0@pophost.arbortext.com>
Message-ID: <36DDA11E.C07163B0@locke.ccil.org>

Paul Grosso wrote:

> 1. nsgmls is not an XML parser.

The version included with SP 1.3 is an XML parser, though not entirely defect-free.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Wed Mar 3 20:59:59 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:38 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <000d01be64fc$1a3a09e0$0dce6ccb@baden> <36DD523B.F2EAFB7E@locke.ccil.org> <36DD92EC.80B3B3DE@eng.sun.com>
Message-ID: <36DDA2A7.8E2F1E01@locke.ccil.org>

David Brownell wrote:

> Surely it's more important that Klingon markup be supported? :-)

All Languages Are Equal (TM).

> I notice that a recent Linux distribution puts Klingon support
> into a chunk of private use area, so at least there's consistency
> that XML doesn't yet offer complete Klingon support!

Right. Support for private-use characters in XML names will always be a Bad Thing, because nobody outside the private user can tell which characters are letters and which aren't, so it's either all or none, and "none" is the most sensible choice.

Just be prepared to revisit XML so that Unicode 3.0 name and name-start characters can get included. This will allow the creation of DTDs written in serious Real World languages like Macedonian, Syriac, Divehi, Sinhala, Burmese, Ethiopic, Cherokee, various Canadian Native languages, Khmer, Mongolian, and Yi.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Wed Mar 3 21:04:29 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:38 2004
Subject: Encoding detection again ...
References: <36DD9263.F26D063C@eng.sun.com> <36DD9C3C.99DD529C@goon.stg.brown.edu>
Message-ID: <36DDA39D.EA98C73B@locke.ccil.org>

Richard L. Goerwitz wrote:

> On the face of things, this doesn't make sense.

For example, a document containing P and otherwise error-free may be processed without error, although U+0080 is not a legal Unicode character.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From db at Eng.Sun.COM Wed Mar 3 21:06:07 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:09:38 2004
Subject: Java Specification Request for XML
Message-ID: <36DD9EA1.2CEE7CEA@eng.sun.com>

There seems to have been some confusion regarding what Sun is trying to do with its Java Specification Request for an XML Extension to the Java Platform.

A Java Specification Request (JSR) is a request to develop a specification; it is not a specification in itself.
What we did a week ago is ask for comments regarding this proposal to begin work on such an XML Extension specification. If this is approved, we will then follow the Java Community Process as described at http://developer.java.sun.com/developer/jcp/ to actually develop that specification.

The Java Community Process is an open, inclusive process and we look forward to the active participation of all interested parties. The process goes forward in several steps:

[1] The JSR is presented for comment (as you've seen)
[2] The JSR is approved (we hope)
[3] An expert group is formed to write the specification; this
    begins with a "Call for Experts" (CAFE) to participate.
[4] The expert group writes a first draft of the specification
[5] The draft is circulated to all Java technology licensees and
    Participants in the Java Community Process.
[6] Comments are collected, read, and responded to by the expert
    group, resulting in an improved specification.
[7] The refined specification is then released to the public for
    comment.
[8] Comments from the public are collected, read, and responded to
    by the expert group, resulting in more refinements.
[9] The final specification is produced by the expert group, along
    with a reference implementation and compatibility tests.

The key point is that everyone with internet access will get a chance to review and comment on the emerging specification.

Note that the xml-dev community has already had input into the proposed specification as evidenced by the referencing of the SAX specification in the JSR as one of the starting documents. Other specifications could be adopted by the expert group.

We look forward to the continued participation of the xml-dev community in this work.

- Dave

From db at Eng.Sun.COM Wed Mar 3 21:38:27 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:09:38 2004
Subject: Encoding detection again ...
References: <36DD9263.F26D063C@eng.sun.com> <36DD9C3C.99DD529C@goon.stg.brown.edu>
Message-ID: <36DDAA6F.D432053A@eng.sun.com>

"Richard L. Goerwitz" wrote:
>
> David Brownell wrote:
>
> > And more to the point, XML processors aren't required
> > to report such low level character encoding errors ...
> > this would be one.
>
> On the face of things, this doesn't make sense.

For example, character encodings are typically handled many layers below the XML processor. That processor shouldn't be faulted for behaviors of the underlying processor.

- Dave

From msabin at cromwellmedia.co.uk Wed Mar 3 21:52:01 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:09:38 2004
Subject: Encoding detection again ...
Message-ID:

David Brownell wrote,
> "Richard L. Goerwitz" wrote:
> > David Brownell wrote:
> > > And more to the point, XML processors aren't
> > > required to report such low level character
> > > encoding errors ... this would be one.
> >
> > On the face of things, this doesn't make sense.
>
> For example, character encodings are typically handled
> many layers below the XML processor. That processor
> shouldn't be faulted for behaviors of the underlying
> processor.

Most of the time yes ... but remember we're discussing the interaction between encoding detection and encoding _declarations_. An XML processor has to have some involvement in that.

Cheers,

Miles

--
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)181 410 2230               London, W6 0LJ
msabin@cromwellmedia.co.uk        England

From db at Eng.Sun.COM Wed Mar 3 22:03:28 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:09:38 2004
Subject: Encoding detection again ...
References:
Message-ID: <36DDB059.550023E7@eng.sun.com>

Miles Sabin wrote:
>
> David Brownell wrote,
> > "Richard L. Goerwitz" wrote:
> > > David Brownell wrote:
> > > > And more to the point, XML processors aren't
> > > > required to report such low level character
> > > > encoding errors ... this would be one.
> > >
> > > On the face of things, this doesn't make sense.
> >
> > For example, character encodings are typically handled
> > many layers below the XML processor. That processor
> > shouldn't be faulted for behaviors of the underlying
> > processor.
>
> Most of the time yes ... but remember we're discussing
> the interaction between encoding detection and encoding
> _declarations_. An XML processor has to have some
> involvement in that.

But the error in question would show up after the encoding declaration had been processed -- well after! -- so the XML processor itself would no longer need involvement. The non-normative "detection" can't involve the error ... surrogates can't appear within encoding declarations.

In any case, it's OK for conformant processors to reject UCS-2 out of hand, eliminating all possibility of such an error in any case!

- Dave

From simonstl at simonstl.com Wed Mar 3 22:24:41 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:09:38 2004
Subject: Java Specification Request for XML
In-Reply-To: <36DD9EA1.2CEE7CEA@eng.sun.com>
Message-ID: <199903032224.RAA10719@hesketh.net>

At 12:42 PM 3/3/99 -0800, David Brownell wrote:
>There seems to have been some confusion regarding what Sun is trying
>to do with its Java Specification Request for an XML Extension to the
>Java Platform.
>
>[...]
>
>The Java Community Process is an open, inclusive process and we
>look forward to the active participation of all interested parties.
>
>[...detailed list of process steps, excerpted..]
>
>[4] The expert group writes a first draft of the specification
>[5] The draft is circulated to all Java technology licensees and
>    Participants in the Java Community Process.
>[7] The refined specification is then released to the public for
>    comment.
>
>The key point is that everyone with internet access will get a
>chance to review and comment on the emerging specification.
>
>Note that the xml-dev community has already had input into the
>proposed specification as evidenced by the referencing of the
>SAX specification in the JSR as one of the starting documents.
>Other specifications could be adopted by the expert group.
>
>We look forward to the continued participation of the xml-dev
>community in this work.

This all sounds good, but I remain concerned (and wary) for a number of reasons, and I didn't respond directly to your JSR commenting process because I'm very uncertain about whether this development belongs in a process controlled, however lightly, by a particular vendor.

The JCP is only a partially open process, as the sequence of steps above - in which Java technology licensees and 'Participants in the Java Community Process' is step 5 and the public is step 7 - demonstrates. It seems that the licensees and 'official' participants are still privileged, have earlier access to the information, and potentially more impact on its shape. I don't expect to be one of the experts crafting the standard, but I hope to be able to participate in the discussions as a real participant and not just another spectator.

Given that SAX was developed (and is still developing) in a very open forum, it seems like the JCP is moving into an area that was totally open and moving it to an arena that is _less_ open.

There have been a lot of criticisms of W3C process on this list, as I'm sure you've noticed, for similar openness problems. While the W3C does in some way respond to public comments, there's no transparency - we have no way to know how much they care.

I'd like to hear Sun make some _strong_ statements that they'll be developing this API in a way more like the SAX process than the DOM process, and that genuine transparency is the goal of the JCP rather than Sun protecting what it sees as its interests in the Java/XML space.

I think Sun could make a great contribution here, using its weight in the Java community to help standardize XML processing and make it more universally used, but I hope Sun isn't planning to use that weight to direct the discussion and influence the final decisions unduly. It's promising, but I think there are a lot of folks out here who are very wary. (See Elliotte Rusty Harold's comments at http://metalab.unc.edu/xml for an example.) I'm definitely wary, though I also have some real hopes.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From mscardin at us.oracle.com Wed Mar 3 22:50:00 1999
From: mscardin at us.oracle.com (Mark Scardina)
Date: Mon Jun 7 17:09:38 2004
Subject: ANN: Oracle XML Class Generator for Java
Message-ID: <001701be65c7$febb5620$47be1990@mscardin-pc.us.oracle.com>

I would like to announce Oracle's second XML component beta release - XML Class Generator for Java - now available for downloading and testing on the Oracle Technology Network (OTN) XML site located at http://technet.oracle.com.

The XML Class Generator will generate a set of Java source files based on an input DTD. The generated Java source files can then be used to construct, optionally validate, and print an XML document that is compliant with the DTD specified.

This is an early beta release and has the following features:

* Creates Java Classes from DTDs to enable the programmatic construction of XML documents.
* Supports validation mode to assist debugging.
* Works with the Oracle XML Parser in Java.
* Creates documents conforming to the W3C XML 1.0 Recommendation.
* Supports creating documents in the following encodings:
    UTF-8
    UTF-16
    ISO-10646-UCS-2
    ISO-10646-UCS-4
    US-ASCII
    EBCDIC-CP-US
    ISO-8859-1
    Shift_JIS

Support is available in the XML Forum on OTN to provide a collaborative area for bug reporting, technical support, and discussing other Oracle/XML issues. This forum will be used for external as well as internal beta testers.

Mark V. Scardina
Sr. Product Manager - Core Development
Server Technologies - Oracle Corporation
Oracle XML News http://www.oracle.com/xml

From mscardin at us.oracle.com Wed Mar 3 23:00:19 1999
From: mscardin at us.oracle.com (Mark Scardina)
Date: Mon Jun 7 17:09:38 2004
Subject: ANN: Oracle XML Parser for Java - Production Release
Message-ID: <001801be65c9$6513c960$47be1990@mscardin-pc.us.oracle.com>

The production release of the Oracle XML Parser for Java is available for download at http://technet.oracle.com/tech/xml.

* Supports validation and non-validation modes
* Built-in Error Recovery until fatal error.
* Supports W3C XML 1.0 Recommendation.
* Integrated Document Object Model (DOM) Level 1.0 API
* Integrated SAX 1.0 API
* Supports W3C Proposed Recommendation for XML Namespaces
* Supports documents in the following encodings:
    UTF-8                 BIG 5
    UTF-16                GB2312
    ISO-10646-UCS-2       EUC-JP
    ISO-10646-UCS-4       EUC-KR
    US-ASCII              KOI8-R
    EBCDIC-CP-*           ISO-2022-JP
    ISO-8859-1 to -9      ISO-2022-KR
    Shift_JIS

Support is available in the XML Forum on OTN to provide a collaborative area for bug reporting, technical support, and discussing other Oracle/XML issues. This forum will be used for external as well as internal beta testers.

Mark V. Scardina
Sr. Product Manager - Core Development
Server Technologies - Oracle Corporation
Oracle XML News http://www.oracle.com/xml

From dante at mstirling.gsfc.nasa.gov Thu Mar 4 15:48:21 1999
From: dante at mstirling.gsfc.nasa.gov (Dante Lee)
Date: Mon Jun 7 17:09:38 2004
Subject: HTML Question
Message-ID:

Can someone look at the source of my web page and tell me why my links are not coming up in the targeted frames? The site is at: http://mstirling.gsfc.nasa.gov/~dante/sharp98

All of the links are targeted to Frame 1, which is specified in the index frame as the frame to the right. However, all of the links pop up as new windows. Please help. I think it has something to do with the javascript in Frame1 (titlebox.html).

Thanx.

Dante M. Lee
Code 588 NASA/GSFC
Greenbelt MD 20771
Voice = 301-521-1077
Bldg = 23 Rm = W415
Email = dante@mstirling.gsfc.nasa.gov
        dante4@hotmail.com

From dalapeyre at mulberrytech.com Thu Mar 4 19:53:04 1999
From: dalapeyre at mulberrytech.com (Deborah Aleyne Lapeyre)
Date: Mon Jun 7 17:09:38 2004
Subject: DTD for Bibliographic Notation
In-Reply-To:
References: <49092BAEAC84D2119B0600805FD40F9F120DC3@MDYNYCMSX1>
Message-ID:

The journal publishers have taken a cut at bibliographies, for small example:

  Elsevier's is available on their website (at least it used to be)
  John Wiley & Sons has one (See WILEY Interscience)
  CADMUS used to have theirs on their website
  Ovid has made theirs public as well (but I would not recommend it)
  PUBMED at NIH/NLM also has a very basic but nice subset (definitely available on their website).

--Debbie

======================================================================
Deborah Aleyne Lapeyre               mailto:dalapeyre@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9633
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

From tbray at textuality.com Thu Mar 4 20:15:31 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:39 2004
Subject: XML and special Characters : unicode v3.0 ?
Message-ID: <3.0.32.19990303213203.00bb79a0@pop.intergate.bc.ca>

At 03:59 PM 3/3/99 -0500, John Cowan wrote:
> This will allow the creation of
>DTDs written in serious Real World languages like Macedonian, Syriac,
>Divehi, Sinhala, Burmese, Ethiopic, Cherokee, various Canadian Native
>languages, Khmer, Mongolian, and Yi.

John, this is unfair. All the Macedonians and Sinhalese I've known have an excellent sense of humor. -Tim

From jpetit at 4thworldtele.com Thu Mar 4 22:05:34 1999
From: jpetit at 4thworldtele.com (John Petit)
Date: Mon Jun 7 17:09:39 2004
Subject: XSL Pre-processing
Message-ID: <36DEA0E1.1EA10C7E@4thworldtele.com>

Is there any software out there that will allow me to do server side XSL preprocessing of XML documents into HTML for display? This is independent of the user's browser.

From donpark at quake.net Thu Mar 4 22:06:46 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:39 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID: <00ac01be668b$3abc9f80$2ee044c6@arcot-main>

The first draft of the XML Fragment spec allows only one Fragbody. Could someone from the WG shed some light on why this constraint is important?

Multi-fragment packages are useful in many situations such as query result representation. Although it is possible to define a packaging mechanism that handles multiple fragments, a fragment context information (FCI) must be provided for each fragment because the spec does not allow FCI to be shared by multiple fragments.

A possible example of a multi-fragment package follows:

J. R. R. Tolkien The Book of Lost Tales (The History of Middle-Earth) Mass Market Paperback Reprint edition (June 1992) 0345375211 4.79 1
J. R. R. Tolkien The Book of Lost Tales (The History of Middle-Earth) Mass Market Paperback Reprint edition (June 1992) 0345375211 4.79 1

Comments?

Don Park
Docuverse

From cowan at locke.ccil.org Thu Mar 4 22:19:54 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:39 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <3.0.32.19990303213203.00bb79a0@pop.intergate.bc.ca>
Message-ID: <36DF06D2.F214E9F9@locke.ccil.org>

Tim Bray wrote:

> At 03:59 PM 3/3/99 -0500, John Cowan wrote:
> > This will allow the creation of
> >DTDs written in serious Real World languages like Macedonian, Syriac,
> >Divehi, Sinhala, Burmese, Ethiopic, Cherokee, various Canadian Native
> >languages, Khmer, Mongolian, and Yi.
>
> John, this is unfair. All the Macedonians and Sinhalese I've known
> have an excellent sense of humor. -Tim

Well, several people have believed this was sarcasm on my part. Not so. When I said "serious Real World languages" I meant it. Real people speak, understand, read, and write them in the course of their day-to-day lives.

Ancient Egyptian and Sindarin don't fall into this category, no matter that I am an enthusiast of both.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Thu Mar 4 23:00:30 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:39 2004
Subject: Unicode conformance, short version
References: <3.0.32.19990301212757.00a2e5b0@pop.intergate.bc.ca> <36DC0627.F2491FA8@locke.ccil.org> <36DC67FB.26E0@w3.org> <199903041310.WAA18593@sh.w3.mag.keio.ac.jp> <14046.38677.746315.899329@localhost.localdomain>
Message-ID: <36DF1059.76F2CC4E@locke.ccil.org>

Unicode folks have seen this, but XML folks haven't. Here's John's Own Version Of Unicode Conformance:

1) Unicode characters are 16 bits long; deal with it.
2) Byte order is only an issue in files.
3) If you don't have a clue, assume big-endian.
4) Loose surrogates don't mean jack.
5) Neither do U+FFFE and U+FFFF (a.k.a. the zigamorph).
6) Leave the unassigned codepoints alone.
7) It's OK to be ignorant about a character, but not plain wrong.
8) Subsets are strictly up to you.
9) Canonical equivalence matters.
10) Don't garble what you don't understand.

This is presented in the hope that it may be useful, but all warranties (including implicit warranties of merchantability or fitness for a particular purpose) are void. Freely reusable, except that John Cowan asserts the moral right to be known as author.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From crism at oreilly.com Thu Mar 4 23:03:03 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun 7 17:09:39 2004
Subject: I wonder ...
In-Reply-To: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com> (jes@kuantech.com)
Message-ID: <199903042301.SAA01051@ruby.ora.com>

[Jeffrey E. Sussna]
> This works fine, but (at least in IE 5) only for a single
> level. That is, you can't have another entity reference inside
> "book.dtd". To me, this significantly limits its usefulness
> (imagine not allowing a #include inside a file that was
> #included).
IE 5 has its parsing errors, but this is not one of them. Error messages I've seen when parsing DocBook indicate that it is definitely following references to multiple levels. -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Mar 4 23:16:40 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:39 2004 Subject: Unicode conformance, short version Message-ID: <3.0.32.19990304151441.00c17b10@pop.intergate.bc.ca> At 05:59 PM 3/4/99 -0500, John Cowan wrote: >4) Loose surrogates don't mean jack. There's reason to believe they mean severe breakage upstream, and in mission-critical apps are probably grounds to halt and catch fire. Anyhow, if you're reading a character stream and one of 'em has a value between (decimal) 55296 and 57343 inclusive, it ain't XML any longer. (And I believe all the serious XML processors actually enforce this particular rule). -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pgrosso at arbortext.com Thu Mar 4 23:26:39 1999 From: pgrosso at arbortext.com (Paul Grosso) Date: Mon Jun 7 17:09:39 2004 Subject: XML MULTI-Fragment Interchange? 
Message-ID: <3.0.32.19990304172546.00f10224@pophost.arbortext.com>

At 14:06 1999 03 04 -0800, Don Park wrote:
>The first draft of the XML Fragment spec allows only one Fragbody. Could
>someone from the WG shed some light on why this constraint is important?

First, let me remind folks that only comments sent to the archived mail list set up for comments are "officially" considered. The WG cannot promise to honor all requests for responses to questions posted on xml-dev.

However, the answer to Don's question will probably address a lot of other questions, and the WG did consider it carefully, so I would like to answer that here.

One of the key principles in developing this version of the Fragment Interchange spec was to define and remain within a limited scope. The problem was (1) to define what fragment context information is, (2) to define a fragment context specification notation, and (3) to define at least one interoperable method for associating a fragment context specification with a fragment body.

Although we did decide to address point (3) by defining a simple "packaging" scheme, we were very careful to do the minimum necessary to address point (3). Specifically, we did not want to enlarge our scope to include packaging methods in general. It is expected that the XML Activity of the W3C will consider ways to address packaging in the near future, and the XML Fragment WG didn't want to do something that might later constrain a more general solution.

Packaging multiple entities in a single unit is likely to be a useful thing to do in general, of which packaging multiple fragment bodies is just one example. The WG didn't want to define a way to address multiple fragment bodies and then discover, when the more general problem is carefully considered, that our solution wasn't a subset of the solution to the more general problem.
In summary, the WG is aware of lots of improvements, enhancements, and extensions that could be made to an XML Fragment Interchange spec, but we ruthlessly kept ourselves to the "minimum needed to declare victory." We expect work on Schemas and Packaging and XLink and probably other areas will all contribute technology that would be useful in a version 2 XML Fragment Interchange spec someday, but we believe that implementation and user experience should prove the version 1 spec useful before we even think about a version 2.

Of course, if you seriously believe that the spec is useless unless it allows multiple fragment bodies per package, then that is a comment you should make and attempt to support. We don't want to come out with a spec folks think is useless, but we were trying to keep it as minimal as possible while still addressing the problem we defined as our scope.

paul

Paul Grosso
Chair, XML Fragment WG

From jtauber at jtauber.com Fri Mar 5 00:27:45 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:39 2004
Subject: XSL Pre-processing
Message-ID: <005901be669e$efbf2520$0300000a@othniel.cygnus.uwa.edu.au>

I use James Clark's XT.

For a more complete list see http://www.xmlsoftware.com/xsl/

For examples of XSL I use to produce the above site, see http://www.xmlsoftware.com/articles/xsl-by-example.html

James

-----Original Message-----
From: John Petit
>Is there any software out there that will allow me to do server side XSL
>preprocessing of XML documents into HTML for display? This is
>independent of the user's browser.
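
The server-side transformation John asks about can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the javax.xml.transform API that ships with current JDKs, not XT's own interface, and the class name ServerSideXsl, the method toHtml, and the sample book document and stylesheet are all made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ServerSideXsl {
    // Hypothetical stylesheet: turns a <book> document into a tiny HTML page.
    public static final String SAMPLE_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html'/>"
        + "<xsl:template match='/book'>"
        + "<html><body><h1><xsl:value-of select='title'/></h1></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML on the server and returns HTML text,
    // so the browser only ever receives HTML.
    public static String toHtml(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter html = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(html));
            return html.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book><title>XML by Example</title></book>";
        System.out.println(toHtml(xml, SAMPLE_XSL));
    }
}
```

In a real deployment the returned string would be written to the HTTP response by whatever server framework is in use; that part is left out here because the thread does not say what John is running.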
From donpark at quake.net Fri Mar 5 00:37:58 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:39 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID: <004001be66a0$5aff8bd0$2ee044c6@arcot-main>

Paul,

>First, let me remind folks that only comments sent to the archived
>mail list set up for comments are "officially" considered. The WG
>cannot promise to honor all requests for responses to questions posted
>on xml-dev.

Sorry about that. I couldn't find the e-mail address of the mailing list (W3C site was down) when I sent my message so had to punt into xml-dev.

>Of course, if you seriously believe that the spec is useless unless it
>allows multiple fragment bodies per package, then that is a comment you
>should make and attempt to support. We don't want to come out with a
>spec folks think is useless, but we were trying to keep it as minimal
>as possible while still addressing the problem we defined as our scope.

I found the spec very useful, timely, and clear. It was not my intention to delay, divert, or hamper the progress of the XML Fragment spec. It was also not my intention to imply that the WG overlooked something important. I withdraw my comment since it does not fall under the intended scope of the spec.

Best,

Don Park
Docuverse

From cadams at cascadecc.com Fri Mar 5 01:16:58 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
Message-ID: <000701be66a5$d03f5100$01010101@development.cascade>

Forgive me for the generic question, I'm to the point of betting the bank on XML, and I'm looking for a pat on the back, or a voice of warning....

We are starting from scratch on our next generation product, from what I've read and seen - xml seems to fit the bill (Content Management, mixed with WIDL RPC functionality seems right up our alley). I'm looking hard at ODBMS systems and laying out the DB via xml (storing xml directly). We have a wealth of in-house Java and COM/DCOM experience, but none with ODBMS or XML.

Do I understand it correctly that, at an item level, I can:
1. name it (URI)?
   a. possibly supply some security to it?
2. revision it?
3. meta-data it?
   a. can meta-data have meta-data?

Would I be foolish to base my whole object system storage on xml, or on ODBMS for that matter? Are they cooked, are they ready for real world apps?

Once again, I'm sorry for the generic question, I have read the FAQ's, the ODBMS webpages, several books etc. I'm looking for the advice of those in the trenches - Is it safe to make XML the foundation of my new product?

Should I grab a shovel, and jump in the trenches with you, or is this a deep dark hole?

Thanks in advance, for all who might reply.

Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com
Phone: 435-654-6304
fax: 435-654-1482

From tbray at textuality.com Fri Mar 5 01:28:11 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
Message-ID: <3.0.32.19990304172718.00ba7c80@pop.intergate.bc.ca>

At 06:16 PM 3/4/99 -0700, Chad Adams wrote:
>Forgive me for the generic question, I'm to the point of betting the bank on
>XML, and I'm looking for a pat on the back, or a voice of warning....

You might get more helpful help if you described the problem you're trying to solve. On the other hand, anything that has XML and ODBMS and Java and COM/DCOM in it has to be A Good Thing; ask any analyst or prognosticator. You might have to hire some of those two-headed programmers, though. -Tim

From MikeDacon at aol.com Fri Mar 5 01:53:52 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:39 2004
Subject: ModSax Suggestion
Message-ID:

Hi Everyone,

While SAX does a good job as an event-based interface to Parsers, it would be nice to add a few methods to receive a DOM representation back from a reference to an org.xml.sax.Parser.
Something like:

    org.w3c.dom.Document parse(InputSource is, boolean events) throws SAXException;
    org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws SAXException;
    /* the events boolean would be to turn on/off event calls. */

If a SAXDriver did not want to produce a DOM, it could either simply return a null or a method added like:

    boolean isDomCapable();

The above would let me use the ParserFactory to seamlessly switch between Parser implementations and get a DOM tree without building one myself. It is fruitless for me to build a DOM tree when almost all the parser implementations provide that ability. I just want a way to get at that functionality in a simple and standard way (thus SAX).

Thoughts?

- Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com

From jes at kuantech.com Fri Mar 5 01:58:51 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
In-Reply-To: <000701be66a5$d03f5100$01010101@development.cascade>
Message-ID: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>

I will not comment on the advisability of using an ODBMS, because 1) it's out of scope for this group, and 2) it's a highly religious topic. However, I will comment on the question of whether to store your data directly as XML, and confess that I don't understand the question.
XML is a great interchange language; i.e., a way to move data between systems. Generally speaking, however, each particular system has its own optimal internal representation. In an RDBMS, for example, it's tables. In a Java program it's objects, and so forth. There is not (AFAIK) yet any such thing as an XDBMS (though you could consider a file system of XML documents plus a web server to resolve URL's to those documents as such a thing). Anyway, my approach would be to store data in the most natural format for the given storage technology, and define translations to and from XML to move data between systems.

Jeff

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Chad Adams
Sent: Thursday, March 04, 1999 5:17 PM
To: xml-dev@ic.ac.uk
Subject: Opinions requested

Forgive me for the generic question, I'm to the point of betting the bank on XML, and I'm looking for a pat on the back, or a voice of warning....

We are starting from scratch on our next generation product, from what I've read and seen - xml seems to fit the bill (Content Management, mixed with WIDL RPC functionality seems right up our alley). I'm looking hard at ODBMS systems and laying out the DB via xml (storing xml directly). We have a wealth of in-house Java and COM/DCOM experience, but none with ODBMS or XML.

Do I understand it correctly that, at an item level, I can:
1. name it (URI)?
   a. possibly supply some security to it?
2. revision it?
3. meta-data it?
   a. can meta-data have meta-data?

Would I be foolish to base my whole object system storage on xml, or on ODBMS for that matter? Are they cooked, are they ready for real world apps?

Once again, I'm sorry for the generic question, I have read the FAQ's, the ODBMS webpages, several books etc. I'm looking for the advice of those in the trenches - Is it safe to make XML the foundation of my new product?

Should I grab a shovel, and jump in the trenches with you, or is this a deep dark hole?
Thanks in advance, for all who might reply.

Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com
Phone: 435-654-6304
fax: 435-654-1482

From MikeDacon at aol.com Fri Mar 5 02:18:56 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
Message-ID: <8703ae19.36df3add@aol.com>

Hi Chad,

In a message dated 3/4/99 8:25:02 PM Eastern Standard Time, cadams@cascadecc.com writes:

> Forgive me for the generic question, I'm to the point of betting the bank on
> XML, and I'm looking for a pat on the back, or a voice of warning....

Before you bet the bank, you need to make sure you are not dependent on any part of the XML family of specifications that is not complete, or that lacks a variety of stable implementations from different vendors. XML will revolutionize the web ... but the key word there is "will". A small company cannot afford to wait for a market to mature. As one who has been part of a small company that jumped on a technology too soon in the maturity curve (like Java 1.02), I would recommend caution.

Best wishes,

- Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com

From marcelo at mds.rmit.edu.au Fri Mar 5 03:31:37 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun 7 17:09:39 2004
Subject: Opinions requested
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>
Message-ID: <36DF4CE1.7F4D3681@simdb.com>

"Jeffrey E. Sussna" wrote:

> There is not (AFAIK) yet any such thing as an XDBMS (though you could consider a file system of XML documents plus a web server to resolve URL's to those documents as such a thing).

I am continually surprised to hear remarks such as this. SIM _is_ an XDBMS (it is also an SGML, MARC, RTF, etc. database with structure and full content query capabilities). As an XDBMS it has weaknesses (it only supports predefined indexes and limited structure querying), but in some ways provides a model that is even richer than XML (it provides structure below element level, and has the concept of fields -- both of these features can be accessed through arbitrary expressions, which can be complete programs, for instance a field can contain every other word of paragraphs whose parent section has a "priority" attribute with a numerical value less than 5; it also provides arbitrary document fragmenting capabilities at the application level). And the weaknesses are not intrinsic to our model -- we have full structure queries slated for the near future (probably in the next six months).
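
To make the field example above concrete, here is a rough sketch of how such a derived field could be computed with the stock JDK DOM and XPath classes. It is not SIM's expression language or API, which the message does not show; the class name DerivedField, the method everyOtherWord, and the element names doc, section and para are assumptions made up for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DerivedField {
    // Collects every other word (1st, 3rd, 5th, ...) of each <para> whose
    // parent <section> has a numeric "priority" attribute below 5.
    public static List<String> everyOtherWord(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList paras = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//section[@priority < 5]/para", doc,
                              XPathConstants.NODESET);
            List<String> field = new ArrayList<>();
            for (int i = 0; i < paras.getLength(); i++) {
                String text = paras.item(i).getTextContent().trim();
                if (text.isEmpty()) {
                    continue; // nothing to index in an empty paragraph
                }
                String[] words = text.split("\\s+");
                for (int w = 0; w < words.length; w += 2) {
                    field.add(words[w]);
                }
            }
            return field;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not evaluate field", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc>"
            + "<section priority='3'><para>one two three four five</para></section>"
            + "<section priority='7'><para>six seven eight</para></section>"
            + "</doc>";
        System.out.println(everyOtherWord(xml)); // prints [one, three, five]
    }
}
```

A database would presumably evaluate such an expression at index time rather than per query; the sketch only shows what the field's contents would be.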
SIM is just one of many XDBMS's available on the market, and is one of the fastest, if not _the_ fastest, and most scalable available (at the very least, it is a country mile ahead of (R|OO)DBMS's in terms of XML performance, contrary to the ever-popular notion that the latter are inherently faster than the former -- one client, after migrating their application from a popular RDBMS to SIM, removed the stop button from the query dialog because no-one ever got a chance to see it).

Anyway, enough shameless marketing: XDBMS's do exist today, and they do support high performance storage, querying and retrieval.

Cheers,
Marcelo Cantos
http://www.simdb.com/~marcelo/

From cadams at cascadecc.com Fri Mar 5 06:40:43 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun 7 17:09:40 2004
Subject: Opinions requested - more detail on what my thinking is.
Message-ID: <000101be66d3$172277a0$01010101@development.cascade>

What is AFAIK?

Maybe I've been confused by ODBMS product/sales documentation. At least three of them that I have looked at (Object Design, Ardent and Poet) seem to have fairly extensive XML API's as well as other tools that support xml storage in their databases. For example, Poet supplies a check-in/check-out utility that is used as a version control system for content management of the xml structure stored directly into the DB. They also supply a browser utility that directly accesses the DB, giving an xml tree navigation, and display - I assume they are using something like Microsoft's xml parser that renders to html and displays it.
I assume that they are providing a set of Java classes that model xml which is then stored directly in the DB (feed it an xml file and it stores an xml object graph representing the document). I assume upon retrieval it simply streams (maybe as simple as toString()) its xml representation to the xml consumer, who parses/renders it per dtd or whatever - no conversion processing is needed in the path until the consumer, keeping speed optimal (pushing expensive parsing work to the client, relieving a busy server with time to dish up more).

Versus storing some non-xml Java object in the database: you then retrieve the object from the database and wrap the information of the object into xml - and then ship it to some xml consumer, who then parses/renders it back into the non-xml object's form. It also seems to me that if the objects that you are storing are not xml objects, you have lost the concept of Content Management or at least made it more complex to implement.

I think this is where "betting the bank" comes in.

To architect the system the second way would be to code a class per possible unique xml element. You would then need to write classes to pull these "atomic" elements together etc. Upon retrieval you would then create the xml for transport. This would isolate the DB storage and the client from xml because it would be your own animal, giving you extensibility via normal Java class programming.

To architect with xml from the bottom up puts XML Content Management at the very root of the design. You are dependent upon the xml protocol (not your own custom objects) to give you extensibility. Custom tags, meta-data, naming, versioning, whatever else xml gives you, must be versatile enough to emulate the Java class hierarchies of complex inheritance and aggregation graphs (as used in the option above).
This allows for the same authoring tools used to develop content, to also develop navigation and other parameters that will be utilized at run time by the consumer of the xml. I'm assuming the big buy here is that code will only need to be written for the authoring tool, and the xml consumer. All delivery from the db to the client (even via complex n-tier systems) would require very little, or no coding by us. Client code would parse out the displayable portions to html and display it. An applet would obtain the custom tags, meta-data etc. to make runtime decisions on what to do, based on things that could happen as the user interacts with the page.

Our need: Author, name, store, revision, reuse, retrieve - pieces of documents, that can then be combined with other documents, which can in turn be combined with others ... Documents are composed of text, video, audio, graphics ... All the goodies of style sheets etc. would be used. Custom xml tags would not be published at this time - interoperability with the world is not the driving requirement - ease of transporting documentation + special controls from an n-tier DB system running our code to a thin client running our code is.

Meta data and custom tags would be used for several reasons - for example: enhance search and selection algorithms for authoring, bury navigational control data/logic that could be used at run time to help select the next element to display, bury management hooks that would trigger WIDL RPC to other processes based on runtime states etc.

I'm also assuming that an ODBMS could deliver up complex linked, deeply nested xml documents faster than open/read/closing hundreds of possible files to assemble some document. Concurrent open file handles also present a problem ...

Have I missed the boat on what the ODBMS companies with XML Content Management Systems have to offer me?

Chad

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Jeffrey E. Sussna
> Sent: Thursday, March 04, 1999 6:57 PM
> To: 'Chad Adams'; xml-dev@ic.ac.uk
> Subject: RE: Opinions requested
>
> I will not comment on the advisability of using an ODBMS, because
> 1) it's out of scope for this group, and 2) it's a highly
> religious topic. However, I will comment on the question of
> whether to store your data directly as XML, and confess that I
> don't understand the question. XML is a great interchange
> language; i.e., a way to move data between systems. Generally
> speaking, however, each particular system has its own optimal
> internal representation. In an RDBMS, for example, it's tables.
> In a Java program it's objects, and so forth. There is not
> (AFAIK) yet any such thing as an XDBMS (though you could consider
> a file system of XML documents plus a web server to resolve
> URL's to those documents as such a thing). Anyway, my approach
> would be to store data in the most natural format for the given
> storage technology, and define translations to and from XML to
> move data between systems.
>
> Jeff

Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com
Phone: 435-654-6304
fax: 435-654-1482

From wperry at fiduciary.com Fri Mar 5 07:23:19 1999
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun 7 17:09:40 2004
Subject: Opinions requested
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com>
Message-ID: <36DF864B.B458299D@fiduciary.com>

Marcelo Cantos wrote:

> "Jeffrey E. Sussna" wrote:
>
> > There is not (AFAIK) yet any such thing as an XDBMS
>
> I am continually surprised to hear remarks such as this. SIM _is_ an XDBMS (it is also an SGML, MARC, RTF, etc. database with structure and full content query capabilities). As an XDBMS it has weaknesses (it only supports predefined indexes and limited structure querying), but in some ways provides a model that is even richer than XML (it provides structure below element level, and has the concept of fields

In addition to this vision of an XML database, there has been much discussion of XML as a front end or a query-and-response framework for data stores, but I would argue that such applications of XML markup are not an XML database. A true XML database is shaped by the essential characteristics of XML itself: it should be freely eXtensible; it should be defined and manipulated by Markup; and it should be cast in a Document Structure within which Elements identify Data Constructs, and Attributes provide Data Characterization. Like XML itself, the XML database is fundamentally mismatched to the familiar storage and transmission frameworks of filesystem, relational table, object serialization or data stream.
In the first case, any item--document, data table, or executable--whether 'text' or binary--which is committed to storage in a filesystem is treated as a file: that is, as unitary and indivisible within the perspective and capabilities of the filesystem. A word processing program may, by opening a document, be able to identify and to manipulate as individual elements the sentences, paragraphs and chapters of that document. By contrast, the filesystem in which that document is stored reads, writes, renames, searches for or deletes the document as a whole. In XML terms, the filesystem sees the document as a single element--a root. Regardless of how many subelements we might mark up within that root, the filesystem--designed for a generic 'file-like' document--is capable of manipulating only one.

In a similar way, a relational table--and the database engine behind it--can store, index, or construct joins upon only those data records which correspond to the schema of the table. While it is possible to use SQL or proprietary database tools to rewrite an existing table to a different schema, that is substantially different from submitting to a database engine, as an entry to a particular table, a single record which follows a unique schema of its own.

In the terms of both filesystem and relational table, an XML document is effectively a BLOB, in that its specifically XML structure is outside the ability of either to discern or to make any use of. Just as, for example, with audio or video content more commonly recognized as BLOBs, the filesystem or relational database engine is obliged to invoke a particular, content-specific processor in order to understand, and then to implement, the structure conveyed by markup in every XML document. Yet this need for pre-defined, content-specific handlers obviates the benefits of XML as a general solution.
Indeed, it is not really XML at all if the markup possibilities are circumscribed by the need to conform to what a pre-defined handler can implement.

XML, by definition, is freely extensible. This fundamental characteristic trumps any hoped-for convenience in processing to be achieved by defining 'standard' tagsets, industry-wide 'domain' procedures, or normative namespace references. That this essential capability of XML is irreconcilably mismatched to conventional filesystems and relational databases means that if we are building true XML tools we are obliged to create new equivalents of the filesystem and the database which do conform to the extensible nature of XML.

'Internally', extensibility means that the structural definition of existing XML documents may be altered at any time by indicating, in a document instance, new subelements of the elements previously defined or, occasionally, consolidating--and eliminating--previously defined elements in favor of more general ones. This is not simple re-arrangement of the elements of an XML document, but a fundamental re-definition of its structure. 'Externally', the extensibility of XML means that documents, arriving from any number of (not necessarily well-known) sources, may claim recognition by our XML database engine and expect, for example, to be accepted as input data, solely because the document root element has a tag which matches one defined in our system. Of course, below that apparently familiar root element may lie subelements whose type we have not seen before, or which are structured in a different hierarchy than we expect, or whose tag names are unfamiliar variants of what we use 'internally'.

A true XML database engine must inherently and efficiently handle the demands of both this internal and external extensibility. Effectively this means that the data schema must (potentially) be rewritten with every new 'record' accepted, or altered, in the database.
That is, if we posit that those 'records' are XML documents then, as XML documents, they may be marked up at any time to a finer (or coarser) elemental granularity, and a true XML database engine must respond by reading, writing, querying, and generally processing them in sync with the markup. In the case of 'external' items--effectively data entry submitted to the XML database--the database engine must identify the schema with the data source. That is, it must understand that the markup of items originating from one source may be aliases of the markup in documents from another source and, again, may present a finer or coarser elemental granularity than analogous documents from a different source. What is missing in this, of course, is the traditional role of the DTD for validation. It is omitted because XML 1.0 defines two very different markup and processing disciplines, distinguished by whether there is a DTD, and in order to build XML tools it is necessary to choose which of these definitions we are following. XML is routinely introduced as both of its very different selves. Newcomers are usually first lured in with the promise of unlimited markup: define your own tags which exactly suit your unique situation. Only after they have bitten for that bait are they told about the limitations imposed by the DTD. Yet the fact is that XML 1.0 defines one XML in which the DTD is omitted, and a simple and logical projection of that definition leads to an XML where markup is freely extensible and the data schema is what the sum of the markup in the system at any moment implies. Respectfully, Walter Perry xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cadams at cascadecc.com Fri Mar 5 08:50:40 1999 From: cadams at cascadecc.com (Chad Adams) Date: Mon Jun 7 17:09:40 2004 Subject: Opinions requested In-Reply-To: <36DF864B.B458299D@fiduciary.com> Message-ID: <000301be66e5$33f1c860$01010101@development.cascade> Walter, Thanks for the reply. If I understand what you are saying, it does seem kind of weird that they would spec the DTD instead of just going with the schema - since that's what schema is for. Also, having taken the bait, my assumption was that any given XML document might be a mixture of both (i.e. several DTD schemes + several free-floating custom tags with schema, all mixed into one happy root). If the consumer of the file knows what they are looking at (either DTD- or custom-tag-wise), do it; otherwise ignore it. Is it not this simple? Your paragraph on "XML, by definition, is freely extensible ..." as well as the following paragraph describe what I hope the XML Content Management classes supplied by the ODBMS manufacturers would do for me. I'm not sure if this is considered "overloading" the functionality of Content Management, but I believe it is one of the concepts of XML. I not only want the implied authoring flexibility of content management (arrange text, video, audio, graphics etc. into segments and sub-segments) on the data store side, but also to embed custom elements (in or around the displayable elements) that determine some runtime programmatic behavior of the consumer of the document.
As yet another overloading but as a secondary functionality to the content management, I'm also hoping that the use of XML can be used in what you have implied might be an impure use - that of a query-and-response mechanism. If I can avoid licensing yet another product, to get mine to market ie. objectspace, weblogic, or coding to rmi or some other remoting technology, happy day! Am I looking for the silver bullet that does not exist? Chad > -----Original Message----- > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > W. E. Perry > Sent: Friday, March 05, 1999 12:23 AM > To: xml-dev@ic.ac.uk > Subject: Re: Opinions requested > > > Marcelo Cantos wrote: > > > "Jeffrey E. Sussna" wrote: > > > > > There is not (AFAIK) yet any such thing as an XDBMS > > > > I am continually surprised to hear remarks such as this. SIM > _is_ an XDBMS (it is also an SGML, MARC, RTF, etc. database with > structure and full content query capabilities). As an XDBMS it > has weaknesses (it only supports predefined indexes and limited > structure querying), but in some ways provides a model that is > even richer than XML (it provides structure below element level, > and has the concept of fields > > In addition to this vision of an XML database, there has been > much discussion of XML as a front end or a query-and-response > framework for data stores, but I would argue that such > applications of XML markup are not an XML database. A true XML > database is shaped by the essential characteristics of XML > itself: it should be freely eXtensible; it should be defined and > manipulated by Markup; and it should be cast in a Document > Structure within which Elements identify Data Constructs, and > Attributes provide Data Characterization. > > Like XML itself, the XML database is fundamentally mismatched to > the familiar storage and transmission frameworks of filesystem, > relational table, object serialization or data stream. 
In the > first case, any item--document, data table, or > executable--whether 'text' or binary--which is committed to > storage in a filesystem is treated as a file: that is, as > unitary and indivisible within the perspective and capabilities > of the filesystem. A word processing program may, by opening a > document, be able to identify and to manipulate as individual > elements the sentences, paragraphs and chapters of that document. > By contrast, the filesystem in which that document is stored > reads, writes, renames, searches for or deletes the document as a > whole. In XML terms, the filesystem sees the document as a single > element--a root. Regardless of how many subelements we might mark > up within that , the > filesystem--designed for a generic 'file-like' document, is > capable of manipulating only one. > > In a similar way, a relational table--and the database engine > behind it--can store, index, or construct joins upon only those > data records which correspond to the schema of the table. While > it is possible to use SQL or proprietary database tools to > rewrite an existing table to a different schema, that is > substantially different from submitting to a database engine, as > an entry to a particular table, a single record which follows a > unique schema of its own. > > In the terms of both filesystem and relational table, an XML > document is effectively a BLOB, in that its specifically XML > structure is outside the ability of either to discern or to make > any use of. Just as, for example, with audio or video content > more commonly recognized as BLOBs, the filesystem or relational > database engine is obliged to invoke a particular, > content-specific processor in order to understand, and then to > implement, the structure conveyed by markup in every XML > document. Yet this need for pre-defined, content-specific > handlers obviates the benefits of XML as a general solution. 
> Indeed, it is not really XML at all if the markup possibilities > are circumscribed by the need to conform to what a pre-defined > handler can implement. > > XML, by definition, is freely extensible. This fundamental > characteristic trumps any hoped-for convenience in processing to > be achieved by defining 'standard' tagsets, industry-wide > 'domain' procedures, or normative namespace references. That this > essential capability of XML is irreconcilably mismatched to > conventional filesystems and relational databases means that if > we are building true XML tools we are obliged to create new > equivalents of the filesystem and the database which do conform > to the extensible nature of XML. 'Internally' extensibility means > that the structural definition of existing XML documents may be > altered at any time by indicating, in a document instance, new > subelements of the elements previously defined or, occasionally, > consolidating--and eliminating--previously defined elements in > favor of more general ones. This is not simple re-arrangement of > the elements of an XML > document, but a fundamental re-definition of its structure. > 'Externally' the extensibility of XML means that documents, > arriving from any number of (not necessarily well-known) sources, > may claim recognition by our XML database engine and expect, for > example, to be accepted as input data, solely because the > document root element has a tag which matches one defined in our > system. Of course, below that apparently familiar root element > may lie subelements whose type we have not seen before, or which > are structured in a different hierarchy than we expect, or whose > tag names are unfamiliar variants of what we use 'internally'. > > A true XML database engine must inherently and efficiently handle > the demands of both this internal and external extensibility. 
> Effectively this means that the data schema must (potentially) be > rewritten with every new 'record' accepted, or altered, in the > database. That is, if we posit that those 'records' are XML > documents then, as XML documents, they may be marked up at any > time to a finer (or coarser) elemental granularity, and a true > XML database engine must respond by reading, writing, querying, > and generally processing them in sync with the markup. In the > case of 'external' items--effectively data entry submitted to the > XML database--the database engine must identify the schema with > the data source. That is, it must understand that the markup of > items originating from one source may be aliases of the markup in > documents from another source and, again, may present a finer or coarser > elemental granularity than analogous documents from a different source. > > What is missing in this, of course, is the traditional role of > the DTD for validation. It is omitted because XML 1.0 defines two > very different markup and processing disciplines, distinguished > by whether there is a DTD, and in order to build XML tools it is > necessary to choose which of these definitions we are following. > XML is routinely introduced as both of its very different selves. > Newcomers are usually first lured in with the promise of > unlimited markup: define your own tags which exactly suit your > unique situation. Only after they have bitten for that bait are > they told about the limitations imposed by the DTD. Yet the fact > is that XML 1.0 defines one XML in which the DTD is omitted, and > a simple and logical projection of that definition leads to an > XML where markup is freely extensible and the data schema is what > the sum of the markup in the system at any moment implies. > > Respectfully, > > Walter Perry > > > xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the > following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From s861766 at mail86.yzu.edu.tw Fri Mar 5 09:31:53 1999 From: s861766 at mail86.yzu.edu.tw (Ephese Yang) Date: Mon Jun 7 17:09:40 2004 Subject: A question about XSL/IE5... Message-ID: <36DF9B9F.6A2087A4@mail86.yzu.edu.tw> Hi: I am new in xsl and I have some question about xsl and IE5. Does IE5 beta2 support the flow object in xsl spec.?     ex:  fo:block How can I display a figure in xml file using xsl? Can somebody give me an example? Thanks! xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From santi at qsystems.es Fri Mar 5 09:59:10 1999 From: santi at qsystems.es (Santi) Date: Mon Jun 7 17:09:40 2004 Subject: XML Tutorial. Message-ID: <01BE66F7.1D57C840@Pc Santi.QSYSTEMS> Hello, I've started some days ago in XML. 
Please, if somebody knows the existence of any XML tutorial, or any other way to introduce me in XML I will be grateful. Thank you very much in advance. Santi Rivas santi@qsystems.es xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david.hitch at dial.pipex.com Fri Mar 5 10:21:55 1999 From: david.hitch at dial.pipex.com (David Hitchcock) Date: Mon Jun 7 17:09:40 2004 Subject: XML tutorial Message-ID: <01be66e9$3f1d23c0$0100007f@ketlux03> Hi Santi We have a number of resources including links to tutorials on the El.pub website at: http://www.pira.co.uk/IE . The XML material is on the standards page: http://www.pira.co.uk/IE/top011a.htm and there is also a comprehensive list of commercial and shareware products on the products page: http://www.pira.co.uk/IE/base09.htm#SGML You may also wish to sign up for the free weekly information service: El.pub Weekly which keeps you informed on a weekly basis of updated news on the site. You can subscribe from the welcome page at: http://www.pira.co.uk/IE The site is run by IESERV2 which supports the advanced electronic publishing research and development projects throughout Europe, run by the Information Engineering sector of the European Commission's DG XIII/E under the Telematics Applications Programme. 
Best --> David ********************************* David Hitchcock IESERV2 tel: +44/ (0)181 255 7084 +44/ (0)181 255 7085 email: david.hitch@dial.pipex.com web: http://www.pira.co.uk/IE ********************************* El.pub: http://www.pira.co.uk/IE Interactive publishing - news and resources **Join our developing community subscribe to the *NEW* El.pub Weekly a *free* text email update service which includes the week's news items and associated URLs** xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Fri Mar 5 12:44:20 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:09:40 2004 Subject: XML Tutorial. Message-ID: <003b01be6705$d5e059a0$0300000a@othniel.cygnus.uwa.edu.au> >I've started some days ago in XML. >Please, if somebody knows the existence of any XML tutorial, or any other way to introduce me in XML I will be grateful. see http://www.xmlinfo.com/newcomers/ for links introducing XML. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From robin at isogen.com Fri Mar 5 12:47:13 1999 From: robin at isogen.com (Robin Cover) Date: Mon Jun 7 17:09:40 2004 Subject: XML Tutorial. 
In-Reply-To: <01BE66F7.1D57C840@Pc Santi.QSYSTEMS> Message-ID: On Fri, 5 Mar 1999, Santi wrote: > Hello, > > I've started some days ago in XML. > Please, if somebody knows the existence of any XML tutorial, or any other way to introduce me in XML I will be grateful. IBM has a nice XML tutorial at: http://www.software.ibm.com/xml/education/tutorial-prog/writing.html You may also find other useful introductions in the list at: http://www.oasis-open.org/cover/xmlIntro.html -robin From mintert at irb.informatik.uni-dortmund.de Fri Mar 5 13:44:01 1999 From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert) Date: Mon Jun 7 17:09:40 2004 Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd In-Reply-To: Your message of Wed, 03 Mar 1999 14:30:52 -0500. Message-ID: <199903051343.OAA07560@brown.informatik.uni-dortmund.de> > Add -c/tmp/sm/sp-1.3/pubtext/xml.soc to the command line so nsgmls > reads the xml.soc catalog that tells it to use the SGML Declaration > for XML, xml.dcl. That SGML Declaration tells nsgmls what hexadecimal > character references look like. Without it, things like &#x2014; are > being interpreted as per ISO 8879:1986, which isn't doing you or the > parser any good. > > Regards, > > > Tony Graham Thanks to everybody who answered my question. Thanks to Tony. Yes, you're right, with -c... it works. I was a bit confused about that because I have 'Set the SGML_CATALOG_FILES environment variable to point to the file pubtext/xml.soc' as explained in http://www.jclark.com/sp/xml.htm. In fact I used my own old catalog file.
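For readers unfamiliar with the notation in Tony's reply: in XML a hexadecimal character reference has the form &#x followed by hex digits and a semicolon, so &#x2014; names the character U+2014. The sketch below is only an illustration using Python's standard xml.etree.ElementTree, not nsgmls, and the tiny p documents and the helper name resolve_text are hypothetical. It shows an XML parser resolving the hexadecimal form and the equivalent decimal form to the same character.

```python
import xml.etree.ElementTree as ET

def resolve_text(xml_text):
    """Parse a small document and return the root element's text content."""
    return ET.fromstring(xml_text).text

# Hexadecimal character reference for U+2014.
hex_form = resolve_text("<p>a&#x2014;b</p>")
# The same code point written as a decimal reference (0x2014 == 8212).
dec_form = resolve_text("<p>a&#8212;b</p>")
```

Both calls yield the same three-character string, with U+2014 between the two letters.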
Now I checked the xml.dcl that I used and the one that is part of sp: Unfortunately I used "ISO 8879:1986 (ENR)" instead of "ISO 8879:1986 (WWW)" :-( Thanks for your help! Bye, Stefan. +-----------------------------------------------------------+ Stefan Mintert UniDo: mintert@irb.informatik.uni-dortmund.de private: stefan@mintert.com +-----------------------------------------------------------+ "let the music keep our spirits high..." (Jackson Browne) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Fri Mar 5 15:53:32 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:40 2004 Subject: XML MULTI-Fragment Interchange? In-Reply-To: <004001be66a0$5aff8bd0$2ee044c6@arcot-main> Message-ID: <199903051526.KAA17396@hesketh.net> At 04:37 PM 3/4/99 -0800, Don Park wrote: >>Of course, if you seriously believe that the spec is useless unless it >>allows multiple fragment bodies per package, then that is a comment you >>should make and attempt to support. We don't want to come out with a >>spec folks think is useless, but we were trying to keep it as minimal >>as possible while still addressing the problem we defined as our scope. > > >I found the spec very useful, timely, and clear. It was not my intention to >delay, divert, or hamper the progress of the XML Fragment spec. It was also >not my intention to imply that the WG overlooked something important. > >I withdraw my comment since it does not fall under the intended scope of the >spec. 
While you may be withdrawing the comment because of the scope the XML Fragment group has set itself, we still need a way to represent multiple fragments, whether or not the W3C considers that appropriate to the scope of this particular working group. Sounds like we need to get the XML streaming thread going again, and start working out ways to represent multiple documents/fragments. It seems like a real need. Is anyone interested in this issue going to be at XTech next week? It'd be culture shock to actually talk, I know, but that might be a good place to get a spec for these streaming XML issues kickstarted. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com From asmith at drumbeat.com Fri Mar 5 17:19:34 1999 From: asmith at drumbeat.com (Smith, Adrian) Date: Mon Jun 7 17:09:41 2004 Subject: Opinions requested Message-ID: <70B92603FC2CD21197D600609778A80D0AE64D@elemental2> There actually is an XDBMS. It predates XML, dating back to around 1965/1966. The database was called "IMS", for Information Management System; it was created by IBM and used a hierarchical model for the data. It had all the same characteristics as XML, with almost the exact same set of constructs and shortcomings. Thanks! Adrian Worthless. -Sir George Biddell Airy, KCB, MA, LLD, DCL, FRS, FRAS (Astronomer Royal of Great Britain), estimating for the Chancellor of the Exchequer the potential value of the "analytical engine" invented by Charles Babbage, September 15, 1842. > -----Original Message----- > From: Jeffrey E.
Sussna [SMTP:jes@kuantech.com] > Sent: Thursday, March 04, 1999 5:57 PM > To: 'Chad Adams'; xml-dev@ic.ac.uk > Subject: RE: Opinions requested > > I will not comment on the advisability of using an ODBMS, because 1) > it's out of scope for this group, and 2) it's a highly religious > topic. However, I will comment on the question of whether to store > your data directly as XML, and confess that I don't understand the > question. XML is a great interchange language; i.e., a way to move > data between systems. Generally speaking, however, each particular > system has its own optimal internal representation. In an RDBMS, for > example, it's tables. In a Java program it's objects, and so forth. > There is not (AFAIK) yet any such thing as an XDBMS (though you could > consider a file system of XML documements plus a web server to resolve > URL's to those documents as such a thing). Anyway, my approach would > be to store data in the most natural format for the given storage > technology, and define translations to and from XML to move data > between systems. > > Jeff > > -----Original Message----- > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf > Of > Chad Adams > Sent: Thursday, March 04, 1999 5:17 PM > To: xml-dev@ic.ac.uk > Subject: Opinions requested > > > Forgive me for the generic question, I'm to the point of betting the > bank on > XML, and I'm looking for a pat on the back, or a voice of warning.... > > We are starting from scratch on our next generation product, from what > I've > read and seen - xml seems to fit the bill (Content Management, mixed > with > WIDL RPC functionality seems right up our alley). I'm looking hard at > ODBMS > systems and laying out the DB via xml (storing xlm directly). We have > a > wealth of in-house Java and COM/DCOM experience, but none with ODBMS > or XML. > > Do I understand it correctly that I at an item level, I can: > 1. name it (URI)? > a. possible supply some security to it? > 2. revision it? > 3. 
meta-data it? > a. can meta-data have meta-data? > > Would I be foolish to base my whole object system storage on xml, or > on > ODBMS for that matter? Are they cooked, are they ready for real world > apps? > > Once again, I'm sorry for the generic question, I have read the FAQ's, > the > ODBMS webpages, several books etc. I'm looking for the advice of > those in > the trenches - Is it safe to make XML the foundation of my new > product? > > Should I grab a shovel, and jump in the trenches with you, or is this > a deep > dark hole? > > > Thanks in advance, for all who might reply. > > > Chad Adams > Payback Training Systems > Email: cadams@cascadecc.com > Phone: 435-654-6304 > fax: 435-654-1482 > > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Veillard at w3.org Fri Mar 5 17:20:30 1999 From: Daniel.Veillard at w3.org (Daniel Veillard) Date: Mon Jun 7 17:09:41 2004 Subject: XML MULTI-Fragment Interchange? In-Reply-To: <199903051526.KAA17396@hesketh.net>; from Simon St.Laurent on Fri, Mar 05, 1999 at 10:29:09AM -0500 References: <004001be66a0$5aff8bd0$2ee044c6@arcot-main> <199903051526.KAA17396@hesketh.net> Message-ID: <19990305121926.E22737@w3.org> On Fri, Mar 05, 1999 at 10:29:09AM -0500, Simon St.Laurent wrote: > At 04:37 PM 3/4/99 -0800, Don Park wrote: > >I withdraw my comment since it does not fall under the intended scope of the > >spec. > > While you may be withdrawing the comment because of the scope the XML > Fragment group has set itself, we still need a way to represent multiple > fragments, whether or not the W3C considers that appropriate to the scope > of this particular working group. > > Sounds like we need to get the XML streaming thread going again, and start > working out ways to represent multiple documents/fragments. It seems like > a real need. Hum, I have been following the streaming/fragment thread. However I have the feeling that even multiple fragment body extensions would not solve the problem you were facing. If I didn't get the discussion wrong, it seems that you rather tried to make one very big (i.e. stream) document from multiple sources while the scope of the fragment work was just the opposite, i.e. how to extract and ship a piece of a very big document. > Is anyone interested in this issue going to be at XTech next week? 
It'd be > culture shock to actually talk, I know, but that might be a good place to > get a spec for these streaming XML issues kickstarted. I will be around, Daniel -- [Yes, I have moved back to France !] Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes | Today's Bookmarks : Tel : +33 476 615 257 | 655, avenue de l'Europe | Linux, WWW, rpmfind, Fax : +33 476 615 207 | 38330 Montbonnot FRANCE | rpm2html, XML, http://www.w3.org/People/W3Cpeople.html#Veillard | badminton, and Kaffe. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at oreilly.com Fri Mar 5 17:26:09 1999 From: crism at oreilly.com (Chris Maden) Date: Mon Jun 7 17:09:41 2004 Subject: A question about XSL/IE5... In-Reply-To: <36DF9B9F.6A2087A4@mail86.yzu.edu.tw> (message from Ephese Yang on Fri, 05 Mar 1999 16:53:52 +0800) Message-ID: <199903051532.KAA27149@ruby.ora.com> [Ephese Yang] > I am new in xsl and I have some question about xsl and IE5. > Does IE5 beta2 support the flow object in xsl spec.? >     ex:  fo:block IE5 does not support XSL formatting objects. Tell Microsoft you are interested that it do so. XSL questions are best discussed on the xsl-list: . > How can I display a figure in xml file using xsl? > Can somebody give me an example? Since MSIE can only display HTML, try creating an HTML element in your stylesheet. -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jmcdonou at library.berkeley.edu Fri Mar 5 17:48:34 1999 From: jmcdonou at library.berkeley.edu (Jerome McDonough) Date: Mon Jun 7 17:09:41 2004 Subject: Opinions requested In-Reply-To: <36DF4CE1.7F4D3681@simdb.com> References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> Message-ID: <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu> At 02:17 PM 3/5/1999 +1100, Marcelo Cantos wrote: >>"Jeffrey E. Sussna" wrote: >> >> There is not (AFAIK) yet any such thing as an XDBMS (though you could consider >>a file system of XML documements plus a web server to resolve URL's to those >>documents as such a thing). > >I am continually surprised to hear remarks such as this. SIM _is_ an XDBMS >(it is also an SGML, MARC, RTF, etc. database with structure and full content >query capabilities). I think one of the reasons you hear these kinds of remarks is that the terminology surrounding these systems is used differently by different folks. For instance, from what I know of SIM, I wouldn't call it a DBMS system of any kind, as I don't believe (I could be wrong) it supports referential integrity constraints, concurrency control, recoverable transactions, and other features I would expect out of a reasonable DBMS. Granted it has hooks that allow you to get it to work with a DBMS that can provide all that, but that doesn't make SIM itself a DBMS. I would instead class SIM as an information retrieval system, and a pretty damned good one at that. 
However, SIM performs as well as it does in great part because it's not doing the extra work that a DBMS should do, and which add greatly to retrieval time from database systems (as well as limiting their ability to handle complex data formats gracefully). This isn't to knock SIM; anyone who needs a flexible information retrieval system should be taking a very serious look at it. The Z39.50 support alone puts it way ahead of the market as far as I'm concerned. But I don't think SIM is evidence that there are DBMS systems that handle SGML/XML well; I don't think they do. Oracle may very well be getting there with its latest release, but I suspect there's still a lot of work to be done there. Jerome McDonough -- jmcdonou@library.Berkeley.EDU | (......) Library Systems Office, 386 Doe, U.C. Berkeley | \ * * / Berkeley, CA 94720-6000 (510) 642-5168 | \ <> / "Well, it looks easy enough...." | \ -- / SGNORMPF!!! -- From the Famous Last Words file | ||||
From tbray at textuality.com Fri Mar 5 21:20:14 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:41 2004 Subject: Tell the world about your new language Message-ID: <3.0.32.19990305131959.00b65280@pop.intergate.bc.ca> Check out: http://www.usenix.org/events/dsl99/ -Tim
From db at eng.sun.com Fri Mar 5 22:41:58 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:41 2004 Subject: ModSax Suggestion References: Message-ID: <36E05C67.607F4C27@eng.sun.com> Interesting suggestion for a big hole in the parts of the Java API set that are more or less "standard" at this point -- SAX and DOM. One comment though: I've found that it's important to be able to have options controlling how the DOM tree is built. For example, whether to discard ignorable spaces, or do namespace conformance enforcement, or try to get CDATA sections (comments, etc). Accordingly, I think being able to do a bit more than this will be important. - Dave MikeDacon@aol.com wrote: > > Hi Everyone, > > While SAX does a good job as an event-based interface > to Parsers, it would be nice to add a few methods to > receive a DOM representation back from a reference to an org.xml.sax.Parser. > > Something like: > > org.w3c.dom.Document parse(InputSource is, boolean events) throws > SAXException; > org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws > SAXException; > /* the events boolean would be to turn on/off event calls. */ > > If a SAXDriver did not want to produce a DOM, it could either simply > return a null or a method added like: > > boolean isDomCapable(); > > The above would let me use the ParserFactory to seamlessly switch > between Parser implementations and get a DOM tree without building > one myself. It is fruitless for me to build a DOM tree when almost all > the parser implementations provide that ability. I just want a way to get > at that functionality in a simple and standard way (thus SAX). > > Thoughts? > > - Mike > ----------------------------------------------- > Michael C. Daconta > Author of Java 2 and JavaScript for C/C++ Programmers > Author of C++ Pointers and Dynamic Memory Management > Sun Certified Java Programmer and Developer > http://www.gosynergy.com
From zmin at atpage.com Sat Mar 6 01:04:43 1999 From: zmin at atpage.com (min zheng) Date: Mon Jun 7 17:09:41 2004 Subject: Accessing DTD info. in IE5 References: <001701be65c7$febb5620$47be1990@mscardin-pc.us.oracle.com> Message-ID: <002d01be676d$eee8e850$f66f6f0a@atpage> Is DTD information accessible through the IE5 DOM? I took it for granted because I could do it with the old MSXML for Java used in IE4. However, when I really wanted to access DTD info in IE5, I couldn't find it anywhere. Is DTD information exposed in the IE5 DOM? Thanks, Min
From marcelo at mds.rmit.edu.au Sat Mar 6 06:45:35 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:41 2004 Subject: Opinions requested In-Reply-To: <36DF864B.B458299D@fiduciary.com>; from W. E. Perry on Fri, Mar 05, 1999 at 02:22:51AM -0500 References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <36DF864B.B458299D@fiduciary.com> Message-ID: <19990306153959.A22308@io.mds.rmit.edu.au> Thank you, Walter, for the erudite response. I am left in a bit of a quandary as to how or even whether to respond. This is in large part due to the fact that, while your post was in response to mine, it is not immediately clear to me whether you are addressing my comments specifically or rather the general theme of this thread. Having the vague impression (though no firm conviction) that it is in response to my claims that you waxed eloquent on the theme of what defines an XML database, I will proceed to provide commentary, and occasionally direct response/rebuttal, to a smattering of your points. My humble apologies, Walter, if I have in any way misconstrued your post. On Fri, Mar 05, 1999 at 02:22:51AM -0500, W. E. Perry wrote: > Marcelo Cantos wrote: > > > "Jeffrey E. Sussna" wrote: > > > > > There is not (AFAIK) yet any such thing as an XDBMS > > > > I am continually surprised to hear remarks such as this. SIM _is_ > > an XDBMS (it is also an SGML, MARC, RTF, etc. database with > > structure and full content query capabilities).
As an XDBMS it > > has weaknesses (it only supports predefined indexes and limited > > structure querying), but in some ways provides a model that is > > even richer than XML (it provides structure below element level, > > and has the concept of fields > > In addition to this vision of an XML database, there has been much > discussion of XML as a front end or a query-and-response framework > for data stores, but I would argue that such applications of XML > markup are not an XML database. A true XML database is shaped by the > essential characteristics of XML itself: it should be freely > eXtensible; it should be defined and manipulated by Markup; and it > should be cast in a Document Structure within which Elements > identify Data Constructs, and Attributes provide Data > Characterization. It seems here that I may have provided an incorrect characterisation of what we do, and hence given Walter cause to provide some qualifiers on anyone wishing to define themselves as an XML database. On this point, I must make it quite clear that SIM is _not_ an XML front end to a data store. It is an XML (etc.) document repository. One additional, crucial point is that SIM _is_ extensible (though I will qualify this presently). It can be defined to accept markup to any degree of strictness or laxity (within the bounds of well-formedness or validity, of course). It can be setup to accept any and all markup and do _something_ intelligent with it. It can also be configured to make stringent demands (well in excess of the DTD, both with respect to strictness and complexity of constraints) of its inputs. This quality of SIM renders the product amenable to both of the major application streams of XML: data and documents. It can provide strict data validation as well as extensibility. Now, by way of qualification, SIM does not provide free-form runtime extensibility (runtime from the administrator's perspective, not ours). 
Rather it provides the application developer with the requisite tools to define, at design time, what structures will be supported. For instance, you cannot, with SIM, perform queries such as, "find me all sections containing subsections with an attribute of security="public" and at least one paragraph with fewer than four words in it" The semantic complexity of such a query is beyond the scope of our product. However, if one were to know in advance that queries about the minimum paragraph length in public subsections will be commonplace in the particular application one is developing, then SIM could, at design time, be told to create an appropriate index and then the above query could, indeed, be performed. In short, SIM _is_ extensible, but the extensibility is bound somewhat earlier than runtime. In practice, clients never complain about this quality. In fact, it is usually a benefit rather than a hindrance, for the same reason that compile time type checking is a good thing to have in a programming language. I also take issue with Walter's remark that an XML database should be manipulated by and defined through the medium of XML. This sounds analogous to suggesting that relational databases should be defined and manipulated by markup. Now, it is true that relational schema are, themselves, typically stored as relations (one will, for example, find a ".TABLES" table, a ".FIELDS" table, a ".INDEXES" table, etc. inside a database). However, it seems to me patently absurd to suggest that SQL (whether DML or DDL) be expressed in terms of tuples and relations. Now, while it does not seem likewise absurd to suggest that XML queries and data definition constructs be defined as XML, the truth of such a suggestion is anything but self-evident. Why should one not use an SQL-like language to define and query XML databases? There may or may not be merit in such an approach, but it seems no more or less appropriate than a query/data definition language cast in XML. 
Indeed, many of the query language position papers at W3C do not use XML syntax. Data definition and query languages are meta-constructs. They are not part of the data, but rather operate on the data and structures. This suggests that while it may be possible to fold the system in on itself by expressing meta-structure as data, it would be unwise to proceed down this path in _a priori_ fashion (Now, have I completely missed Walter's point here? I'm not sure.) > Like XML itself, the XML database is fundamentally mismatched to the > familiar storage and transmission frameworks of filesystem, > relational table, object serialization or data stream. In the first > case, any item--document, data table, or executable--whether 'text' > or binary--which is committed to storage in a filesystem is treated > as a file: that is, as unitary and indivisible within the > perspective and capabilities of the filesystem. A word processing > program may, by opening a document, be able to identify and to > manipulate as individual elements the sentences, paragraphs and > chapters of that document. By contrast, the filesystem in which > that document is stored reads, writes, renames, searches for or > deletes the document as a whole. In XML terms, the filesystem sees > the document as a single element--a root. Regardless of how many > subelements we might mark up within that , the > filesystem--designed for a generic 'file-like' document, is capable > of manipulating only one. One must be careful, here, to discriminate between interfaces and implementations. I basically agree with all of Walter's points in the above paragraph, but would add that many systems store conceptual XML documents as files. Our system uses a highly tuned variable length record manager (unsurprisingly named the VLRM) to store documents and fragments of any size in a highly efficient manner (both in terms of size and speed). Consequently, we store entire documents for the most part. 
If parsing time starts to weigh heavily due to retrieval of excessively large documents (the entire Australian Tax Legislation, say, or a complete Boeing Aircraft Maintenance Manual), then we fragment the documents to a level where parsing is no longer a bottleneck. In all of this, however, SIM can always treat the XML as XML. The developer always sees trees, not files, or BLOB's. It doesn't matter how it is stored in the background; that is an implementation issue. The one caveat with our product is that fragmented documents cannot be treated as a conceptual whole without physically rejoining the parts. This is one thing which OODBMS's do better than us at present, though we are looking at ways to provide that additional level of abstraction (we are also considering the usefulness of doing so, since fragments are more commonly the unit of interest, rather than the entire document). > In the terms of both filesystem and relational table, an XML > document is effectively a BLOB, in that its specifically XML > structure is outside the ability of either to discern or to make any > use of. Just as, for example, with audio or video content more > commonly recognized as BLOBs, the filesystem or relational database > engine is obliged to invoke a particular, content-specific processor > in order to understand, and then to implement, the structure > conveyed by markup in every XML document. Yet this need for > pre-defined, content-specific handlers obviates the benefits of XML > as a general solution. Indeed, it is not really XML at all if the > markup possibilities are circumscribed by the need to conform to > what a pre-defined handler can implement. I disagree with the last sentence above. Not from the pedagogical perspective (which seems quite evident in Walter's prose, and with which I largely sympathise), but from the pragmatic perspective.
Yes, the purist will rightly decry the notion of predefinition of structure in an ostensibly XML-friendly environment, but the end-user comes along and not only accepts, but vociferously demands that his environment be constrained. The user doesn't want flexibility to store anything, she wants the flexibility only to store what she wants to store. The serious user of XML does not have a heterogeneous collection of vaguely defined documents with a motley crew of DTD's and well-formed markup. Most users have a well defined data set for which they want to define efficient structures for storage and retrieval (if they aren't interested in efficiency then their problem isn't particularly interesting -- any tool will do). In the few cases where they do have arbitrary structure to deal with, more often than not they are only interested in the content and are likely to throw the structure away. After all, what is the use of structure if you don't know, say, whether the prolog element contains an abstract element, or whether "date" attributes refer to creation time, last modification time, or effectivity (or, worse still, whether they are in U.S., Australian or international format)? In the real world, I suspect that cases where structure is arbitrary but important will be few and far between. This is borne out by the almost complete absence of demand for arbitrary structure querying capability from our clients or potential clients. It just never seems to be an issue. A qualifier is also in order for the above remarks, lest there be a misunderstanding. XML tools, in general, must be extensible and accept any and all valid and/or well-formed inputs. My comments specifically address the issue of repositories (DBMS's). XML may be extensible, but it, too, expresses the notion of constraint through the concept of DTD's. Databases, likewise, not only can, but should constrain the inputs, both for simplicity and efficiency.
Perhaps this is, after all, what Walter meant when repudiating the idea of predefined handlers. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ From marcelo at mds.rmit.edu.au Sat Mar 6 08:46:24 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:41 2004 Subject: Opinions requested In-Reply-To: <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>; from Jerome McDonough on Fri, Mar 05, 1999 at 09:37:29AM -0800 References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu> Message-ID: <19990306154022.B22308@io.mds.rmit.edu.au> On Fri, Mar 05, 1999 at 09:37:29AM -0800, Jerome McDonough wrote: > At 02:17 PM 3/5/1999 +1100, Marcelo Cantos wrote: > >>"Jeffrey E. Sussna" wrote: > >> > >> There is not (AFAIK) yet any such thing as an XDBMS (though you > >> could consider a file system of XML documements plus a web server > >> to resolve URL's to those documents as such a thing). > > > >I am continually surprised to hear remarks such as this. SIM _is_ > >an XDBMS (it is also an SGML, MARC, RTF, etc. database with > >structure and full content query capabilities). > > I think one of the reasons you hear these kinds of remarks is that > the terminology surrounding these systems is used differently by > different folks.
For instance, from what I know of SIM, I wouldn't > call it a DBMS system of any kind, as I don't believe (I could be > wrong) it supports referential integrity constraints, concurrency > control, recoverable transactions, and other features I would expect > out of a reasonable DBMS. Granted it has hooks that allow you to > get it to work with a DBMS that can provide all that, but that > doesn't make SIM itself a DBMS. I would instead class SIM as an > information retrieval system, and a pretty damned good one at that. > However, SIM performs as well as it does in great part because it's > not doing the extra work that a DBMS should do, and which add > greatly to retrieval time from database systems (as well as limiting > their ability to handle complex data formats gracefully). Thank you, Jerome, for the candid and quite fair assessment of SIM. On the point of referential integrity, you are quite right: there is no built-in support. With our new event hook mechanism (similar to the triggers found in most relational systems), though, one will be able to attach event handlers to various update operations, and prevent them from completing in the event of a referential integrity violation. This probably wouldn't work together with concurrency controls (though this will be moot when transaction support comes in). However, in one particular project, we have put in referential integrity control using a single query per reference as part of the check-in mechanism. Another project generates references only dynamically, at query time, effectively with a single reverse-reference index lookup. The problem with referential integrity checking is that sometimes you need to be able to manage broken data, and this is more often the case with documents than with the more typical applications of RDBMS technology (financial transactions etc).
Of course when you store whole documents instead of unnaturally breaking them up into millions of tiny pieces, you don't have nearly the same referential integrity problems in the first place. With respect to concurrency control you are mistaken. We support short term locks, which prevent individual records, at least, from ever entering an undefined state under concurrent loads. These locks can be held as long as desired, but cannot persist beyond the lifetime of a session. Long term locks (which outlive the session) are in the offing, and stand a good chance of getting into release 3.0 (scheduled for mid-year, I think -- it could be earlier). Transactions we most definitely do not support. We do, however, provide recovery through log files, which record server activity and can be played back in a batch load operation. It's a little crude (you make the server read-only, back it up, and start a new log file. When you crash, restore the last backup and replay the log) but it is safe and effective. More important than any specifics, however, is the issue of what you call a DBMS. To me, a DBMS is a database management system (seems painfully obvious, but I think it bears repeating). You may argue that a product is not a DBMS if it does not support feature X, and I don't entirely disagree. When one talks of a DBMS one is conjuring up a certain image in the mind of the listener, and that image may well include feature X. To be fair to SIM, however, the essence of a DBMS is that it manages a collection of data. If it doesn't support transactions, this does not entail that it does not manage data. Rather it simply has limits on the way the data is managed (i.e. it doesn't manage data as well as one would like). You clearly believe that transaction support is part of the essence of what makes a DBMS. I disagree, indeed, I profoundly disagree. There is nothing in the concept of a database that mandates any such requirement. 
Rather I would say that transaction support is an important issue for any _good_ DBMS. Likewise for referential integrity and concurrency (and, for that matter, support for declarative queries, use of indexes, a rich set of fundamental data types, etc.). If I recall correctly, dBase III was generally acknowledged to be a DBMS though it lacked most of these requirements, and could barely even call itself relational! Now, don't get me wrong here. I am not trying to defend SIM by deprecating the features you demand. They are very important and highly desirable features in a DBMS (the fact that they are amazingly difficult to do well is of no concern to the user). Their absence in SIM is of ongoing concern to us. Furthermore it is far from satisfying to be able to insist that SIM fits into a strict, minimalist definition of a DBMS if it lacks features that are typically associated with DBMS's. One of the primary reasons they are not in at this stage is that, as you pointed out so well, the primary focus of SIM has always been performance and scalability; and all of the aforementioned features can have a significant impact on performance if implemented naively (transaction support, in particular, is an onerous requirement, though by no means untenable). SIM is not a full featured DBMS. But it is not a mere information retrieval system either. It does support recovery (though not full transaction support), it does support concurrency, and it can be coerced to support referential integrity. It also bears mentioning that you don't have to talk out to an RDBMS to do any of these things. In fact the only use I have heard of for our ODBC capability is one client who wanted to access a personnel database for authentication purposes (it had nothing to do with the database server per se). I guess this all boils down to what's in a name. At the end of the day, it is far more important to know what a product does and does not do than what you call it.
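The backup-and-replay recovery described earlier in this message (make the server read-only, back it up, start a new log file; after a crash, restore the last backup and replay the log) might be sketched roughly as follows. This is only an illustrative Python sketch under assumed names: LoggedStore, recover, the in-memory dict and the JSON journal are all hypothetical, and nothing here reflects how SIM actually implements its log files.

```python
import copy
import json


class LoggedStore:
    """Toy record store that journals every update (hypothetical, not SIM's API)."""

    def __init__(self):
        self.records = {}  # record id -> document text
        self.log = []      # journal of updates made since the last backup

    def put(self, key, value):
        # Journal the update first, then apply it.
        self.log.append(json.dumps({"op": "put", "key": key, "value": value}))
        self.records[key] = value

    def delete(self, key):
        self.log.append(json.dumps({"op": "delete", "key": key}))
        self.records.pop(key, None)

    def backup(self):
        """Snapshot the records and start a new, empty log."""
        snapshot = copy.deepcopy(self.records)
        self.log = []
        return snapshot


def recover(snapshot, log):
    """Rebuild a store by restoring the last backup and replaying the log."""
    store = LoggedStore()
    store.records = copy.deepcopy(snapshot)
    for line in log:
        entry = json.loads(line)
        if entry["op"] == "put":
            store.put(entry["key"], entry["value"])
        else:
            store.delete(entry["key"])
    return store
```

Replaying goes through the same put and delete calls as live updates, so the recovered store ends up with the same journal it had before the crash. Note that this gives recovery only; it does not group updates into transactions.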
> This isn't to knock SIM; anyone who needs a flexible information > retrieval system should be taking a very serious look at it. The > Z39.50 support alone puts it way ahead of the market as far as I'm > concerned. But I don't think SIM is evidence that there are DBMS > systems that handle SGML/XML well; I don't think they do. Oracle > may very well be getting there with its latest release, but I > suspect there's still a lot of work to be done there. I am sceptical that any RDBMS vendor can come to the party in terms of performance. Past attempts to try to force text into a relational, table or object based paradigm have not reaped great success (Oracle's ConText comes to mind as an example of how forcing a square peg into a round hole requires sacrificing the edges of performance). I would be surprised if any of the major database vendors would be prepared to venture away from their core competency (the relational model) to address the performance issues. But why parse XML to split it up into tables when you can store the XML directly? Why build thousands of index entries to system generated element ID's so that you can do join's to build up an XML fragment, when you can build a single index and pull the fragment in its entirety out of the document from which it comes? Why use inferior content indexing technology taking up to 10 to 20 times the size of the data being indexed when you can use compressed inverted files which take between 15% (document level index) and 50% (multi-level word position index) the size of the data? And all this with faster update speed than many standard text retrieval systems. There is an additional overhead in the relational paradigm which has nothing to do with transactions, concurrency control, or referential integrity checking. That cost is that relational tables do not map cleanly onto hierarchical documents (or data collections to pick up on another thread). 
Every fragment you insert, update, or remove has to be taken apart to map it onto some underlying representation, modified piece by piece, and then reassembled to be delivered. I strongly disagree that SIM doesn't handle SGML/XML well. In the five years of successfully selling SIM, no customer has ever replaced SIM with another product. In fact none of them have even mentioned to us that they ever considered replacing SIM. This in itself is remarkable given that, because our customers use SIM to store their SGML/XML natively, they can get the data out of SIM much more easily than if it were mapped onto some proprietary internal database format. People buy SIM because it is flexible enough to do whatever they need to do with their XML/SGML. It doesn't force them to adopt a non-XML/SGML approach. It doesn't force them to translate their data into some proprietary format in order to interact with the data. It deals directly with the XML. Precisely what the original post was asking for, in fact. Cheers, Marcelo P.S.: Some thanks go to my colleague, Tim Arnold-Moore, for providing some of the content (including the closing) for this article. -- http://www.simdb.com/~marcelo/
From david at megginson.com Sat Mar 6 11:31:29 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:42 2004 Subject: Opinions requested In-Reply-To: <19990306154022.B22308@io.mds.rmit.edu.au> References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu> <19990306154022.B22308@io.mds.rmit.edu.au> Message-ID: <14049.4226.895273.99370@localhost.localdomain> Marcelo Cantos writes: > More important than any specifics, however, is the issue of what you > call a DBMS. To me, a DBMS is a database management system (seems > painfully obvious, but I think it bears repeating). You may argue > that a product is not a DBMS if it does not support feature X [...] A DBMS is something that manages data *and* passes the ACID test (Atomicity, Consistency, Isolation and Durability). This isn't a question of "I want feature X" -- the ACID test is what distinguishes a DBMS from, say, the Unix file system (which can also manage data). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From wperry at fiduciary.com Sat Mar 6 15:09:50 1999 From: wperry at fiduciary.com (W. E.
Perry) Date: Mon Jun 7 17:09:42 2004 Subject: Opinions requested References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu> <19990306154022.B22308@io.mds.rmit.edu.au> <14049.4226.895273.99370@localhost.localdomain> Message-ID: <36E14530.10423DC8@fiduciary.com> David Megginson wrote: > A DBMS is something that manages data *and* passes the ACID test > (Atomicity, Consistency, Isolation and Durability). This isn't a > question of "I want feature X" -- the ACID test is what distinguishes > a DBMS from, say, the Unix file system (which can also manage data). I am going to be the old fogey here, with experience of databases going back to IMS and R: ACID is (one possible) test of a transaction processor, not of a database. It was precisely the misguided emphasis upon ACID qualities which bloated the relational model into the transaction-oriented behemoths sold today. For at least ten years we have tried to undo that direction by re-imagining the original relational concept as the data warehouse and, when that too became too bloated, the data mart. There is an opportunity with a true XML database to describe, and implement, transactions without surrendering to the siren song of two-phase commit. The key is understanding that there is no obvious or natural boundary to a transaction. Because of the inherent differences in the perspective of every participant to a transaction, each of them will describe a different set of elements to the transaction and different specific relationships among them. In the data world there is no omniscience which sees the transaction whole: to imagine it as a single, identifiably boundable unit is to deprecate the central task of each participant--to construct a transaction which is understandable to and processable by his own system. That is an ongoing implementational task, not just a conceptual one.
In the real world it resolves to this: how do I get what I have to become what you need? What I have and what you need are both structures, and the two of them will incorporate some set of similar or analogous elements, which gives them the common terms on which they can define and communicate the transaction which they are attempting to execute. The definition and the maintenance of each of these structures is the role of the database. Yet each of those structures is peculiarly unique, and both are ephemeral in the specific terms of the transaction which they facilitate. Yes, the transaction, once executed, endures. But the terms in which that durability is communicated--indeed the very substance as which it is preserved--may be utterly different in the systems (and, I would hope, in the databases) of each of the participants. Precisely what each of those systems, or databases, does not exhibit are the ACID qualities through which some would hope to define the identity, uniqueness and permanence of that transaction. Respectfully, Walter Perry From wperry at fiduciary.com Sat Mar 6 17:19:49 1999 From: wperry at fiduciary.com (W. E. Perry) Date: Mon Jun 7 17:09:42 2004 Subject: Opinions requested References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <36DF864B.B458299D@fiduciary.com> <19990306153959.A22308@io.mds.rmit.edu.au> Message-ID: <36E1639D.FDA85E9C@fiduciary.com> Marcelo Cantos wrote: > Thank you, Walter for the erudite response. I am left in a bit of > quandary as to how or even whether to respond.
This is in large part > due to the fact that, while your post was in response to mine, it is > not immediately clear to me whether you are addressing my comments > specifically or rather the general theme of this thread. Thank you for your kind words. I will confess that much of my post was addressed to the general theme of the thread. > On this point, I must make it quite clear that SIM is _not_ an XML > front end to a data store. It is an XML (etc.) document repository. My naive reading of the SIM materials on your website leads me to this conclusion. I am glad to have your confirmation of it. As a document repository SIM may more nearly compete with the 'grove minder' paradigm than with what I characterize as an XML database. > One additional, crucial point is that SIM _is_ extensible (though I > will qualify this presently). It can be defined to accept markup to > any degree of strictness or laxity (within the bounds of > well-formedness or validity, of course). It can be setup to accept > any and all markup and do _something_ intelligent with it. It can > also be configured to make stringent demands (well in excess of the > DTD, both with respect to strictness and complexity of constraints) of > its inputs. Granted. It is simply that I (perhaps perversely) have defined an XML database engine as one which implements XML markup. My XML database engine is driven by the markup and must rework the effective schema and re-cast its processing behavior in sync with changes to the document instance markup. > Now, by way of qualification, SIM does not provide free-form runtime > extensibility (runtime from the administrator's perspective, not > ours). Rather it provides the application developer with the > requisite tools to define, at design time, what structures will be > supported. 
For instance, you cannot, with SIM, perform queries such > as, "find me all sections containing subsections with an attribute of > security="public" and at least one paragraph with fewer than four > words in it" The semantic complexity of such a query is beyond the > scope of our product. However, if one were to know in advance that > queries about the minimum paragraph length in public subsections will > be commonplace in the particular application one is developing, then > SIM could, at design time, be told to create an appropriate index and > then the above query could, indeed, be performed. > > In short, SIM _is_ extensible, but the extensibility is bound somewhat > earlier than runtime. In practice, clients never complain about this > quality. In fact, it is usually a benefit rather than a hindrance, > for the same reason that compile time type checking is a good thing to have in a programming > language. All of these are commendable design decisions. They are not, IMHO, realizations of the unique qualities and potential of XML. On that, reasonable people may differ. > I also take issue with Walter's remark that an XML database should be > manipulated by and defined through the medium of XML. This sounds > analogous to suggesting that relational databases should be defined > and manipulated by markup. No, by relational schema, as you acknowledge in the next line. > Now, it is true that relational schema > are, themselves, typically stored as relations (one will, for example, > find a ".TABLES" table, a ".FIELDS" table, a ".INDEXES" table, etc. > inside a database). However, it seems to me patently absurd to > suggest that SQL (whether DML or DDL) be expressed in terms of tuples > and relations. Now, while it does not seem likewise absurd to suggest > that XML queries and data definition constructs be defined as XML, the > truth of such a suggestion is anything but self-evident. Why should > one not use an SQL-like language to define and query XML databases? 
> There may or may not be merit in such an approach, but it seems no > more or less appropriate than a query/data definition language cast in > XML. Indeed, many of the query language position papers at W3C do not > use XML syntax. Data definition and query languages are > meta-constructs. They are not part of the data, but rather operate on > the data and structures. This suggests that while it may be possible > to fold the system in on itself by expressing meta-structure as data, > it would be unwise to proceed down this path in _a priori_ fashion By following the path indicated by just such an a priori judgment I arrived at the conclusions which I have shared with you. I am implementing the resulting design and, I suppose, the almighty market will render the final verdict. > The serious user of XML does not have a heterogeneous collection of > vaguely defined documents with a motley crew of DTD's and well-formed > markup. That is exactly what I (and my customers, once we re-state their documents in various legacy forms as XML) have to deal with. We process settlements of cross-border trades and the regulatory reporting required by multiple overlapping legal jurisdictions. If I have advice of a trade execution in the customary form used in, say, Djakarta, and the interested parties to whom I must report it are a UK fiduciary, a Swiss depot bank, a US money manager and a Hong Kong broker, as well as the various regulators which the involvement of each of those parties entails, I must (in my opinion) drive the entire process off of a properly marked up document which succinctly expresses the facts of the transaction reported. That document, received by each of the interested parties, must be instantiated in the system--and I would hope the database--of each in a form which may well require re-writing the schema upon which it will be realized. 
> Most users have a well defined data set for which they want
> to define efficient structures for storage and retrieval (if they
> aren't interested in efficiency then their problem isn't particularly
> interesting -- any tool will do). In the few cases where they do have
> arbitrary structure to deal with, more often than not they are only
> interested in the content and are likely to throw the structure away.

As I hope the use case fragment above illustrates, users may have very well defined structures, well-suited to their specific needs. Those structures, however, may not accommodate the instance documents which they receive as input data and which, in the real-world examples I am familiar with, may exhibit differences of data structure on each occasion.

Respectfully,

Walter Perry

From b.laforge at jxml.com Sat Mar 6 20:47:45 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:42 2004
Subject: ModSax Suggestion
Message-ID: <003b01be6811$cc5974e0$c9a8a8c0@thing2>

Seems like a good fit for filters--drop what you don't want, transform the rest as needed.

Bill

-----Original Message-----
From: David Brownell
To: MikeDacon@aol.com
Cc: xml-dev@ic.ac.uk
Date: Friday, March 05, 1999 5:58 PM
Subject: Re: ModSax Suggestion

>Interesting suggestion for a big hole in the parts of
>the Java API set that are more or less "standard" at
>this point -- SAX and DOM.
>
>One comment though: I've found that it's important to
>be able to have options controlling how the DOM tree is
>built. For example, whether to discard ignorable spaces,
>or do namespace conformance enforcement, or try to get
>CDATA sections (comments, etc).
>
>Accordingly, I think being able to do a bit more than
>this will be important.
>
>- Dave
>
>
>
>MikeDacon@aol.com wrote:
>>
>> Hi Everyone,
>>
>> While SAX does a good job as an event-based interface
>> to Parsers, it would be nice to add a few methods to
>> receive a DOM representation back from a reference to an org.xml.sax.Parser.
>>
>> Something like:
>>
>> org.w3c.dom.Document parse(InputSource is, boolean events) throws
>> SAXException;
>> org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws
>> SAXException;
>> /* the events boolean would be to turn on/off event calls. */
>>
>> If a SAXDriver did not want to produce a DOM, it could either simply
>> return a null or a method added like:
>>
>> boolean isDomCapable();
>>
>> The above would let me use the ParserFactory to seamlessly switch
>> between Parser implementations and get a DOM tree without building
>> one myself. It is fruitless for me to build a DOM tree when almost all
>> the parser implementations provide that ability. I just want a way to get
>> at that functionality in a simple and standard way (thus SAX).
>>
>> Thoughts?
>>
>> - Mike
>> -----------------------------------------------
>> Michael C. Daconta
>> Author of Java 2 and JavaScript for C/C++ Programmers
>> Author of C++ Pointers and Dynamic Memory Management
>> Sun Certified Java Programmer and Developer
>> http://www.gosynergy.com

From b.laforge at jxml.com Sat Mar 6 20:57:14 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:42 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID: <004801be6813$22d36820$c9a8a8c0@thing2>

From: Daniel Veillard
> Hum, I have been following the streaming/fragment thread. However I have
>the feeling that even multiple fragment body extensions would not solve
>the problem you were facing. If I didn't get the discussion wrong, it seems
>that you rather tried to make one very big (i.e. stream) document from
>multiple sources while the scope of the fragment work was just the opposite,
>i.e. how to extract and ship a piece of a very big document.

Actually, it sounds to me like the separation of physical and logical layers.
On the one hand, I have some data to move. Multiple documents, multiple fragments, whatever. (logical)

On the other hand, I have a stream. It can pass any number of documents or fragments. (physical)

The fragments in the stream could be all from one document or from different queries on different documents or from one query applied to a set of documents. It shouldn't matter.

And how one might reassemble fragments back into a large document is another problem, though the stream should provide sufficient information to do so.

Bill

From jjc at jclark.com Sun Mar 7 10:38:43 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:09:42 2004
Subject: New expat test release and FAQ
Message-ID: <36E253C5.E4301749@jclark.com>

A new expat test release is available at:

ftp://ftp.jclark.com/pub/test/expat.zip

This adds handlers for namespace declarations; when namespace processing is enabled these provide information about xmlns attributes. This release also fixes a few bugs.

I've also started an expat FAQ at:

http://www.jclark.com/xml/expatfaq.html

Suggestions for additions are welcome.

James

From MikeDacon at aol.com Sun Mar 7 13:02:43 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:42 2004
Subject: ModSax Suggestion
Message-ID:

Hi Dave,

In a message dated 3/5/99 5:40:50 PM Eastern Standard Time, db@eng.sun.com writes:

> Interesting suggestion for a big hole in the parts of
> the Java API set that are more or less "standard" at
> this point -- SAX and DOM.
>
> One comment though: I've found that it's important to
> be able to have options controlling how the DOM tree is
> built. For example, whether to discard ignorable spaces,
> or do namespace conformance enforcement, or try to get
> CDATA sections (comments, etc).
>

I agree with that. I think all that is possible while still retaining a minimalist design philosophy. Something like:

void setDOMFeature(String feature, boolean val);
boolean getDOMFeature(String feature);

That way, via an extensible common set of text properties, we can add properties as the need arises without expanding the API.

Looking forward to progress on the Java XML API. BTW, Dave, are you going to do a "Birds of a Feather" session on XML at this year's JavaOne? I think that could be valuable.

Best wishes,

- Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com

From MikeDacon at aol.com Sun Mar 7 13:15:25 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:42 2004
Subject: ModSax Suggestion
Message-ID: <4d147c5b.36e27b77@aol.com>

In a message dated 3/6/99 4:03:25 PM Eastern Standard Time, b.laforge@jxml.com writes:

> Seems like a good fit for filters--drop what you don't
> want, transform the rest as needed.
>

I think Bill has brought up an excellent point. In fact, I like that suggestion better than my setFeature() method. It seems to me that the central tension of API design is whether to expand the API or relegate functionality to be handled by a higher-level layer of software. With my original suggestion on getting access to a DOM, it seems appropriate that this be part of SAX (a low-level layer), while transforming the resultant tree is relegated to a higher-level layer. While I have certainly written gobs of enterprise level software, my experience with formal APIs is limited -- does this track with those of you with more API building experience?

Best wishes,

- Mike

From Mark.Birbeck at iedigital.net Sun Mar 7 16:10:02 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:09:42 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID:

Bill wrote:
> From: Daniel Veillard
> > Hum, I have been following the streaming/fragment thread.
> > However I have the feeling that even multiple fragment body
> > extensions would not solve the problem you were facing. If
> > I didn't get the discussion wrong, it seems that you rather
> > tried to make one very big (i.e. stream) document from
> > multiple sources while the scope of the fragment work was
> > just the opposite, i.e. how to extract and ship a piece of
> > a very big document.
> [snip]
> And how one might reassemble fragments back into a large
> document is another
> problem, though the stream should provide sufficient
> information to do so.

I think Daniel's point is simply that in many situations you may not want to reconstruct the 'large' document. The fragment work seems to relate to providing context to a fragment, such that reasonable work can be done on it. That's not the same - although related - as shipping one great big document in a number of packages.

On the theme of multi-fragments, I think the simplest increment from where we are now is to allow for the results set of a query that spans different levels of a tree. I was previously exporting from queries using a simple wrapper, but when I saw the fragment group's work decided to use it with a very slight modification. The change is an obvious one - and I think someone else suggested it on this list the other day - but I wonder if anyone can see any pitfalls.

I've enclosed four sets of query results for those who might be interested in approving/criticising my approach. The queries are:

http://[server]/documents/ysArticle[author=Ruth]
http://[server]/documents/ysArticle[author=Ruth]/ArticleText
http://[server]/documents/ysArticle[author=Ruth]/ArticleText/ysText
http://[server]/documents/ysArticle[author=Ruth]/ArticleText/ysText[ID=1]

[Ignore non-quoted stuff, etc., it's still work in progress!]
Although the first few actually return pretty much the same information, they differ in where the division between context and requested data is. The first will return all articles by Ruth in their entirety, and so only needs one 'fragbody' element. The second returns the same data, but the articles themselves are now provided only as context, and the containers of the text become the top level of the fragments. This therefore requires two 'fragbody' elements, since there are two articles by Ruth. (Actually it could be one, but because there's an article between the two that is *not* by Ruth, even though it's not getting returned it messes up my merging code!) The third query is not much different from number two, but pushes one more level of data up into the 'context' information. The final query is the one I'm most interested in getting feedback on, in particular on whether I have the context information right. I think the fragment document is a little ambiguous on what level of detail to put in. Some examples in the doc. do what I have done - put in all siblings of any element that is an ancestor of the ones we're interested in - but one of them doesn't. Of course it is partly application-dependent so I'm not that bothered. Comments? Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net -------------- next part -------------- A non-text attachment was scrubbed... Name: q1.xml Type: application/octet-stream Size: 1565 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q1.obj -------------- next part -------------- A non-text attachment was scrubbed... 
Name: q2.xml
Type: application/octet-stream
Size: 956 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q2.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: q3.xml
Type: application/octet-stream
Size: 1277 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q3.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: q4.xml
Type: application/octet-stream
Size: 3920 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q4.obj

From david at megginson.com Sun Mar 7 23:57:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:42 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <14051.3215.196642.22571@localhost.localdomain>

What: Four proposed predefined features for ModSAX
Action: Please read and comment (especially to propose core features I've missed)

Last month, I posted a proposal [1] for a backwards-compatible SAX layer called ModSAX, which will allow parser and filter writers to extend SAX and application writers to discover what extensions exist, all in a well-defined and predictable way.

The relevant part of that interface for this posting is the following method in ModParser (which extends org.xml.sax.Parser):

  public abstract void setFeature (String featureID, boolean state)
    throws SAXNotSupportedException;

The value of featureID will in some way piggyback on DNS, either by using URIs or by using names similar to Java packages. Although people will be allowed (and encouraged) to invent their own features, I'd like to predefine a core set of features for the next SAX release. Here's what I've thought of so far:

1. http://xml.org/sax/features/validation
   True means validate, false means don't validate.

2. http://xml.org/sax/features/external-entities
   True means expand external text entities, false means don't expand external text entities.

3. http://xml.org/sax/features/namespaces
   True means perform namespace processing -- munge element and attribute names and remove namespace declaration attributes -- and false means don't perform namespace processing.

4. http://xml.org/sax/features/unbuffered-input
   True means ensure that the parser does not buffer input from a Reader or InputStream supplied by the application (actually, one-character look-ahead will usually be required); false means do not ensure that the parser does not buffer input. This feature might be useful for reading multiple documents from a single stream.

No SAX parsers will be *required* to support any of these -- they can simply throw a SAXNotSupportedException for any request (as they should for any other unrecognised feature request). The earliest ModSAX parser will probably be a general-purpose SAX 1.0 Parser adapter, and that will certainly not be able to do anything useful with these.

Unlike parsers, filters will ordinarily pass unrecognised feature requests on up the chain of responsibility.

Examples
--------

If an application wants to ensure that the SAX parser is performing validation, it can use

  try {
    parser.setFeature("http://xml.org/sax/features/validation", true);
  } catch (SAXNotSupportedException e) {
    // ...
  }

The parser may throw an exception for either of two reasons:

1. it cannot validate; or

2. it does not recognise the property.

If the application wants to determine which of the two is the case, then it can try the following:

  try {
    parser.setFeature("http://xml.org/sax/features/validation", false);
  } catch (SAXNotSupportedException e) {
    // ...
  }

If the parser throws an exception again, then it does not recognise the property name (in other words, it may or may not perform validation, and the application has no way to tell); if the parser does not throw an exception, then it simply does not support validation.

[1] http://www.lists.ic.ac.uk/archives/xml-dev/9902/0627.html

From jjc at jclark.com Mon Mar 8 02:03:56 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <36E32900.BBDF43C0@jclark.com>

David Megginson wrote:

> 2. http://xml.org/sax/features/external-entities
> True means expand external text entities, false means don't expand
> external text entities.

I would suggest distinguishing the expansion of external parameter entities (which would include the external DTD subset) from the expansion of external general entities. I can easily imagine wanting to expand external general entities declared in the internal subset, but not wanting to read an external DTD.

James

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Mon Mar 8 02:04:22 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features References: <14051.3215.196642.22571@localhost.localdomain> Message-ID: <36E329F5.76A50E09@jclark.com> David Megginson wrote: > The parser may throw an exception for either of two reasons: > > 1. it cannot validation; or > > 2. it does not recognise the property. > > If the application wants to determine which of the two is the case, > then it can try the following: > > try { > parser.setFeature("http://xml.org/sax/features/validation", false); > } catch (SAXNotSupportedException e) { > // ... > } > > If the parser throws an exception again, then it does not recognise > the property name (in other words, it may or may not perform > validation, and the application has no way to tell); if the parser > does not throw and exception, then it simply does not support > validation. Wouldn't it be simpler to throw different type of exception in these two cases? You could have a SAXNotRecognizedException that extends SAXNotSupportedException, and say that parsers should throw SAXNotRecognizedException when the reason they don't support a feature is that they do not recognize the feature. James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Mon Mar 8 02:32:27 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <128b4bc2.36e3366b@aol.com> Hi Dave, Before responding to your specific proposal ... I do not understand why you are creating a new interface like ModParser instead of just evolving the Parser interface itself. Personally, while I know full well what it would mean to implement Parser -- a "ModParser" is just plain confusing. Five years from now, someone should not have to know the history of SAX to understand the terminology. Now to the Predefined features... In a message dated 3/7/99 7:13:19 PM Eastern Standard Time, david@megginson.com writes: > What: Four proposed predefined features for ModSAX > Action: Please read and comment (especially to propose core features > I've missed) > > Last month, I posted a proposal [1] for a backwards-compatible SAX > layer called ModSAX, which will allow parser and filter writers to > extend SAX and application writers to discover what extensions exist, > all in a well-defined and predictable way. I like the idea of SAX filters but still feel that you should allow access to a DOM Document if the implementing Parser can supply one. I won't restate the suggestion here as it was covered in a previous email. However; that could greatly simplify a filter-writer's job. 
> > The relevant part of that interface for this posting is the following > method in ModParser (which extends org.xml.sax.Parser): > > public abstract void setFeature (String featureID, boolean state) > throws SAXNotSupportedException; > > The value of featureID will in some way piggyback on DNS, either by > using URIs or by using names similar to Java packages. Although > people will be allowed (and encouraged) to invent their own features, > I'd like to predefine a core set of features for the next SAX > release. Here's what I've thought of so far: Since some finite set of SAX features will not approach a global naming problem, I strongly urge not to use a URI. If a package name scheme is to be used, something like "sax.feature.validation". It would also be nice to provide one word String constants for the standard features. Best wishes, - Mike Daconta (mdaconta@aol.com) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Mon Mar 8 02:56:51 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <001e01be690e$8913fe00$c9a8a8c0@thing2> From: MikeDacon@aol.com >I like the idea of SAX filters but still feel that you should allow >access to a DOM Document if the implementing Parser can supply one. >I won't restate the suggestion here as it was covered in a previous email. >However; that could greatly simplify a filter-writer's job. Well, that might depend on the job of the filter. You may want to use a filter to prune out the parts of the document you are not interested in BEFORE the DOM is built. 
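
A minimal sketch of the pruning idea above: a filter that sits between the parser and whatever consumes the events, and drops an unwanted element together with its subtree. It assumes the SAX2 `XMLFilterImpl` helper that ships with current JDKs, not the ModSAX or MDSAX interfaces discussed in this thread, which are not shown here. The names `PruneFilterSketch`, `PruneFilter`, `survivingElements` and the element name `drop` are made up for illustration. The downstream handler only records element names; a DOM builder would sit in the same position.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

final class PruneFilterSketch {

    /** Hypothetical filter: drops every element named `unwanted` plus its subtree. */
    static class PruneFilter extends XMLFilterImpl {
        private final String unwanted;
        private int skipDepth = 0; // greater than 0 while inside a pruned subtree

        PruneFilter(XMLReader parent, String unwanted) {
            super(parent);
            this.unwanted = unwanted;
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (skipDepth > 0 || qName.equals(unwanted)) {
                skipDepth++;          // swallow this start tag
                return;
            }
            super.startElement(uri, local, qName, atts); // pass it downstream
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (skipDepth > 0) {
                skipDepth--;          // swallow the matching end tag
                return;
            }
            super.endElement(uri, local, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (skipDepth == 0) {
                super.characters(ch, start, length);
            }
        }
    }

    /** Parses `xml` through the filter and returns the element names that reach the consumer. */
    static List<String> survivingElements(String xml, String unwanted) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        PruneFilter filter = new PruneFilter(parser, unwanted);
        final List<String> seen = new ArrayList<>();
        // Stand-in for a DOM builder: it only ever sees what the filter lets through.
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                seen.add(qName);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><keep>a</keep><drop><keep>b</keep></drop></doc>";
        System.out.println(survivingElements(xml, "drop")); // prints [doc, keep]
    }
}
```

Because the consumer is attached to the filter rather than to the parser, the pruned subtree never costs anything downstream, which is the point of filtering before a tree is built.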
In general, I see several places where you might want to use a filter: o Transform events from a parser into something to be output. o Transform events from a parser before being accessed by an application. o Between a parser and the DOM. o Transform events from a DOM walker into something to be output. Note that in the last case, if the DOM walker shares its internal state (position in the DOM tree) with the filters that come after it (using something like MDSAX), we get a lot of XSL-like capabilities. Bill xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Mon Mar 8 04:39:57 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <004b01be691c$f348fc40$c9a8a8c0@thing2> David, I am very much inclined to agree with you that the conservative approach taken in implementing SAX was necessary to its broad acceptance at that time. However, broad acceptance of a SAX upgrade may require a different approach. For one thing, the very success of SAX has itself changed things. The primary requirement is backward compatibility for both parsers and applications. A second requirement is that the upgrade not be conservative, but that it be a significant enhancement from a wide range of perspectives. The upgrade needs to be worth doing, but for more than one reason. Feature negotiation alone is not quite enough. I'm sure you know the kinds of things I'm looking for: o Event objects for one. o A way to specify a filter to a DOM-building-parser is another. o Better integration with the DOM in general. 
I'm sure others have their own feature list. We need to define a collection of new capabilities that have wide appeal, together with an implementation strategy which provides full backward compatibility. And for this group, it needs to be something that can be implemented cleanly. I still feel like a newbie here. I wasn't here when SAX was done. But I would hate to see the initiative lost to the traditional standards bodies. As I see it, there are two advantages to doing the work on this list: 1. It is open to individuals. The cost to participate is measured only in the time it takes. 2. This is the world's toughest bunch of critics. The folks here plan to implement the proposals themselves. And any proposal that isn't clean is going to be revised until it can be easily implemented. And as much as the first point is what allows me to participate, it is the second point that is the real winner. A standards body whose participants are largely from large companies have more to gain from a spec that is difficult to implement--it limits the competition. So that's why I'm butting in here. I think an open standards process is important for individuals and small companies. We need to do what we can to keep the ball rolling here. Bill From: David Megginson >What: Four proposed predefined features for ModSAX >Action: Please read and comment (especially to propose core features > I've missed) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From shinichiro.hamada at toshiba.co.jp Mon Mar 8 06:57:59 1999 From: shinichiro.hamada at toshiba.co.jp (Shinichiro HAMADA) Date: Mon Jun 7 17:09:43 2004 Subject: Accessing DTD info. 
in IE5
Message-ID: <007301be6930$daa8e100$85247385@pv189.ssel.toshiba.co.jp>

Hello.

>Is DTD information accessible through IE5 DOM? I took it for granted because
>I could do it with old MSXML for Java used in IE4. However, when I really
>wanted to access DTD info in IE5, I couldn't find it from anywhere. Is DTD
>information exposed in IE5 DOM?

I wonder if what you want to know is IXMLDOMDocument::get_doctype:

http://www.microsoft.com/workshop/xml/xmldom/reference/DOMDocument_doctype.asp

or have I misunderstood your question?

--
Shinichiro HAMADA

From johnh at erin.gov.au Mon Mar 8 06:58:56 1999
From: johnh at erin.gov.au (John Hockaday)
Date: Mon Jun 7 17:09:43 2004
Subject: Mapping elements in architectural forms
Message-ID: <199903080655.RAA21026@eos.erin.gov.au>

Hi,

I am using architectural forms to map elements from a client document instance of a client DTD to a base document of a base DTD using the SP software by James Clark.

The problem is that the structure of the elements and sub-elements in the client document does not exactly match the base DTD's elements and sub-elements, and I don't know how to relate this in the mapping DTD.

For example, sub-elements "b" and "c" occur in element "a" in the client DTD, but in the base DTD sub-element "b" occurs in element "a" while sub-element "c" occurs in element "d".

Client Base
====== ====

If I map "a" to "a", "b" to "b" and "c" to "c" in the mapping DTD the parser gives an error that "a" has not been finished and that "c" should not occur here in the base document.
Does anyone know how I can map the client elements to the base elements in the mapping DTD to fix this problem?

___________________________________________________________________________
John Hockaday - Systems Officer GPO Box 787 email: johnh@erin.gov.au Canberra ACT 2601 phone: +61 2 6274 1173 fax: +61 2 6274 1333 Australia URL:http://www.environment.gov.au/
ERIN Environmental Resources Information Network ERIN
___________________________________________________________________________

From wendy.cameron at qr.com.au Mon Mar 8 07:25:52 1999
From: wendy.cameron at qr.com.au (Wendy Cameron)
Date: Mon Jun 7 17:09:43 2004
Subject: XSL Problem
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <00f101be6930$a123feb0$c62b580a@qrail.com.au>

Ok I have

I am trying to select all 3 nodes and order by att1, but display different information depending on what type of node it is.

Does anyone have any idea how I would do this?

I have tried .....

But this doesn't test if the current node is of type nodeType1.

Help!!!

Regards
Wendy

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From zmin at atpage.com Mon Mar 8 07:28:47 1999
From: zmin at atpage.com (min zheng)
Date: Mon Jun 7 17:09:43 2004
Subject: Accessing DTD info. in IE5
References: <007301be6930$daa8e100$85247385@pv189.ssel.toshiba.co.jp>
Message-ID: <002d01be6935$e73a66a0$f66f6f0a@atpage>

What I want is the DTD (or Schema) rules telling me what nodes are allowed in an element. The get_doctype method only gives the doctype declaration. There is no way (as far as I know) to access element rules from there.

Thanks anyway,
Min

----- Original Message -----
From: Shinichiro HAMADA
To:
Sent: Sunday, March 07, 1999 10:56 PM
Subject: RE: Accessing DTD info. in IE5

> Hello.
>
> >Is DTD information accessible through IE5 DOM? I took it for granted because
> >I could do it with old MSXML for Java used in IE4. However, when I really
> >wanted to access DTD info in IE5, I couldn't find it from anywhere. Is DTD
> >information exposed in IE5 DOM?
>
> I wonder if what you want to know is IXMLDOMDocument::get_doctype:
>
> http://www.microsoft.com/workshop/xml/xmldom/reference/DOMDocument_doctype.asp
>
> or have I misunderstood your question?
>
> --
> Shinichiro HAMADA
>

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Mon Mar 8 10:29:39 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: <14051.3215.196642.22571@localhost.localdomain> References: <14051.3215.196642.22571@localhost.localdomain> Message-ID: * David Megginson | | The value of featureID will in some way piggyback on DNS, either by | using URIs or by using names similar to Java packages. I think we should use package-like names. Using protocol prefixes seems to me both potentially confusing, slightly obfuscating and I don't see the merit in it over a package-like scheme. I much prefer org.xml.sax.features.validation over http://xml.org/sax/features/validation. | 2. http://xml.org/sax/features/external-entities I agree with James that separating general entities and parameter entities is a good idea. | 4. http://xml.org/sax/features/unbuffered-input I'm not sure I see the merit of this. Maybe we should skip this? A suggestion of my own: org.xml.sax.features.catalog True means read the default catalog file, whether that is located via an environment variable, a Java property or something else. OpenXML, XML Parser for Java (xml4j) and xmlproc already support catalogs, and might find this useful. xmlproc certainly will. | No SAX parsers will be *required* to support any of these -- they | can simply throw a SAXNotSupportedException for any request I also agree with James that a separate unrecognized-exception is a good idea. | Unlike parsers, filters will ordinarily pass unrecognised feature | requests on up the chain of responsibility. 
Good point. This implies that filters need references in both directions, that is, both to the event source and to the event receiver, thus resolving a question that was previously discussed here. | [1] http://www.lists.ic.ac.uk/archives/xml-dev/9902/0627.html Hmmm. Wouldn't this reference be more correct? --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Mon Mar 8 10:40:23 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: <004b01be691c$f348fc40$c9a8a8c0@thing2> References: <004b01be691c$f348fc40$c9a8a8c0@thing2> Message-ID: * Bill la Forge | | The upgrade needs to be worth doing, but for more than one reason. I agree that it needs to be worth doing, but to me what has been proposed here certainly sounds like it is enough. (Remember, parameter setting, handler extensibility, filters, namespaces, lexical information and DTD information are probably all in the pipeline.) | I'm sure you know the kinds of things I'm looking for: | o Event objects for one. On this point I agree with what David will probably say: this belongs on a higher level. If you want this functionality, make a value-adding layer on top of SAX 1.1. There's no loss in that, since you can implement this once for all SAX-aware parsers with hardly any performance penalties. (This is why I agree with David: this is the kind of benefit that being ultra low-level buys us.) | o A way to specify a filter to a DOM-building-parser is another. We certainly need this, but I don't see how this can usefully be part of SAX. 
SAX is at a lower level than the DOM and so should certainly be designed for a DOM layer to fit nicely on top, but there should be no dependencies, I think. In other words, this is something that either the DOM or the parsers will have to deal with in a sensible fashion. Taking a ModParser as an argument to DOM building would perhaps be the best way to do this. However, I don't see the harm in someone sitting down to write a recommendation to DOM parser writers for how to do this and why it's useful. | So that's why I'm butting in here. I think an open standards process | is important for individuals and small companies. We need to do what | we can to keep the ball rolling here. We are certainly in heartfelt agreement here. :) --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 8 11:30:32 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: <128b4bc2.36e3366b@aol.com> References: <128b4bc2.36e3366b@aol.com> Message-ID: <14051.45935.800104.922834@localhost.localdomain> MikeDacon@aol.com writes: > I like the idea of SAX filters but still feel that you should allow > access to a DOM Document if the implementing Parser can supply one. > I won't restate the suggestion here as it was covered in a previous > email. However; that could greatly simplify a filter-writer's job. I have an idea for how we can handle that (and other, similar problems), but I'll cover it in a separate posting (it's still brewing a bit). 
> Since some finite set of SAX features will not approach a global naming > problem, I strongly urge not to use a URI. I disagree here -- if third parties want to be able to define feature names, they need a way to avoid collision (i.e. we want to make certain that both Oracle and Sun can define properties like 'normalize' without blowing up the whole system). That said, the Java package naming scheme also provides DNS-based uniqueness, as in 'org.xml.sax.features.validation'. It's simply a matter of taste: - org.xml.sax.features.validation is more of a Java flavour. - http://xml.org/sax/features/validation is more of an XML/Namespaces flavour All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 8 11:34:09 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: <004b01be691c$f348fc40$c9a8a8c0@thing2> References: <004b01be691c$f348fc40$c9a8a8c0@thing2> Message-ID: <14051.46235.905949.308401@localhost.localdomain> Bill la Forge writes: > The upgrade needs to be worth doing, but for more than one > reason. Feature negotiation alone is not quite enough. Yes, but my original proposal was not limited to feature negotiation -- it also included the ability to add and negotiate new handler types at runtime. People will upgrade because they want to use the new handlers that are implemented with ModSAX, not because of any elegance or inelegance in ModSAX itself. 
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From Michael.Kay at icl.com Mon Mar 8 11:41:43 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <93CB64052F94D211BC5D0010A80013310EB35F@wwmessd3.bra01.icl.co.uk>

> What: Four proposed predefined features for ModSAX
> Action: Please read and comment (especially to propose core features
> I've missed)
>

Could I add a plea for another optional feature:

http://xml.org/sax/features/normalisePCDATA

whose effect is to ensure that successive calls to supply character data are combined into a single call. The reason for this is that it's very common for applications to assume the parser won't split character data, an incorrect assumption but one that will survive most testing.

Actually I think using "http://" names for things that have nothing to do with the HTTP protocol is very bad form. (Apart from anything else, my mail client encourages me to click on them to see what's there.) "org.xml.sax.features.normalisePCDATA" is much more sensible. If you want a URN, choose a protocol name other than http.

Another rather trivial convenience feature I'd like added to SAX is the ability for InputSource to accept a File (as well as a URL, etc). Though the need for this has declined with Java 2.

Mike Kay

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 8 11:42:41 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: <36E32900.BBDF43C0@jclark.com> References: <14051.3215.196642.22571@localhost.localdomain> <36E32900.BBDF43C0@jclark.com> Message-ID: <14051.46547.366706.485764@localhost.localdomain> James Clark writes: > I would suggest distinguishing the expansion of external parameter > entities (which would include the external DTD subset) from the > expansion of external general entities. I can easily imagine > wanting to expand external general entities declared in the > internal subset, but not wanting to read an external DTD. I agree. Here's the new core feature list: http://xml.org/sax/features/validation http://xml.org/sax/features/external-general-entities http://xml.org/sax/features/external-parameter-entities http://xml.org/sax/features/namespaces http://xml.org/sax/features/unbuffered-input All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Mar 8 11:43:26 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:43 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: <36E329F5.76A50E09@jclark.com> References: <14051.3215.196642.22571@localhost.localdomain> <36E329F5.76A50E09@jclark.com> Message-ID: <14051.46946.186235.431488@localhost.localdomain> James Clark writes: > Wouldn't it be simpler to throw different type of exception in these two > cases? You could have a SAXNotRecognizedException that extends > SAXNotSupportedException, and say that parsers should throw > SAXNotRecognizedException when the reason they don't support a feature > is that they do not recognize the feature. Yes, I agree. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Mon Mar 8 11:51:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <14051.46670.687235.664451@localhost.localdomain>

What: Additions to ModParser interface

I'm proposing a couple of additions to the ModParser interface:

  public interface ModParser extends Parser
  {
    public abstract void setFeature (String featureID, boolean state)
      throws SAXNotSupportedException;
    public abstract void setHandler (String handlerID, ModHandler handler)
      throws SAXNotSupportedException;
    public abstract void set (String infoID, Object prop)
      throws SAXNotSupportedException;
    public abstract Object get (String infoID)
      throws SAXNotSupportedException;
  }

These allow you to do interesting things like

  parser.set("http://www.foo.com/props/textfilter", filter);

or

  try {
    // get() returns Object, so the result has to be cast
    Node node = (Node) parser.get("http://xml.org/sax/props/dom-node");
  } catch (SAXNotRecognizedException e1) {
    // doesn't know about DOM processing...
  } catch (SAXNotSupportedException e2) {
    // knows about DOM processing, but not doing it...
  }

Again, it's a little sloppy as an interface, but it's beautifully extensible and it supports filters nicely (if there are other filters between the DOM iterator and the application, it will still work).

Note that strictly speaking, now, setHandler() and setFeature() are no longer primitives, since they could both be implemented in terms of set(), but I think that the extra type checking is worthwhile in those cases.
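
The proposal above can be sketched as a small self-contained Java program. This is only an illustration under stated assumptions, not an implementation of any shipped SAX release: the exception classes are local stand-ins (with SAXNotRecognizedException extending SAXNotSupportedException, as suggested earlier in the thread), the ModParser interface is reduced (no Parser base, no setHandler), and ToyParser, ModSaxSketch, trySet and probe are hypothetical names introduced for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-ins for the exception types discussed in the thread (assumption).
class SAXNotSupportedException extends Exception {
    SAXNotSupportedException(String message) { super(message); }
}

// "Not recognised" modelled as a special case of "not supported".
class SAXNotRecognizedException extends SAXNotSupportedException {
    SAXNotRecognizedException(String message) { super(message); }
}

// Reduced form of the proposed interface; setHandler and the Parser base are omitted.
interface ModParser {
    void setFeature(String featureID, boolean state) throws SAXNotSupportedException;
    void set(String infoID, Object prop) throws SAXNotSupportedException;
    Object get(String infoID) throws SAXNotSupportedException;
}

// Hypothetical parser: it recognises two IDs but supports only one of them.
class ToyParser implements ModParser {
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String DOM_NODE = "http://xml.org/sax/props/dom-node";
    private final Map<String, Object> values = new HashMap<>();

    public void setFeature(String featureID, boolean state) throws SAXNotSupportedException {
        // setFeature() expressed in terms of set(), as the message notes is possible.
        set(featureID, Boolean.valueOf(state));
    }

    public void set(String infoID, Object prop) throws SAXNotSupportedException {
        check(infoID);
        values.put(infoID, prop);
    }

    public Object get(String infoID) throws SAXNotSupportedException {
        check(infoID);
        return values.get(infoID);
    }

    private void check(String infoID) throws SAXNotSupportedException {
        if (DOM_NODE.equals(infoID)) {
            throw new SAXNotSupportedException("recognised but not available: " + infoID);
        }
        if (!VALIDATION.equals(infoID)) {
            throw new SAXNotRecognizedException("unknown ID: " + infoID);
        }
    }
}

class ModSaxSketch {
    // Tries to set a feature and reports the outcome as text.
    static String trySet(ModParser parser, String featureID, boolean state) {
        try {
            parser.setFeature(featureID, state);
            return "ok";
        } catch (SAXNotRecognizedException e) {   // subclass must be caught first
            return "not recognised";
        } catch (SAXNotSupportedException e) {
            return "not supported";
        }
    }

    // Queries a value and reports the outcome as text.
    static String probe(ModParser parser, String infoID) {
        try {
            return "value=" + parser.get(infoID);
        } catch (SAXNotRecognizedException e) {
            return "not recognised";
        } catch (SAXNotSupportedException e) {
            return "not supported";
        }
    }

    public static void main(String[] args) {
        ToyParser parser = new ToyParser();
        System.out.println(trySet(parser, ToyParser.VALIDATION, true));
        System.out.println(probe(parser, ToyParser.VALIDATION));
        System.out.println(probe(parser, ToyParser.DOM_NODE));
        System.out.println(probe(parser, "http://www.foo.com/props/textfilter"));
    }
}
```

Run as written, main prints "ok", "value=true", "not supported" and "not recognised", one per line. Because one exception type extends the other, the more specific catch clause has to come first; an application that does not care about the distinction can catch SAXNotSupportedException alone.
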
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Mon Mar 8 11:54:40 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB35F@wwmessd3.bra01.icl.co.uk>
References: <93CB64052F94D211BC5D0010A80013310EB35F@wwmessd3.bra01.icl.co.uk>
Message-ID: <14051.47534.65569.354415@localhost.localdomain>

Kay Michael writes:

> Could I add a plea for another optional feature:
> http://xml.org/sax/features/normalisePCDATA

Yes, this is especially useful for building a DOM as well. I've added it to the list of core features:

  http://xml.org/sax/features/validation
  http://xml.org/sax/features/external-general-entities
  http://xml.org/sax/features/external-parameter-entities
  http://xml.org/sax/features/namespaces
  http://xml.org/sax/features/unbuffered-input
  http://xml.org/sax/features/normalize-text

Remember that parsers will not be required to support any of these.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tug at wilson.co.uk Mon Mar 8 12:33:14 1999 From: tug at wilson.co.uk (John Wilson) Date: Mon Jun 7 17:09:44 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <073f01be695f$d8dc04e0$010a0a0a@home.wilson.co.uk> ----- Original Message ----- From: David Megginson To: XML Developers' List Sent: 07 March 1999 23:56 Subject: SAX RFD: ModSAX Predefined Features >What: Four proposed predefined features for ModSAX >Action: Please read and comment (especially to propose core features > I've missed) > >Last month, I posted a proposal [1] for a backwards-compatible SAX >layer called ModSAX, which will allow parser and filter writers to >extend SAX and application writers to discover what extensions exist, >all in a well-defined and predictable way. It seems to me that there are two kinds of parser extensions: 1/ those that are static (i.e. must be established before the parser is used) 2/ those that are dynamic (i.e. they can be changed on the fly) An example of a static extension would be buffering. If the parser is buffering input then it is infeasible to change to unbuffered input in the middle of parsing the text. Switching from non validating to validating is problematic, insisting that a parser be able to do this would probably add unacceptable overhead to the non validating mode. I would suggest that the bulk of the extensions should be specified to the parserFactory and only a *very* limited number (if any at all) be specified to the instance of Parser. I would very much like a getFeature function which returns a value telling me if the feature is set or not. 
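
The static/dynamic split described above could look roughly like the following sketch, in which features are fixed on a factory before any parser exists and the parser only answers getFeature queries. All names here (ParserFactorySketch, ConfiguredParser, makeParser) are hypothetical and exist only for the illustration; features are keyed by strings to match the proposal under discussion.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical parser whose feature settings are frozen when it is created.
class ConfiguredParser {
    private final Map<String, Boolean> features;

    ConfiguredParser(Map<String, Boolean> features) {
        // Defensive copy: later changes to the factory do not reach this parser.
        this.features = Collections.unmodifiableMap(new HashMap<>(features));
    }

    // Returns TRUE or FALSE for a configured feature, or null if it was never configured.
    Boolean getFeature(String featureID) {
        return features.get(featureID);
    }
}

// Hypothetical factory: static features are chosen here, before parsing starts.
class ParserFactorySketch {
    private final Map<String, Boolean> requested = new HashMap<>();

    ParserFactorySketch setFeature(String featureID, boolean state) {
        requested.put(featureID, state);
        return this;
    }

    ConfiguredParser makeParser() {
        return new ConfiguredParser(requested);
    }

    public static void main(String[] args) {
        String validation = "http://xml.org/sax/features/validation";
        ParserFactorySketch factory = new ParserFactorySketch().setFeature(validation, true);
        ConfiguredParser parser = factory.makeParser();

        // Changing the factory afterwards affects only parsers created later.
        factory.setFeature(validation, false);

        System.out.println(parser.getFeature(validation));
        System.out.println(parser.getFeature("http://xml.org/sax/features/unbuffered-input"));
    }
}
```

Run as written, main prints "true" and then "null". Since the parser never exposes a setter, a feature such as buffering cannot be switched in the middle of a parse, which is the constraint the message argues for.
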
I'm also not very keen on the use of strings to specify the features. How about using instances of classes:

in org.xml.sax

  public abstract class Feature {
    public Feature(boolean state) {
      this.state = state;
    }

    // public so that callers outside org.xml.sax can read it
    public final boolean state;
  }

  public final class Validation extends Feature {
    public Validation(boolean state) {
      super(state);
    }
  }

Individual parser implementations would then be free to add their own extensions defined by classes that subclass org.xml.sax.Feature - they could also contain parameters.

setFeature would then take a single Feature parameter:

  xxx.setFeature(new org.xml.sax.Validation(true));

getFeature would take a Class parameter and return an instance of the class, or null if the feature was unrecognised.

  org.xml.sax.Feature f = xxx.getFeature(org.xml.sax.Validation.class);

  if (f == null) {
    // not supported
  } else if (f.state) {
    // supported and switched on
  }

Non-Java implementations would probably have to use a string instead of the Class parameter.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Mon Mar 8 13:13:13 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:09:44 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: <14051.45935.800104.922834@localhost.localdomain> Message-ID: On Mon, 8 Mar 1999, David Megginson wrote: > MikeDacon@aol.com writes: > > Since some finite set of SAX features will not approach a global naming > > problem, I strongly urge not to use a URI. > > I disagree here -- if third parties want to be able to define feature > names, they need a way to avoid collision (i.e. we want to make > certain that both Oracle and Sun can define properties like > 'normalize' without blowing up the whole system). > > That said, the Java package naming scheme also provides DNS-based > uniqueness, as in 'org.xml.sax.features.validation'. It's simply a > matter of taste: > > - org.xml.sax.features.validation is more of a Java flavour. Yep... but might not feel so natural for developers working with versions of SAX translated for Perl, Python and so on. > - http://xml.org/sax/features/validation is more of an XML/Namespaces > flavour ...and RDF [1]. Giving interesting entities URIs makes them more fully a part of the Web, and means we can take advantage of URI-oriented metadata. Eg. you might search a software database for resources that were of type 'Perl Module' and that implemented the feature known as 'http://xml.org/sax/features/validation'. (There's already a Linux Packages Database[2] along similar lines...). 
I'm not claiming that this would be impossible using the Java naming scheme, just that a Web oriented approach might make it easier to do certain things... Dan [1] http://www.w3.org/TR/REC-rdf-syntax [2] http://rpmfind.net/linux/rpmfind/ -- Daniel.Brickley@bristol.ac.uk Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/ University of Bristol, Bristol BS8 1TN, UK. phone:+44(0)117-9288478 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Mon Mar 8 13:39:47 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:44 2004 Subject: ModSAX addition, general property query Message-ID: <008c01be6967$da059720$c9a8a8c0@thing2> From: David Megginson > public abstract void set (String infoID, Object prop) > throws SAXNotSupportedException; > > public abstract Object get (String infoID) > throws SAXNotSupportedException; David, OK, this is more like it! You have now defined an interface which is broad enough to fit all of MDSAX under. Remember that filters also implement the parser interface. And so do DOMWalkers. Bill xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Mon Mar 8 13:43:01 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:44 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: Hi Bill, In a message dated 3/7/99 10:03:55 PM Eastern Standard Time, b.laforge@jxml.com writes: > Well, that might depend on the job of the filter. You may want to use a > filter > to prune out the parts of the document you are not interested in BEFORE > the DOM is built. I agree with you. I was not saying that access to the DOM was the only way to write a filter. Just that filters can be based on walking a DOM Document tree as you state below. > > In general, I see several places where you might want to use a filter: > > o Transform events from a parser into something to be output. > > o Transform events from a parser before being accessed by an > application. > > o Between a parser and the DOM. > > o Transform events from a DOM walker into something to be output. > Best wishes, - Mike (mdaconta@aol.com) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Mon Mar 8 14:54:21 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:44 2004 Subject: SAX: ModSAX addition, general property query Message-ID: <18b603b2.36e3e337@aol.com> Hi David, In a message dated 3/8/99 9:10:40 AM Eastern Standard Time, david@megginson.com writes: > What: Additions to ModParser interface > > I'm proposing a couple of additions to the ModParser interface: > > public interface ModParser extends Parser > { > public abstract void setFeature (String featureID, boolean state) > throws SAXNotSupportedException; > public abstract void setHandler (String handlerID, ModHandler handler) > throws SAXNotSupportedException; > public abstract void set (String infoID, Object prop) > throws SAXNotSupportedException; > public abstract Object get (String infoID) > throws SAXNotSupportedException; > } > > These allow you to do interesting things like > > parser.set("http://www.foo.com/props/textfilter", filter); > > or > > try { > Node node = parser.get("http://xml.org/sax/props/dom-node"); > } catch (SAXNotRecognizedException e1) { > // doesn't know about DOM processing... > } catch (SAXNotSupportedException e2) { > // knows about DOM processing, but not doing it... > } > I think the success of a general set() and get() capability will be based on the creation of a good initial set of descriptors (what you called infoID) to get or set. So, in that vein, I have 2 comments: 1. I still strongly urge not to use a URI for a feature or infoID. These are not resource locations they are just a descriptive string. 
In fact, I bet that most parsers just implement your initial recommended set.

2. I'd recommend that constants be defined in the interface for the initial set of standard features and infoIDs. Something like:

  public static final String VALIDATE = "sax.feature.validation";
  public static final String DOCUMENT = "sax.dom.Document";

Then I can do this:

  try {
    parser.setFeature(ModParser.VALIDATE, true);
  } catch (SAXNotRecognizedException e1) {
    // doesn't know about validation
  } catch (SAXNotSupportedException e2) {
    // Does not support validation
  }

Best wishes,

- Mike (mdaconta@aol.com)

From cowan at locke.ccil.org Mon Mar 8 15:05:47 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <36E3E712.D5556233@locke.ccil.org>

David Megginson wrote:

> public abstract void setFeature (String featureID, boolean state)
> throws SAXNotSupportedException;

I want to propose a restriction and an extension:

1) This method cannot be called after any other parser method has been invoked.

2) This method is allowed to throw a SAXNewParserException, which encapsulates a replacement parser. The application should use the parser inside the exception in place of the original parser.

This allows parsers to push filters on top of themselves, which complements the ability of applications to push them.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo?
Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Mon Mar 8 15:09:34 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E32900.BBDF43C0@jclark.com>
Message-ID: <36E3E80B.C3E55F16@locke.ccil.org>

James Clark scripsit:

> I can easily imagine wanting to
> expand external general entities declared in the internal subset, but
> not wanting to read an external DTD.

Or, indeed, the converse: I might want to get the whole DTD but make my own decisions about loading external general entities.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
From david at megginson.com Mon Mar 8 15:15:47 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <18b603b2.36e3e337@aol.com>
References: <18b603b2.36e3e337@aol.com>
Message-ID: <14051.59370.316671.640337@localhost.localdomain>

MikeDacon@aol.com writes:

> 1. I still strongly urge not to use a URI for a feature or infoID.
> These are not resource locations they are just a descriptive
> string. In fact, I bet that most parsers just implement your
> initial recommended set.

Yes, but what about filters that perform specialised actions? And what about adding support (stable or experimental) for new XML-related features like schemas, datatyping, and linking as they become available?

The problem with SAX 1.0 is that it froze the XML status quo of about a year ago, and many interesting things have happened since then; with ModSAX, I'd like to leave the API open for two reasons:

1. so that we can extend it without breaking existing implementations; and

2. so that people can experiment with different ways of supporting new features within the SAX framework.

As I wrote before, it doesn't much matter whether we use Java property names incorporating domain names (like 'org.xml.sax.features.validation') or URIs (like 'http://xml.org/sax/features/validation'), as long as we have the ability for people to create new names without fear of collision.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
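The two failure cases discussed in this thread (a feature name the parser has never heard of, versus one it knows but cannot honour) might be sketched roughly as below. This is only an illustration under assumptions, not the ModSAX API: `FeatureRegistry`, `trySet` and both exception classes are hypothetical stand-ins named after the exceptions in the quoted interface, and the "not recognized" case is assumed to be a subclass of "not supported" because the proposed methods declare only the latter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the SAXNotSupportedException named in the proposal.
class FeatureNotSupportedException extends Exception {
    FeatureNotSupportedException(String message) { super(message); }
}

// Assumption: "not recognized" is a special case of "not supported",
// so a single throws clause covers both.
class FeatureNotRecognizedException extends FeatureNotSupportedException {
    FeatureNotRecognizedException(String message) { super(message); }
}

// Hypothetical registry keyed by feature ID. The ID is treated as an opaque,
// unique string, so URI-style and Java-style names can coexist.
class FeatureRegistry {
    private final Map<String, Boolean> canEnable = new HashMap<>();
    private final Map<String, Boolean> state = new HashMap<>();

    // Declare a name this parser knows, and whether it can switch it on.
    void declare(String featureId, boolean supported) {
        canEnable.put(featureId, supported);
        state.put(featureId, false);
    }

    void setFeature(String featureId, boolean on) throws FeatureNotSupportedException {
        Boolean supported = canEnable.get(featureId);
        if (supported == null) {
            throw new FeatureNotRecognizedException("unknown feature: " + featureId);
        }
        if (on && !supported) {
            throw new FeatureNotSupportedException("cannot enable: " + featureId);
        }
        state.put(featureId, on);
    }

    boolean isOn(String featureId) {
        return Boolean.TRUE.equals(state.get(featureId));
    }

    // Application-side pattern: catch the more specific exception first.
    static String trySet(FeatureRegistry registry, String featureId, boolean on) {
        try {
            registry.setFeature(featureId, on);
            return "ok";
        } catch (FeatureNotRecognizedException e) {
            return "not recognized";
        } catch (FeatureNotSupportedException e) {
            return "not supported";
        }
    }
}
```

Because the registry never interprets the name, the choice between the two naming styles only affects how collisions are avoided, not how the lookup works.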
From cowan at locke.ccil.org Mon Mar 8 15:18:11 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <004b01be691c$f348fc40$c9a8a8c0@thing2>
Message-ID: <36E3E967.F0D6690B@locke.ccil.org>

Bill la Forge wrote:

> o Event objects for one.

But event objects are very easy to build on top of the existing SAX. Just do it!

> o A way to specify a filter to a DOM-building-parser is another.
> o Better integration with the DOM in general.

The chief problem here is that SAX doesn't provide all the information that a DOM builder needs, notably the default value of attributes.

> I'm sure others have their own feature list.

If we can standardize feature control, then feature lists can be implemented in parsers or parser filters.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
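As a rough illustration of the claim that event objects are easy to layer on top of SAX callbacks, here is a minimal sketch. It is an assumption-laden example rather than anything proposed in the thread: `SaxEvent` and `EventCollector` are hypothetical names, and the code uses the later SAX2 `DefaultHandler` that ships with the JDK instead of the SAX 1.0 `DocumentHandler` under discussion here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical event object: one instance per SAX callback.
class SaxEvent {
    final String kind;   // "start", "end" or "text"
    final String value;  // element name or character data

    SaxEvent(String kind, String value) {
        this.kind = kind;
        this.value = value;
    }

    @Override
    public String toString() {
        return kind + ":" + value;
    }
}

// Converts each callback into a SaxEvent and stores it in document order.
class EventCollector extends DefaultHandler {
    final List<SaxEvent> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add(new SaxEvent("start", qName));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add(new SaxEvent("end", qName));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver one run of text in several chunks.
        events.add(new SaxEvent("text", new String(ch, start, length)));
    }

    // Convenience wrapper; checked exceptions are rethrown unchecked for brevity.
    static List<SaxEvent> collect(String xml) {
        try {
            EventCollector collector = new EventCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), collector);
            return collector.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A filter or application can then queue, replay, or route the collected events without the parser knowing anything about them.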
From jtauber at jtauber.com Mon Mar 8 15:26:24 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:44 2004
Subject: URIs for features (was Re: SAX RFD: ModSAX Predefined Features)
Message-ID: <01c501be6977$3a60ce00$0300000a@othniel.cygnus.uwa.edu.au>

>...and RDF [1]. Giving interesting entities URIs makes them more
>fully a part of the Web, and means we can take advantage of
>URI-oriented metadata. Eg. you might search a software database for
>resources that were of type 'Perl Module' and that implemented the
>feature known as 'http://xml.org/sax/features/validation'. (There's
>already a Linux Packages Database[2] along similar lines...). I'm not
>claiming that this would be impossible using the Java naming scheme,
>just that a Web oriented approach might make it easier to do certain
>things...

I wonder if this could be extended to more general features of XML software, not just SAX parsers.

I wouldn't mind trying this out with XMLSOFTWARE.COM (http://www.xmlsoftware.com/). One of the problems that I have is with a canonical form of feature values for XML software like platform. URIs might provide just the solution.

A Java 2 XSL processor conforming to the WD-xsl from 16th December 1998 might be specified in terms of

    http://java.sun.com/products/jdk/1.2/

and

    http://www.w3.org/TR/1998/WD-xsl-19981216

Actually, now that I think of it, we already have namespaces for content. They are called notations. There seems to be some link here.

James
From costello at mitre.org Mon Mar 8 15:56:06 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun 7 17:09:44 2004
Subject: Architectural Forms Questions
References:
Message-ID: <36E3F30C.F6D6DB51@mitre.org>

Hi Folks,

We have some beginner's questions on Architectural Forms. The motivation for this message is our interest in creation, discovery, sharing and reuse of mappings.

- How powerful is the correspondence that you can express with Architectural Forms? Is it essentially limited to renaming and omission?

- In addition to using Architectural Forms to express correspondences that are known a priori, could you use them to document mappings that are discovered "on-the-fly" by modifying a document or DTD after a mapping is discovered?

- It appears to be the case that the correspondence between A and B must be documented in a way that keeps the mapping tightly coupled to either A or B. Are there any plans to represent the correspondence so that it is not tightly coupled to either A or B?

- Is it a correct interpretation to say that Architectural Forms represent correspondence by overloading existing language constructs?

- Given that subtyping and inheritance have been part of the primary XML "schema" proposals, is it likely that XML Architectural Forms will be overtaken by advances in the XML schema area?

Thanks.
/Roger
From jmcdonou at library.berkeley.edu Mon Mar 8 17:09:35 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun 7 17:09:45 2004
Subject: Opinions requested
In-Reply-To: <19990306154022.B22308@io.mds.rmit.edu.au>
References: <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu> <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>
Message-ID: <3.0.5.32.19990308090238.00c74c90@library.berkeley.edu>

Thanks for the update on SIM. It's definitely more advanced in its development than I thought. A few additional comments, and a clarification:

At 03:40 PM 3/6/1999 +1100, Marcelo Cantos wrote:
>More important than any specifics, however, is the issue of what you
>call a DBMS. To me, a DBMS is a database management system (seems
>painfully obvious, but I think it bears repeating). You may argue
>that a product is not a DBMS if it does not support feature X, and I
>don't entirely disagree. When one talks of a DBMS one is conjuring up
>a certain image in the mind of the listener, and that image may well
>include feature X. To be fair to SIM, however, the essence of a DBMS
>is that it manages a collection of data. If it doesn't support
>transactions, this does not entail that it does not manage data.
>Rather it simply has limits on the way the data is managed (i.e. it
>doesn't manage data as well as one would like).
>
>You clearly believe that transaction support is part of the essence of
>what makes a DBMS. I disagree, indeed, I profoundly disagree.
There
>is nothing in the concept of a database that mandates any such
>requirement. Rather I would say that transaction support is an
>important issue for any _good_ DBMS. Likewise for referential
>integrity and concurrency (and, for that matter, support for
>declarative queries, use of indexes, a rich set of fundamental data
>types, etc.). If I recall correctly, dBase III was generally
>acknowledged to be a DBMS though it lacked most of these requirements,
>and could barely even call itself relational!

I agree with all of the above, and I didn't mean to particularly single out transaction support. In addition to the point you raise that a DBMS calls to mind a particular set of features (not all of which need to be present to qualify a system as a DBMS), I'd add that particular systems are developed based on previous work within a particular paradigm (oh man, referencing Kuhn before I've even had coffee -- been a grad student too long) and I see SIM as much more following in the lineage of IR systems than DBMS systems. I'll grant there's overlap, and SIM is obviously moving towards a graceful integration of the two areas, but I'd characterize it as moving from an IR engine towards a combined IR/DBMS system.

>I guess this all boils down to what's in a name. At the end of the
>day, it is far more important to know what a product does and does not
>do than what you call it.
>

Agreed, but as you mentioned, particular names invoke an understanding of what a system does/what features it may be expected to support, etc. While these understandings may overlap from one person to the next, often they don't, and I think DBMS are an example of an area where they can mean quite different things to different people. Hence, the frequency of people saying 'DBMS don't handle SGML/XML' occurring side by side with people saying 'what, are you crazy? Of course they do.'

>I am sceptical that any RDBMS vendor can come to the party in terms of
>performance.
Past attempts to try to force text into a relational,
>table or object based paradigm have not reaped great success (Oracle's
>ConText comes to mind as an example of how forcing a square peg into a
>round hole requires sacrificing the edges of performance). I would be
>surprised if any of the major database vendors would be prepared to
>venture away from their core competency (the relational model) to
>address the performance issues.
>

I share your skepticism, but we can hope. If nothing else, there appears to be at least the dawnings of an understanding among the major DBMS vendors that there's a huge market for text management/retrieval products. Some of the approaches taken by the object-oriented database folks, like Informix's data blades, struck me as having promise.

>I strongly disagree that SIM doesn't handle SGML/XML well.

Ah, now here, I'm afraid you're reading words into my mouth. To clarify, I think SIM handles SGML/XML very well indeed; one of the best I've seen, in fact. I said I don't think any DBMS handles SGML/XML well, but I also excluded SIM from the DBMS category. Sorry, I should have been clearer about that.

From what you've said, though, SIM does appear to be shaping up as a very interesting IR/DBMS hybrid. The referential integrity hooks are a very nice plus. I have one piece of advice: promote yourselves more! :) I looked over the SIM web site before my post, and didn't see any discussion of the new features you're working on. A few words about future directions you're exploring for your product would be a good thing.

Jerome McDonough -- jmcdonou@library.Berkeley.EDU   |  (......)
Library Systems Office, 386 Doe, U.C. Berkeley      |  \ * * /
Berkeley, CA 94720-6000 (510) 642-5168              |   \ <> /
"Well, it looks easy enough...."                    |    \ -- /
SGNORMPF!!! -- From the Famous Last Words file      |     ||||
From elharo at metalab.unc.edu Mon Mar 8 17:17:40 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:09:45 2004
Subject: Java Specification Request for XML
In-Reply-To: <36DD9EA1.2CEE7CEA@eng.sun.com>
Message-ID:

At 12:42 PM -0800 3/3/99, David Brownell wrote:

>The Java Community Process is an open, inclusive process and we
>look forward to the active participation of all interested parties.
>

The process, and its relative openness, is a little more obvious if you remove the passive voice. Compare this:

>The process goes forward in several steps:
>
>[1] The JSR is presented for comment (as you've seen)
>[2] The JSR is approved (we hope)
>[3] An expert group is formed to write the specification; this
>    begins with a "Call for Experts" (CAFE) to participate.
>[4] The expert group writes a first draft of the specification
>[5] The draft is circulated to all Java technology licensees and
>    Participants in the Java Community Process.
>[6] Comments are collected, read, and responded to by the expert
>    group, resulting in an improved specification.
>[7] The refined specification is then released to the public for
>    comment.
>[8] Comments from the public are collected, read, and responded
>    to by the expert group, resulting in more refinements.
>[9] The final specification is produced by the expert group, along
>    with a reference implementation and compatibility tests.
>

to this:

[1] Sun presents the JSR for comment (as you've seen)
[2] Sun's Process Management Office approves the JSR.
[3] Sun forms an expert group to write the specification; this begins with a "Call for Experts" (CAFE) to participate. [Sun chooses the leader of the group, who then chooses the remainder of the experts.]
[4] The expert group writes a first draft of the specification
[5] Sun circulates the draft to all Java technology licensees and Participants in the Java Community Process. [that is, companies who have paid Sun thousands of dollars to do this]
[6] The expert group collects, reads, and responds to comments, resulting in an improved specification.
[7] Sun releases the refined specification to the public for comment.
[8] The expert group collects, reads, and responds to comments, resulting in more refinements.
[9] The expert group produces the final specification, along with a reference implementation and compatibility tests.

>The key point is that everyone with internet access will get a
>chance to review and comment on the emerging specification.
>

They can review and comment. There's no promise that anyone will even listen to their comments, much less act on them.

There are a number of aspects of this "open" process that aren't mentioned here.

1. It costs between $2,000 (educational) and $5,000 (commercial) to participate as an expert.

2. Sun owns the copyright and other intellectual property rights related to the spec. As owner, they will not allow derivative works they decide are incompatible.

3. Participants in the expert group can't talk about the ongoing work with outsiders.

4. Only company employees are allowed to be experts. Freelancers like many of those who participated in the development of SAX and XML are excluded. This is similar to W3C procedures, but the W3C allows exceptions for recognized experts. Sun does not.

To me these alone make it pretty clear that this process is open in name only. If you're still not convinced, ask yourself these questions:

1. Can anyone tell Sun No?
Can anyone keep Sun from putting something into the spec that they want to put in? Or put something in that Sun wants to keep out?

2. Can Sun's enemies (i.e. Microsoft, HP, etc.) participate in this process on an equal footing with Sun? Can they even participate at all?

Bottom line: The openness of this process is PR, pure and simple. When you actually read the fine print, all Sun does is agree to let other companies contribute their time, money, and knowledge to help Sun do what it wants to do anyway. That may be intelligent business, but it's not an open, community based process for developing standards.

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML: Extensible Markup Language (IDG Books 1998)          |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/  |
|  Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/     |
+----------------------------------+---------------------------------+
From david at megginson.com Mon Mar 8 19:52:05 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E3E712.D5556233@locke.ccil.org>
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org>
Message-ID: <14052.10627.837114.651600@localhost.localdomain>

John Cowan writes:

> David Megginson wrote:
>
> > public abstract void setFeature (String featureID, boolean state)
> >   throws SAXNotSupportedException;
>
> I want to propose a restriction and an extension:
>
> 1) This method cannot be called after any other parser method
> has been invoked.

Wouldn't it be better to allow the parser/filter to make that decision? If the user attempts to change something during a parse that should *not* be changed during a parse, the parser/filter can throw a SAXNotSupportedException.

> 2) This method is allowed to throw a SAXNewParserException, which
> encapsulates a replacement parser. The application should use
> the parser inside the exception in place of the original parser.
> This allows parsers to push filters on top of themselves, which
> complements the ability of applications to push them.

I think that this could be layered on top of SAX, simply by subclassing SAXNotSupportedException.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
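The suggestion that the parser itself should decide when a feature may no longer be changed could look something like the sketch below. All names here (`GuardedParser`, `ChangeNotSupportedException`, `trySetValidating`) are hypothetical, and the single "validating" flag merely stands in for whatever features a real parser would expose.

```java
// Hypothetical stand-in for SAXNotSupportedException.
class ChangeNotSupportedException extends Exception {
    ChangeNotSupportedException(String message) { super(message); }
}

// A parser that accepts feature changes freely, except while a parse is running.
class GuardedParser {
    private boolean parsing = false;
    private boolean validating = false;

    void setValidating(boolean on) throws ChangeNotSupportedException {
        if (parsing) {
            // The parser, not the API contract, decides that this is too late.
            throw new ChangeNotSupportedException("validation cannot change during a parse");
        }
        validating = on;
    }

    boolean isValidating() { return validating; }

    // Markers for the start and end of a parse; a real parser would set these itself.
    void beginParse() { parsing = true; }
    void endParse() { parsing = false; }

    // Returns true if the change was accepted, false if it was refused.
    static boolean trySetValidating(GuardedParser parser, boolean on) {
        try {
            parser.setValidating(on);
            return true;
        } catch (ChangeNotSupportedException e) {
            return false;
        }
    }
}
```

A different parser could apply a looser or stricter rule without any change to the method signature, which is the flexibility being argued for.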
From tomh at thinlink.com Mon Mar 8 22:02:37 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
References: <18b603b2.36e3e337@aol.com> <14051.59370.316671.640337@localhost.localdomain>
Message-ID: <36E44898.CB8E18C4@thinlink.com>

David Megginson wrote:

> As I wrote before, it doesn't much matter whether we use Java property
> names incorporating domain names (like
> 'org.xml.sax.features.validation') or URIs (like
> 'http://xml.org/sax/features/validation'), as long as we have the
> ability for people to create new names without fear of collision.

I would also urge against using an http: URI since it is not meant that a resource actually be retrieved using the http protocol.
From cowan at locke.ccil.org Mon Mar 8 22:12:55 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org> <14052.10627.837114.651600@localhost.localdomain>
Message-ID: <36E44B40.4303A066@locke.ccil.org>

David Megginson wrote:

> Wouldn't it be better to allow the parser/filter make that decision?

Yes.

> > 2) This method is allowed to throw a SAXNewParserException, which
> > encapsulates a replacement parser. The application should use
> > the parser inside the exception in place of the original parser.
> > This allows parsers to push filters on top of themselves, which
> > complements the ability of applications to push them.
>
> I think that this could be layered on top of SAX, simply by
> subclassing SAXNotSupportedException.

Yes, but by making it part of the core SAX protocol for setting features, we guarantee universal support for it. A parser that knows itself to be naive about namespaces can load the NamespaceFilter and push it on top of itself, almost transparently to the application. Otherwise, every application that wants namespace support needs specialized knowledge about how to recover from SAXNotSupportedExn.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
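The replacement-parser idea debated above can be sketched as follows. This is a guess at the shape of the mechanism, not the proposed API: `SimpleParser`, `NewParserException`, `NaiveParser`, `NamespaceFilterParser` and `ParserSetup` are all hypothetical names, and the feature ID used in the usage note is only an example string.

```java
// Hypothetical minimal parser interface for this sketch.
interface SimpleParser {
    void setFeature(String featureId, boolean on) throws NewParserException;
    String describe();
}

// Hypothetical analogue of the proposed SAXNewParserException:
// it carries the parser the application should use from now on.
class NewParserException extends Exception {
    private final SimpleParser replacement;

    NewParserException(SimpleParser replacement) {
        super("use the replacement parser");
        this.replacement = replacement;
    }

    SimpleParser getReplacement() { return replacement; }
}

// A parser that knows it is naive about namespaces and wraps itself in a filter.
class NaiveParser implements SimpleParser {
    @Override
    public void setFeature(String featureId, boolean on) throws NewParserException {
        if (on && featureId.endsWith("/namespaces")) {
            throw new NewParserException(new NamespaceFilterParser(this));
        }
        // Other features are silently accepted in this sketch.
    }

    @Override
    public String describe() { return "naive"; }
}

// Stand-in for a filter pushed on top of the original parser.
class NamespaceFilterParser implements SimpleParser {
    private final SimpleParser inner;

    NamespaceFilterParser(SimpleParser inner) { this.inner = inner; }

    @Override
    public void setFeature(String featureId, boolean on) {
        // The filter handles namespaces itself, so nothing to do here.
    }

    @Override
    public String describe() { return "namespace-filter(" + inner.describe() + ")"; }
}

// The one piece of recovery logic every application would need.
class ParserSetup {
    static SimpleParser enable(SimpleParser parser, String featureId) {
        try {
            parser.setFeature(featureId, true);
            return parser;
        } catch (NewParserException e) {
            return e.getReplacement();
        }
    }
}
```

With this shape, an application calls `ParserSetup.enable(parser, "http://xml.org/sax/features/namespaces")` and keeps whatever comes back, without knowing whether a filter was pushed.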
From MikeDacon at aol.com Mon Mar 8 22:27:50 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
Message-ID:

Hi David,

In a message dated 3/8/99 12:19:24 PM Eastern Standard Time, david@megginson.com writes:

> Yes, but what about filters that perform specialised actions? And
> what about adding support (stable or experimental) for new XML-related
> features like schemas, datatyping, and linking as they become
> available?

You are absolutely right that extensibility is important. And, as you also stated, both naming schemes provide that ability.

> As I wrote before, it doesn't much matter whether we use Java property
> names incorporating domain names (like
> 'org.xml.sax.features.validation') or URIs (like
> 'http://xml.org/sax/features/validation'), as long as we have the
> ability for people to create new names without fear of collision.

Why do you need a domain name in there? I think one Parser/Filter implementor would be loath to implement another company's feature name if it had sun.com or microsoft.com in it. That was the chief problem that developers had with Sun naming the Swing package com.sun.swing.

I thought your features would have a single root tree like:

    sax.feature

So that all features would be:

    sax.feature.whatever.myfeature

as well as

    sax.props (for properties)

Now, I understand the domain name being in there is a piggyback off of DNS. But, I still believe that functional features (of both Parser and Filters) are a finite domain -- whereas the web is not.
That is why I don't see the correlation between this feature set and XML namespaces. If you agree that features and props are a finite domain (and in the whole scheme of things a rather small one), then a single naming tree should suffice.

Also, Daniel Brickley mentioned a Java bias. I can understand his concern; heck, let's separate them with the delimiter of your choice (hyphens, underscore, etc.). While we are on the subject of bias: a URI has a resource/file system bias. To me, that bias was just confusing (and overkill) for something that I felt was best expressed with one word String constants (if you added the initial default set to the interface).

Lastly, I would like to say that I do like your idea for the general property query and am glad you proposed it. The naming concerns I express here I deem as minor issues.

Best wishes,
- Mike (mdaconta@aol.com)

From david at megginson.com Mon Mar 8 22:32:38 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <36E44898.CB8E18C4@thinlink.com>
References: <18b603b2.36e3e337@aol.com> <14051.59370.316671.640337@localhost.localdomain> <36E44898.CB8E18C4@thinlink.com>
Message-ID: <14052.19853.887104.987727@localhost.localdomain>

Tom Harding writes:

> David Megginson wrote:
>
> > As I wrote before, it doesn't much matter whether we use Java property
> > names incorporating domain names (like
> > 'org.xml.sax.features.validation') or URIs (like
> > 'http://xml.org/sax/features/validation'), as long as we have the
> > ability
for people to create new names without fear of collision.
>
> I would also urge against using an http: URI since it is not meant
> that a resource actually be retrieved using the http protocol.

I've been thinking about this issue, and I'm fairly convinced that the URI is the right choice. Think of the URI as a statement of ownership.

Assume that my ISP is host.net, and that I've been allocated 5MB of web space at http://host.net/foo/. I am the only one who has the right to make a resource available at http://host.net/foo/, so I am the one who has the (moral) right to construct feature IDs based on http://host.net/foo/.

It is not sufficient simply to use the domain name "host.net", because I don't own the domain (someone else could construct the same feature ID), and it is not sufficient to use something starting with net.host.foo, because I *don't* have the right to make something available at, say, ftp://host.net/foo/ -- host.net has made the foo available to me only through the HTTP protocol. Perhaps Foo enterprises has a download directory at ftp://host.net/foo/, and they might want to construct their own property ID based on it.

Namespaces seems to have got it right.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
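The ownership argument above can be made concrete with the standard `java.net.URI` class. In this small sketch the helper class `FeatureIds` and the relative name `features/validation` are hypothetical; the base addresses are the ones used as examples in the message.

```java
import java.net.URI;

// Hypothetical helper: builds feature IDs underneath a base URI the author controls.
class FeatureIds {
    // Resolve a relative feature name against the owned base URI.
    static String under(String ownedBase, String relativeName) {
        return URI.create(ownedBase).resolve(relativeName).toString();
    }
}
```

Here `FeatureIds.under("http://host.net/foo/", "features/validation")` and `FeatureIds.under("ftp://host.net/foo/", "features/validation")` yield two different strings, so the owner of the HTTP space and the owner of the FTP directory cannot collide, which a bare `net.host.foo` prefix would not guarantee.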
From david at megginson.com Mon Mar 8 22:42:49 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To:
References:
Message-ID: <14052.20696.226477.386853@localhost.localdomain>

MikeDacon@aol.com writes:

> Why do you need a domain name in there? I think one Parser/Filter
> implementor would be loath to implement another company's feature
> name if it had sun.com or microsoft.com in it. That was the chief
> problem that developers had with Sun naming the Swing package
> com.sun.swing.

A neutral .org domain usually provides a nice way around that problem.

> Now, I understand the domain name being in there is a piggyback off
> of DNS. But, I still believe that functional features (of both
> Parser and Filters) are a finite domain -- whereas the web is not.
> That is why I don't see the correlation between this feature set
> and XML namespaces. If you agree that features and props are a
> finite domain (and in the whole scheme of things a rather small
> one), then a single naming tree should suffice.

I expect the number of features to grow slowly, but I do not think that it is clearly bounded, especially not with all the XML-related work going on right now. A couple of years from now we could have data-typing, digital signing, and who knows what else.

Furthermore, I do not want to have to set up my own registration authority, and I do not want developers to have to wait for anyone to approve their feature names before they can ship.
Thanks, and all the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From Daniel.Brickley at bristol.ac.uk Mon Mar 8 22:58:18 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun 7 17:09:45 2004
Subject: Naming ModSAX features: good use for the 'java:' URI scheme?
In-Reply-To: <36E44898.CB8E18C4@thinlink.com>
Message-ID:

On Mon, 8 Mar 1999, Tom Harding wrote:

> David Megginson wrote:
>
> > As I wrote before, it doesn't much matter whether we use Java property
> > names incorporating domain names (like
> > 'org.xml.sax.features.validation') or URIs (like
> > 'http://xml.org/sax/features/validation'), as long as we have the
> > ability for people to create new names without fear of collision.
>
> I would also urge against using an http: URI since it is not meant that a resource actually be
> retrieved using the http protocol.

I think I've found a compromise of sorts that'll let us use the Java naming scheme (for those uncomfortable with naming conceptual entities in the http namespace), whilst still using URIs.

From http://www.w3.org/Addressing/schemes.html

    Addressing Schemes

    This is (an attempt at) an exhaustive list of URI schemes. I try to list them all, whether they're standard or not.

Under 'J' we find a useful looking entry...

    java:
        identifies java classes (@@spec?)
    javascript:

There's also a reference to a JavaRMI: URI schema invented by Bill Jansen, which would be interesting to track down. But anyway...

So...
here's the proposal:

Naming ModSAX Features

ModSAX is intended to be easily extensible, and is designed to anticipate future independently developed extensions ('features'). For ModSAX-aware software to cope with the decentralised evolution of new features, it is important to have a controlled mechanism for naming these features unambiguously. For this we adopt the Uniform Resource Identifier (URI) system defined in RFC 2396 [URI]. Each (version of a) ModSAX feature should be assigned a unique URI. It should not be assumed that these identifiers can always be dereferenced to acquire further information about the feature they name.

For example, the 'http:' and 'java:' schemes can be used. 'http://purl.org/net/sax/MyFeature' and 'java:org.desire.sax.MyFeature' are both legitimate names for SAX features. 'phone:+44-117-9287493' would not be an appropriate name, since the 'phone:' URI namespace can only be used for telephone numbers.

This way, people who manage http: URI names and want to use them to name SAX features are free to do so. Others can piggyback on the DNS via the java: scheme instead. Both follow the same overarching approach.

So... It would be nice to have a reference to some spec defining the 'java:' URI scheme mentioned at http://www.w3.org/Addressing/schemes.html Maybe somebody from Sun has a pointer to this...?

BTW as a side effect of having a URI scheme for Java classes and interfaces, we can exchange (aggregate, search, reason over) RDF metadata about those resources. This would be handy in Sun's JINI amongst other places.... Here's a quick and dull example of metadata keyed off a java: URI...

  Dan Brickley and Larry Franklin
  This applet is an attempt at a metadata browsing tree control

But I'm sidetracking again. I'm really just saying one thing: the existence of a URI schema for Java classes (and packages) means we don't need to choose between Java and URI naming formalisms. We can have the best of both worlds...
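As a rough illustration of the proposal, the sketch below uses the standard `java.net.URI` class to show that both example names parse as absolute URIs, and that they are compared as plain strings without ever being fetched. The helper class `FeatureUri` and its methods are assumptions made for this example; they are not part of any SAX release, and the 'java:' scheme is treated only as an opaque URI scheme, since no spec for it is cited here.

```java
import java.net.URI;

// Hypothetical helper, not part of any SAX release.
final class FeatureUri {
    /** Parses a feature name and insists that it is an absolute URI (has a scheme). */
    static URI parse(String featureName) {
        // URI.create throws IllegalArgumentException on bad syntax.
        URI uri = URI.create(featureName);
        if (!uri.isAbsolute()) {
            throw new IllegalArgumentException("feature name needs a scheme: " + featureName);
        }
        return uri;
    }

    /** Feature names are compared as plain strings and never dereferenced. */
    static boolean sameFeature(String a, String b) {
        return parse(a).toString().equals(parse(b).toString());
    }

    public static void main(String[] args) {
        URI viaHttp = parse("http://purl.org/net/sax/MyFeature");
        URI viaJava = parse("java:org.desire.sax.MyFeature");
        // An http: name has a host; a java: name is opaque (scheme plus a class-like path).
        System.out.println(viaHttp.getScheme() + " / " + viaHttp.getHost());
        System.out.println(viaJava.getScheme() + " / " + viaJava.getSchemeSpecificPart());
    }
}
```

Either form works as a key; the choice of scheme only affects who manages the name.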
Dan [URI] Uniform Resource Identifiers (URI): Generic Syntax; Berners-Lee, Fielding, Masinter, Internet Draft Standard August, 1998; RFC2396. http://www.isi.edu/in-notes/rfc2396.txt xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Mon Mar 8 23:01:23 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:09:45 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: References: Message-ID: * David Megginson | | - org.xml.sax.features.validation is more of a Java flavour. * Dan Brickley | | Yep... but might not feel so natural for developers working with | versions of SAX translated for Perl, Python and so on. I'll be translating this into Python and I see absolutely no problems with this from that point of view. It's a natural way to use the DNS as a basis for a naming system and Java just happens to use it. --Lars M. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Mon Mar 8 23:03:34 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:45 2004 Subject: SAX: ModSAX addition, general property query Message-ID: <8246f301.36e4560e@aol.com> Hi David, In a message dated 3/8/99 5:38:55 PM Eastern Standard Time, david@megginson.com writes: > Think of the URI a statement of ownership. Assume that my ISP is > host.net, and that I've been allocated 5MB of web space at > http://host.net/foo/. > This is the primary reason I disagree with using a URI. A feature is not a resource. Also, a standard interface to a set of features is not the place to invoke ownership privileges. You can't own a feature that you expect others to implement. Unless I am not getting your idea of a feature, your logic seems incorrect. Interesting discussion and process (well worth it), - Mike (mdaconta@aol.com) xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Mon Mar 8 23:09:36 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:09:45 2004 Subject: SAX: ModSAX addition, general property query In-Reply-To: <14052.19853.887104.987727@localhost.localdomain> Message-ID: On Mon, 8 Mar 1999, David Megginson wrote: > Tom Harding writes: > > David Megginson wrote: > > > > > As I wrote before, it doesn't much matter whether we use Java property > > > names incorporating domain names (like > > > 'org.xml.sax.features.validation') or URIs (like > > > 'http://xml.org/sax/features/validation'), as long as we have the > > > ability for people to create new names without fear of collision. > > > > I would also urge against using an http: URI since it is not meant > > that a resource actually be retrieved using the http protocol. > > I've been thinking about this issue, and I'm fairly convinced that the > URI is the right choice. > > Think of the URI a statement of ownership. Assume that my ISP is > host.net, and that I've been allocated 5MB of web space at > http://host.net/foo/. > [...] Just to head off one possible objection... that of the persistence (or lack of) w.r.t. http URLs. The PURL folks (Persistent URLs) make a credible case when they argue that URLs can be managed just as responsibly as URNs, and that persistence of http naming is a social issue not a technical one.
PURL servers are available to help here -- e.g. XML-DEV's own XSchema (now DDML) pages have been available from several different http servers, but have always had the same URI: http://purl.oclc.org/NET/xschema The PURL server at that address sends an HTTP redirect message if you try to dereference it. So we could, for example, use PURLs to name software features, with reassurance that PURL.ORG have committed to do their best to manage http://purl.org/* names responsibly. > Namespaces seems to have got it right. Yep. Dan
From b.laforge at jxml.com Mon Mar 8 23:13:49 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:45 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <01ab01be69b8$39f1cf00$c9a8a8c0@thing2> From: John Cowan >> > 2) This method is allowed to throw a SAXNewParserException, which >> > encapsulates a replacement parser. The application should use >> > the parser inside the exception in place of the original parser. >> > This allows parsers to push filters on top of themselves, which >> > complements the ability of applications to push them. >> >> I think that this could be layered on top of SAX, simply by >> subclassing SAXNotSupportedException. > >Yes, but by making it part of the core SAX protocol for setting >features, we guarantee universal support for it. A parser that knows >itself to be naive about namespaces can load the NamespaceFilter and >push it on top of itself, almost transparently to the application.
>Otherwise, every application that wants namespace support needs >specialized knowledge about how to recover from SAXNotSupportedExn.

There are really three approaches here:

1. An application pushes a filter "on top of" a parser. In this case, the application starts with a parser and chooses to augment it with a filter.
2. The application requests a feature of the parser and the parser elects to wrap itself in a filter. For efficiency reasons(?), it asks the application to now use the filter in place of itself.
3. An application works with a pseudo-parser. It asks for various features and the pseudo-parser selects a parser and a set of filters which together can deliver the requested capabilities.

I do like David's proposal--it's pretty open-ended. The method get(infoID) will even serve as a front-end for aggregation! But I see a problem in trying to go too far on the feature selection path. The assumption seems to be that we are dealing here with a completely orthogonal set of features which are just selected or not as needed. There is no sense of structure or architecture here. I'm not sure that this is a useful model. Frankly, I much prefer Simon's layered approach: http://www.simonstl.com/articles/layering/layered.htm Again, I'm happy with the interface, but this idea of creating filter structures based on feature selection seems a bit lame. Bill xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Mon Mar 8 23:18:18 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:09:46 2004 Subject: SAX: ModSAX addition, general property query In-Reply-To: <8246f301.36e4560e@aol.com> Message-ID: On Mon, 8 Mar 1999 MikeDacon@aol.com wrote: > Hi David, > > In a message dated 3/8/99 5:38:55 PM Eastern Standard Time, > david@megginson.com writes: > > Think of the URI a statement of ownership. Assume that my ISP is > > host.net, and that I've been allocated 5MB of web space at > > http://host.net/foo/. > > > > This is the primary reason I disagree with using a URI. > A feature is not a resource. Software features aren't files, nor are they HTML pages, but they are 'resources' as defined in RFC2396 and as used in the XML Namespaces and RDF recommendations from W3C. I'm getting *really* boring on this topic... ;-) From RFC2396 (online at http://www.isi.edu/in-notes/rfc2396.txt): A Uniform Resource Identifier (URI) is a compact string of characters for identifying an abstract or physical resource. [...] Resource A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources. The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time.
> Also, a standard interface to a set > of features is not the place to invoke ownership privileges. You can own (or manage) the name for the feature though. Javasoft own all the URIs beginning 'java:java.lang.*'; I own the URIs beginning 'java:org.desire.rudolf.rdf.*'. These can name classes or interfaces others might implement. Dan > You can't own a feature that you expect others to implement. > > Unless I am not getting your idea of a feature, your logic seems > incorrect. > > Interesting discussion and process (well worth it), > > - Mike (mdaconta@aol.com)
From b.laforge at jxml.com Tue Mar 9 00:19:02 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:46 2004 Subject: SAX: ModSAX addition, general property query Message-ID: <021a01be69c1$a49538c0$c9a8a8c0@thing2> From: David Megginson >I expect the number of features to grow slowly I suspect otherwise. Especially since the interface would also be used by filters and DOMWalkers. Think of the get and set methods as ways of accessing the properties on filters which are part of some larger filter structure (a stack being the simplest case). In addition to parse events moving from parser-kernel to application via a series of filters and event routers, the get and set "events" move from the application through the filters and down to the parser-kernel. Think of the parser and the filters together as a large aggregate of components. The get, set, setFeature, and setHandler may well be intercepted by any component in that aggregate which recognizes the featureID, handlerID, or infoID.
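The flow described above, where get and set requests travel from the application through the filters to the parser-kernel and are intercepted by whichever component recognizes the ID, can be sketched roughly as follows. Everything here is hypothetical: `FilterChain`, `Component`, `Kernel`, `Filter` and the `urn:example:*` IDs are invented for illustration and are not the ModSAX interfaces, and an unchecked `IllegalArgumentException` stands in for the SAX exceptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: these types are not the ModSAX interfaces.
final class FilterChain {
    /** Stand-in for the get/set half of the proposed ModParser interface. */
    interface Component {
        Object get(String infoId);
        void set(String infoId, Object value);
    }

    /** Innermost component; rejects any ID it does not recognize. */
    static final class Kernel implements Component {
        private final Map<String, Object> props = new HashMap<>();

        Kernel(String... recognizedIds) {
            for (String id : recognizedIds) {
                props.put(id, null); // known ID, no value yet
            }
        }

        @Override public Object get(String infoId) {
            requireKnown(infoId);
            return props.get(infoId);
        }

        @Override public void set(String infoId, Object value) {
            requireKnown(infoId);
            props.put(infoId, value);
        }

        private void requireKnown(String infoId) {
            if (!props.containsKey(infoId)) {
                throw new IllegalArgumentException("not recognized: " + infoId);
            }
        }
    }

    /** A filter intercepts the one ID it owns and forwards everything else inward. */
    static final class Filter implements Component {
        private final String ownedId;
        private final Component inner;
        private Object value;

        Filter(String ownedId, Component inner) {
            this.ownedId = ownedId;
            this.inner = inner;
        }

        @Override public Object get(String infoId) {
            return ownedId.equals(infoId) ? value : inner.get(infoId);
        }

        @Override public void set(String infoId, Object newValue) {
            if (ownedId.equals(infoId)) {
                value = newValue;
            } else {
                inner.set(infoId, newValue);
            }
        }
    }

    public static void main(String[] args) {
        Component chain = new Filter("urn:example:trim",
                new Filter("urn:example:ns", new Kernel("urn:example:validation")));
        chain.set("urn:example:validation", Boolean.TRUE); // reaches the kernel
        chain.set("urn:example:ns", "on");                 // intercepted by the inner filter
        System.out.println(chain.get("urn:example:validation"));
        System.out.println(chain.get("urn:example:ns"));
    }
}
```

The application only ever talks to the outermost component, which is why the set of recognized IDs grows with every filter added to the stack.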
I see the ModParser interface, as currently defined, as being very important for filters, with the number of featureIDs growing with the popularity of such filters. Bill
From tbray at textuality.com Tue Mar 9 00:34:35 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:46 2004 Subject: Opinions requested Message-ID: <3.0.32.19990308103203.00e7d2cc@pop.intergate.bc.ca> At 09:02 AM 3/8/99 -0800, Jerome McDonough wrote: >I share your skepticism, but we can hope. If nothing else, there appears >to be at least the dawnings of an understanding among the major DBMS >vendors that there's a huge market for text management/retrieval products. >Some of the approaches taken by the object-oriented database folks, like >Informix's data blades, struck me as having promise. There's the rub. *Is* there really a huge market for text management/retrieval? The history of software is littered with the corpses of companies who tried to make a go of it in that area; I know from personal experience that up to and through the year 1996, there was *not* any such huge market. Will XML change that? It would be nice to think so. -Tim xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Tue Mar 9 00:52:11 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:09:46 2004 Subject: Namespaces and DTDs Message-ID: <36E49A4D.413D71F3@metalab.unc.edu> Situation: I have several DTDs with conflicting definitions of certain elements (e.g. one defines a HEAD as a TITLE followed by a META and another defines a HEAD as #PCDATA). I need to use all the DTDs and associated markup languages for a single document. To an extent I can disambiguate them with namespaces. However, is there any way I can do this while still validating against the original DTDs? That is, without rewriting the DTDs to use the qualified names instead of the original names that are in the DTDs? I've been trying to work with default values for xmlns attributes, and the like; but that doesn't seem to get me quite all the way to where I need to go. Am I going to have to break down and just rewrite the DTDs to use the qualified names? -- Elliotte Rusty Harold xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dent at oofile.com.au Tue Mar 9 02:03:11 1999 From: dent at oofile.com.au (Andy Dent) Date: Mon Jun 7 17:09:46 2004 Subject: Expat API In-Reply-To: References: <49092BAEAC84D2119B0600805FD40F9F120DBD@MDYNYCMSX1> Message-ID: >My question is: where is the documentation on how to use the expat >API? I downloaded version 1.0.2 and ported the code to run the sample >program on my Macintosh, but I'm pretty much dead in the water. I >tried sending email to the author (James Clark) twice in the last few >days, but I have so far failed to receive a response. The comments in >the header files do not seem to be sufficient. Dave We have a c++ wrapper on expat running under CodeWarrior as part of a much bigger project to make our report writer interchange data with XML. You're welcome to a copy. It makes the expat API a LOT easier to use if you are a c++ programmer as it presents a virtual method interface to expat - you inherit from our object and override the methods (eg: startElement) that you want to use. When it's a bit more cleaned up with better samples I'll be submitting it back to James. Andy Dent BSc MACS AACM, Software Designer, A.D. Software, Western Australia OOFILE - Database, Reports, Graphs, GUI for c++ on Mac, Unix & Windows PP2MFC - PowerPlant->MFC portability http://www.highway1.com.au/adsoftware/crossplatform.html xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From avirr at LanMinds.Com Tue Mar 9 05:31:02 1999 From: avirr at LanMinds.Com (Avi Rappoport) Date: Mon Jun 7 17:09:46 2004 Subject: Opinions requested In-Reply-To: <3.0.32.19990308103203.00e7d2cc@pop.intergate.bc.ca> Message-ID: At 4:37 PM -0800 3/8/1999, Tim Bray wrote: > At 09:02 AM 3/8/99 -0800, Jerome McDonough wrote: >>I share your skepticism, but we can hope. If nothing else, there appears >>to be at least the dawnings of an understanding among the major DBMS >>vendors that there's a huge market for text management/retrieval products. >>Some of the approaches taken by the object-oriented database folks, like >>Informix's data blades, struck me as having promise. > > There's the rub. *Is* there really a huge market for text > management/retrieval? The history of software is littered with the > corpses of companies who tried to make a go of it in that area; I > know from personal experience that up to and through the year 1996, > there was *not* any such huge market. Will XML change that? It > would be nice to think so. -Tim The Web has certainly raised the profile for text retrieval, and the amount of text online is larger than it's ever been. A lot of text-management turns out to be going on in relational databases, and those are pretty big business. But the large content-management companies -- Verity, Open Text, Fulcrum (bought by PCDOCS, itself recently bought by someone else) -- seem to be going through wild stock price variations recently. I've no idea what the future market will be: I find it all mystifying!
BTW, Lisa Rein has written a report on the Query Language '98 workshop at W3C last year: http://www.xml.com/xml/pub/1999/03/quest/index.html It looks quite comprehensive to me, and all the position papers indicate that the topic is a hot one. Avi ________________________________________________________________ Avi Rappoport, Search Tools Maven: Guide to Site Indexing and Local Search Engines:
From db at eng.sun.com Tue Mar 9 05:36:44 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:46 2004 Subject: ModSax Suggestion References: Message-ID: <36E4B1FA.E482164@eng.sun.com> > > Interesting suggestion for a big hole in the parts of > > the Java API set that are more or less "standard" at > > this point -- SAX and DOM. > > > > One comment though: I've found that it's important to > > be able to have options controlling how the DOM tree is > > built. For example, whether to discard ignorable spaces, > > or do namespace conformance enforcement, or try to get > > CDATA sections (comments, etc). > > > > I agree with that. I think all that is possible while still retaining > a minimalist design philosophy. [deletia] > > That way via an extensible common set of text properties we > can add properties as the need arises without expanding the API. I've always liked the idea of filters in the SAX event chain. As Bill la Forge (and you) noted, that's a fine way to address that general issue. One can overdo layers, of course, and pay for it in performance.
But filters are a good architectural notion, and there's been lots of discussion about how to use them well with SAX and DOM. That does imply keeping DOM out of the basic parser API, which I still think is the best way to go. An event generator (say, a SAX parser, or something walking a DOM tree) can have its events filtered, and delivered to a component building a DOM tree. > Looking forward to progress on the Java XML API. BTW, Dave, > are you going to do a "Birds of a Feather" session on XML at this year's > JavaOne? I think that could be valuable. I may be signed up for more than that this time... A BOF on XML -- an XML-DEV BOF! -- would be lots of fun. Some of the folk here have never met in person. I think there will be lots of interesting applications to talk about ... and probably some interesting frameworks. - Dave
From db at eng.sun.com Tue Mar 9 05:55:30 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:46 2004 Subject: SAX: ModSAX addition, general property query References: <14051.46670.687235.664451@localhost.localdomain> Message-ID: <36E4B666.411E15C2@eng.sun.com> OK, I'll pick this thread rather than the longer one to read first... XML-DEV really generates lots of traffic lately!! - I agree re using URIs, like Namespaces do. Anyone can get a URI nowadays, for virtually no cost, but that's not true of reversed domain names (as used in Java properties and package names). - There will need to be some strong policies for how the "things" to which an {info,handler,feature}ID map are documented.
I think that leadership by example can play a strong role here ... :-) Related point, that policy should specify the status of the "thing". For example, "stable", "beta", "experimental", "private", to pick an order where folk should be progressively less willing to use or implement the "thing" in a parser.

- I'd like a "getHandler" API ... or perhaps, eliminate the notion of 'feature' and 'handler' IDs and just use "infoID" values that map to the appropriate handlers. I've found it important to be able to do things like, say, "use the error handler everyone else is using". (Where's getFeature? One can return a Boolean from a "get" ...)

Re that last point, I might have missed some e-mail and will try to catch up. It's not clear why there's a need for more than a single general get/set API for this.

- Dave

David Megginson wrote:
>
> What: Additions to ModParser interface
>
> I'm proposing a couple of additions to the ModParser interface:
>
> public interface ModParser extends Parser
> {
>   public abstract void setFeature (String featureID, boolean state)
>     throws SAXNotSupportedException;
>
>   public abstract void setHandler (String handlerID, ModHandler handler)
>     throws SAXNotSupportedException;
>
>   public abstract void set (String infoID, Object prop)
>     throws SAXNotSupportedException;
>
>   public abstract Object get (String infoID)
>     throws SAXNotSupportedException;
> }
>
> These allow you to do interesting things like
>
>   parser.set("http://www.foo.com/props/textfilter", filter);
>
> or
>
>   try {
>     // get() returns Object, so the result needs a cast
>     Node node = (Node) parser.get("http://xml.org/sax/props/dom-node");
>   } catch (SAXNotRecognizedException e1) {
>     // doesn't know about DOM processing...
>   } catch (SAXNotSupportedException e2) {
>     // knows about DOM processing, but not doing it...
>   }
>
> Again, it's a little sloppy as an interface, but it's beautifully
> extensible and it supports filters nicely (if there are other filters
> between the DOM iterator and the application, it will still work).
>
> Note that strictly speaking, now, setHandler() and setFeature() are no
> longer primitives, since they could both be implemented in terms of
> set(), but I think that the extra type checking is worthwhile in those
> cases.
>
> All the best,
>
> David
>
> --
> David Megginson david@megginson.com
> http://www.megginson.com/

From db at eng.sun.com Tue Mar 9 06:27:39 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:46 2004 Subject: SAX RFD: ModSAX Predefined Features References: <004b01be691c$f348fc40$c9a8a8c0@thing2> Message-ID: <36E4BDC9.DB06F185@eng.sun.com> Lars Marius Garshol wrote: > > * Bill la Forge > > | So that's why I'm butting in here. I think an open standards process > | is important for individuals and small companies. We need to do what > | we can to keep the ball rolling here. > > We are certainly in heartfelt agreement here. :) Gee, as a wage-slave working for a big company, I hope that I'm not _too_ excluded from the discussions ... :-) Seriously: my personal model is a lot more akin to the original IETF style "running code and working consensus" model than most existing standards bodies.
I'm a lot happier with standards that come from such a process than from ones that involve fat specs that can't be implemented. Writing code is generally more fun than specs -- though an elegant spec is also a work of art! - Dave
From db at eng.sun.com Tue Mar 9 06:57:28 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:46 2004 Subject: SAX RFD: ModSAX Predefined Features References: <14051.3215.196642.22571@localhost.localdomain> Message-ID: <36E4C4E6.B51DDFF3@eng.sun.com> Again, I think that unifying these under the generic get/set API (with Boolean.TRUE and Boolean.FALSE objects as values for features that are really boolean) could be useful. Documentation for each feature should specify whether it's changeable mid-parse ... I'd suggest "no" as the default answer! Mike Dacon commented about the "API archaeology" aspect of this name; perhaps the "Parser2" style naming convention can avoid losing technical context (i.e. this is still a parser, even if it's parsing a DOM or a stream of SAX events :-). > 1. http://xml.org/sax/features/validation Good. (I'm curious if folks prefer one parser, which can have this feature toggled, vs two, where the parser comes with at least an initial value.) > 2A. http://xml.org/sax/features/external-general-entities > 2B. http://xml.org/sax/features/external-parameter-entities Right, two kinds of parsed entities, two control knobs. Validating parsers must refuse to change these knobs. (OK, _five_ kinds of parser -- validating, and four kinds of nonvalidating parser! ;-) > 3.
http://xml.org/sax/features/namespaces I'd rather have this just kick in modified XML syntax rules (e.g. entity names may never be scoped, and scoped names may have only one interior colon). With that, one can layer the rest of namespace processing on top in any of several fashions. A DOM can be built which exposes namespace declarations; or a filter can munge names and strip out the declarations. The "munge" feature could get its own namespace URI. > 4. http://xml.org/sax/features/unbuffered-input > True means ensure that the parser does not buffer input from a > Reader or InputStream supplied by the application (actually, > one-character look-ahead will usually be required); false means do > not ensure that the parser does not buffer input. This feature might > be useful for reading multiple documents from a single stream. I'm not sure this is a common enough feature to need to be predefined ... support for "XML Islands" within HTML may become important, but much of this can be done (at least in Java) by requiring pushback to be done at appropriate points. > http://xml.org/sax/features/normalize-text This is a good filter feature, I think. Lars suggested a "Catalog" feature. There are different sorts of catalog, and they need configuration, so the value of this could be a URI for the catalog, not just a boolean. Plus, this would seem to be up to the "EntityResolver" to handle ... yes? It'd perhaps suggest that one could ask the next filter in the stream for the resolver it was using ... :-) Good discussion, gang! - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lucio.piccoli at one2one.co.uk Tue Mar 9 08:58:11 1999 From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI) Date: Mon Jun 7 17:09:46 2004 Subject: version within XML Message-ID: <3601a91c.090299@smtpgate1.ONE2ONE.CO.UK> Hi all, I am seeking info on versioning XML documents. I have seen it done in a few different ways. Specifically, what are the issues in ensuring backward compatibility between versions? Any help is appreciated. adios -lucio --------------------------------------------------------------------- One2One LUCIO.PICCOLI@one2one.co.uk Elstree Tower tel : +44 181 214 3847 Elstree Way Borehamwood fax :+44 181 214 2325 LONDON WD6 1DT __________ http://www.one2one.co.uk _____________
From rbourret at ito.tu-darmstadt.de Tue Mar 9 09:43:35 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:46 2004 Subject: Namespaces and DTDs Message-ID: <01BE6A19.882B5360@grappa.ito.tu-darmstadt.de> Elliotte Rusty Harold wrote: > I have several DTDs with conflicting definitions of certain elements. > (e.g. one defines a HEAD as a TITLE followed by a META and another > defines a HEAD as #PCDATA). I need to use all the DTDs and associated > markup languages for a single document.
>
> To an extent I can disambiguate them with namespaces. However, is there
> any way I can do this while still validating against the original DTDs?
> That is without rewriting the DTDs to use the qualified names instead of
> the original names that are in the DTDs? I've been trying to work with
> default values for xmlns attributes, and the like; but that doesn't seem
> to get me quite all the way to where I need to go. Am I going to have to
> break down and just rewrite the DTDs to use the qualified names?

If you want to use a namespace-unaware parser, I don't see how you can avoid rewriting the DTDs. Unless the names in the DTDs are qualified, you will have two elements with the same name (e.g. "HEAD"), which is a validation error. And even assuming that this isn't immediately flagged, I can see no way for a namespace-unaware parser to figure out which content model to validate against when it encounters one of the duplicated element names: If prefixes are used, the name won't match any of the DTD names; if prefixes are not used (due to use of defaults), the name will match multiple DTD names.

Note that this problem is not limited just to validation. At the very least, it applies to retrieving default attribute values as well.

-- Ron Bourret
From Michael.Kay at icl.com Tue Mar 9 10:00:50 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:09:46 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <93CB64052F94D211BC5D0010A80013310EB364@wwmessd3.bra01.icl.co.uk>

> I've been thinking about this issue, and I'm fairly convinced
> that the URI is the right choice.
>
> Think of the URI as a statement of ownership. Assume that my ISP is
> host.net, and that I've been allocated 5MB of web space at
> http://host.net/foo/.
>

I don't often disagree with David, but I think this is quite misguided.

If we're only after a unique identifier we could use the longitude and latitude of the house where I live. In fact that would be better, because it identifies a unique place, whereas the "http:" idea also says you can get there by bus and the buses are run by the host.net bus company: in fact it invites you to "click here" to jump on the bus. But if you get on the bus and ask for the destination the driver will tell you "Never heard of it, guv."

And of course it ignores the fact that you can have two buses going to the same place from different directions.

Just because Namespaces made this mistake (and confused all newbies by doing so) doesn't mean we have to as well.

Mike Kay
From Daniel.Brickley at bristol.ac.uk Tue Mar 9 10:25:49 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun 7 17:09:46 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB364@wwmessd3.bra01.icl.co.uk>
Message-ID:

On Tue, 9 Mar 1999, Kay Michael wrote:

> > I've been thinking about this issue, and I'm fairly convinced
> > that the URI is the right choice.
> >
> > Think of the URI as a statement of ownership. Assume that my ISP is
> > host.net, and that I've been allocated 5MB of web space at
> > http://host.net/foo/.
> >
> I don't often disagree with David, but I think this is quite misguided.
>
> If we're only after a unique identifier we could use the longitude and
> latitude of the house where I live.

Great. Why not propose a URI scheme for it? (although this would also confuse people as a place is something you'd look up on a map, not a software feature.)

> In fact that would be better, because it
> identifies a unique place, whereas the "http:" idea also says you can get
> there by bus and the buses are run by the host.net bus company: in fact it
> invites you to "click here" to jump on the bus. But if you get on the bus
> and ask for the destination the driver will tell you "Never heard of it,
> guv."
>
> And of course it ignores the fact that you can have two buses going to the
> same place from different directions.

The URI spec very clearly does not ignore this point.

From RFC 2396 again... (http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt)

1.2. URI, URL, and URN

   A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URI that identify resources via a representation of their primary access mechanism (e.g., their network "location"), rather than identifying the resource by name or by some other attribute(s) of that resource.

   [...]

   Although many URL schemes are named after protocols, this does not imply that the only way to access the URL's resource is via the named protocol. Gateways, proxies, caches, and name resolution services might be used to access some resources, independent of the protocol of their origin, and the resolution of some URL may require the use of more than one protocol (e.g., both DNS and HTTP are typically used to access an "http" URL's resource when it can't be found in a local cache).

> Just because Namespaces made this mistake (and confused all newbies by doing
> so) doesn't mean we have to as well.

Making the same mistake as the rest of the world has its benefits though: if we use URIs for ModSAX features, we get for free any progress on better naming infrastructure (URNs, metadata, resolution infrastructure layered over the Web caching network etc). If we invent another nameless, specless naming system, we're on our own.

Dan

--
Daniel.Brickley@bristol.ac.uk
Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/
University of Bristol, Bristol BS8 1TN, UK. phone:+44(0)117-9288478
From tug at wilson.co.uk Tue Mar 9 10:43:27 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <083401be6a19$94195f50$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: Kay Michael
To: XML Developers' List
Sent: 09 March 1999 09:54
Subject: RE: SAX: ModSAX addition, general property query

>> I've been thinking about this issue, and I'm fairly convinced
>> that the URI is the right choice.
>>
>> Think of the URI as a statement of ownership. Assume that my ISP is
>> host.net, and that I've been allocated 5MB of web space at
>> http://host.net/foo/.
>>
>I don't often disagree with David, but I think this is quite misguided.

I agree - I don't actually see the benefit of using a string identifier at all:

I don't think that it's unreasonable to insist that objects representing a Feature, Handler or Property should either implement a distinct interface or subclass a distinct class. If this is so the Parser can tell what Feature, Handler or Property is being set by enquiring of the type of the object. (I favour insisting that they subclass distinct classes because (in Java) that naturally imposes the restriction that a single object can only represent a single Property.) The get() member function could take a Class parameter.

The advantage of this approach is that it relies only on the type naming scheme of Java and there are already well established mechanisms that ensure that different implementers create distinct types.
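
The type-based identification described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not part of any SAX or ModSAX interface: ParserProperty, ValidationFeature, EncodingProperty and TypedPropertyParser are all hypothetical names, and the sketch uses Java generics, which were not available when this thread was written.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class: every settable property subclasses it exactly once.
abstract class ParserProperty {}

// Hypothetical example properties, each identified purely by its own class.
final class ValidationFeature extends ParserProperty {
    final boolean enabled;
    ValidationFeature(boolean enabled) { this.enabled = enabled; }
}

final class EncodingProperty extends ParserProperty {
    final String name;
    EncodingProperty(String name) { this.name = name; }
}

// Hypothetical parser fragment: no string identifiers, the type is the key.
class TypedPropertyParser {
    private final Map<Class<? extends ParserProperty>, ParserProperty> properties =
            new HashMap<>();

    // The parser learns which property is being set by asking for the object's class.
    void set(ParserProperty value) {
        properties.put(value.getClass(), value);
    }

    // get() takes a Class parameter; returns null if that property was never set.
    <T extends ParserProperty> T get(Class<T> type) {
        return type.cast(properties.get(type));
    }
}
```

With this shape, `parser.set(new ValidationFeature(true))` followed by `parser.get(ValidationFeature.class)` returns the stored object, and two implementers cannot collide unless they use the same fully qualified class name.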
I am by no means an expert in the other languages that are supported by SAX - would this approach cause dreadful problems in other languages?

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From db at eng.sun.com Tue Mar 9 11:09:31 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:09:47 2004
Subject: Java Specification Request for XML
References:
Message-ID: <36E4FF9B.8CA35A47@eng.sun.com>

Elliotte Rusty Harold wrote:
>
> >The Java Community Process is an open, inclusive process and we
> >look forward to the active participation of all interested parties.
>
> The process, and its relative openness, is a little more obvious if you
> remove the passive voice. Compare this:

When you change it to what you wrote, it is no longer correct. Some key points:

- No, Sun doesn't need to submit all JSRs. Any Participant can do so. We did for this one, to help jumpstart the process; many people want to see a Java Platform API for XML.

- Yes, Sun's Program Management Office (vs. say Ken Starr) approves or rejects submitted JSRs.

- No, the leader of the expert group doesn't need to be from Sun. The group formed by that leader, from the pool of volunteer experts and from external invited experts, is supposed to be a diverse cross section. This is auditable.

- Re cost to be a "Participant", I had the same comment. The fee can be waived for invited experts. And note that the fee is less than an expert's time will cost -- much less!
Sun is working with this process in good faith, though you seem to fear otherwise.

Re other processes ... I don't think anyone's quite figured out how to make the "open source" processes drive established software companies. Like many leading companies, Sun is taking steps in that direction. But at least for this year, that isn't a useful class of processes to measure against.

> >The key point is that everyone with internet access will get a
> >chance to review and comment on the emerging specification.
>
> They can review and comment. There's no promise that
> anyone will even listen to their comments, much less act on them.

No, there _is_ a promise they'll be listened to; and I understand the action will at least include a response.

Have you ever participated in the comment process for an IEEE spec? One submits comments, and gets formal responses. (I seem to recall it being restricted to paid-up IEEE members though.) That's the model to keep in mind -- not the "black hole" model you've described. Again, this is auditable.

> There are a number of aspects of this "open" process that aren't mentioned
> here.

Paraphrasing points I didn't mention above:

- Copyright and other Intellectual Property Rights. Hmm, wouldn't you just hate to base a product on a specification, and then find that you've got to fork over $5K/copy to use it? Have a look at what any of the "Open Source" license agreements (e.g. MPL2) say about such issues.

- Derivative works. Nobody wins if people are allowed to ship things as "compatible" that really aren't; that's what the compatibility test suite is there to help ensure: "Write Once, Run Anywhere" does not come without effort, and it's a Big Deal.

- Pillow talk. It's supposed to be private.

- Of course non-corporate experts exist; always have, always will. And they can participate too.

> To me these alone make it pretty clear, that this process is open in name
> only. If you're still not convinced, ask yourself these questions:
>
> 1. Can anyone tell Sun No? Can anyone keep Sun from putting something into
> the spec they want to put it in? Or put something in that Sun wants to keep
> out?

If the Expert Group disagrees with Sun's representative, that could happen. I'd hope it wouldn't -- but it could happen.

> 2. Can Sun's enemies (i.e. Microsoft, HP, etc.) participate in this process
> on an equal footing with Sun? Can they even participate at all?

Can those companies participate? Absolutely. Though I don't think that they've wanted to do so -- going purely by what the press has been seen to report.

> Bottom line: The openness of this process is PR, pure and simple.

So is that glass half full, or half empty? :-)

"Openness" fits on a spectrum. I think that this process compares favorably with most other standards processes I've seen.

- Dave

From Andy.Bradbury at syntegra.bt.co.uk Tue Mar 9 11:18:45 1999
From: Andy.Bradbury at syntegra.bt.co.uk (Andy.Bradbury@syntegra.bt.co.uk)
Date: Mon Jun 7 17:09:47 2004
Subject: X for eXtensible DBMS?
Message-ID: <65AF45D5E535D2118AFB0008C7FA23180C3D08@FL-EXCHANGE-03>

The only IMS I ever came across was hardly what I'd call 'extensible' - not unless you actually *like* taking a whole database down in order to create or modify a single extra link ;,)

Regards

Andy B.

-----Original Message-----
From: Smith, Adrian [mailto:asmith@drumbeat.com]
Sent: 05 March 1999 17:19
To: 'Jeffrey E. Sussna'; 'Chad Adams'; xml-dev@ic.ac.uk
Subject: RE: Opinions requested

There actually is an XDBMS. It predates XML. This dates back to around 1965/1966.
The database created was titled "IMS" for Information Management System; it was created by IBM and used a hierarchical model for the data. It had all the same characteristics of XML with almost the exact same set of constructs and shortcomings.

Thanks!
Adrian

From b.laforge at jxml.com Tue Mar 9 12:00:06 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:47 2004
Subject: ModSax Suggestion
Message-ID: <005001be6a23$7e574240$c9a8a8c0@thing2>

From: David Brownell
>> Looking forward to progress on the Java XML API. BTW, Dave,
>> are you going to do a "Birds of a Feather" session on XML at this year's
>> JavaOne? I think that could be valuable.
>
>I may be signed up for more than that this time...
>
>A BOF on XML -- an XML-DEV BOF! -- would be lots of fun.
>Some of the folk here have never met in person. I think
>there will be lots of interesting applications to talk
>about ... and probably some interesting frameworks.

Simon and I proposed a Coins BOF some time back and JavaOne accepted it. Might be a good place to meet and discuss ModSAX, filters, and such.

Bill
From b.laforge at jxml.com Tue Mar 9 12:16:09 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:47 2004
Subject: ModSax Suggestion
Message-ID: <006301be6a25$c42a6700$c9a8a8c0@thing2>

From: David Brownell
>I've always liked the idea of filters in the SAX event chain.
>As Bill la Forge (and you) noted, that's a fine way to address that
>general issue. One can overdo layers, of course, and pay for it
>in performance. But filters are a good architectural notion, and
>there's been lots of discussion about how to use them well with
>SAX and DOM.
>
>That does imply keeping DOM out of the basic parser API, which
>I still think is the best way to go. An event generator (say,
>a SAX parser, or something walking a DOM tree) can have its
>events filtered, and delivered to a component building a DOM tree.

A filter can itself hold a stack of other filters, or even a set of filters to which events are routed based on some pattern. Being able to place just one filter in front of the DOM built by the parser is all you really need.

Using the ModParser interface, one can do the following:

1. Use setFeature to turn on DOM construction.
2. Use set to insert a filter in front of the DOM.
3. Parse a document.
4. Use get to retrieve the constructed DOM.

Bill
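
The four steps above (turn on DOM construction, put one filter in front of it, parse, fetch the DOM) can be sketched in runnable form. Since ModParser is only a proposal in this thread, the sketch does not use it and makes no claim about its signatures. It substitutes the SAX2 helper class XMLFilterImpl and a JAXP identity TransformerHandler writing to a DOMResult as stand-ins for the filter slot and the DOM builder. UppercaseNameFilter and FilteredDomBuilder are hypothetical names, and the upper-casing is just a visible transformation chosen for the example.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: upper-cases element names before they reach the DOM builder.
class UppercaseNameFilter extends XMLFilterImpl {
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        super.startElement(uri, localName.toUpperCase(Locale.ROOT),
                qName.toUpperCase(Locale.ROOT), atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri, localName.toUpperCase(Locale.ROOT),
                qName.toUpperCase(Locale.ROOT));
    }
}

class FilteredDomBuilder {
    static Document parse(String xml) {
        try {
            // Event generator: an ordinary namespace-aware SAX parser.
            SAXParserFactory parserFactory = SAXParserFactory.newInstance();
            parserFactory.setNamespaceAware(true);
            XMLReader reader = parserFactory.newSAXParser().getXMLReader();

            // Step 1 stand-in: DOM construction via an identity TransformerHandler.
            SAXTransformerFactory handlerFactory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler domBuilder = handlerFactory.newTransformerHandler();
            DOMResult result = new DOMResult();
            domBuilder.setResult(result);

            // Step 2 stand-in: one filter placed in front of the DOM builder.
            UppercaseNameFilter filter = new UppercaseNameFilter();
            filter.setParent(reader);
            filter.setContentHandler(domBuilder);

            // Step 3: parse. Step 4: retrieve the constructed DOM.
            filter.parse(new InputSource(new StringReader(xml)));
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

Because the filter may itself delegate to further filters, a single slot in front of the DOM builder is enough to host an arbitrary chain, which is the point being made in the message.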
From b.laforge at jxml.com Tue Mar 9 12:45:44 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <007801be6a29$d6efdd80$c9a8a8c0@thing2>

From: John Wilson
>I don't think that it's unreasonable to insist that objects representing a
>Feature, Handler or Property should either implement a distinct interface or
>subclass a distinct class. If this is so the Parser can tell what Feature,
>Handler or Property is being set by enquiring of the type of the object. (I
>favour insisting that they subclass distinct classes because (in Java) that
>naturally imposes the restriction that a single object can only represent a
>single Property.)

Filters often implement more than one (generally all) handler interface and then register themselves with the underlying parser/filter for the same events requested by the overlaying application/filter.

Your proposal would require the filter to instantiate separate objects for each set of events it needs to process, though it could simply pass-through the handlers for those it does not.

The role and class of an object are often distinct. This was one of the things I did not like about the aggregation scheme that was proposed by Sun a while back. I think David got it right.

Bill
From tug at wilson.co.uk Tue Mar 9 13:04:46 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <08b501be6a2d$3b34c820$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: Bill la Forge
To: John Wilson ; XML Developers' List
Sent: 09 March 1999 12:39
Subject: Re: SAX: ModSAX addition, general property query

>From: John Wilson
>>I don't think that it's unreasonable to insist that objects representing a
>>Feature, Handler or Property should either implement a distinct interface or
>>subclass a distinct class. If this is so the Parser can tell what Feature,
>>Handler or Property is being set by enquiring of the type of the object. (I
>>favour insisting that they subclass distinct classes because (in Java) that
>>naturally imposes the restriction that a single object can only represent a
>>single Property.)
>
>
>Filters often implement more than one (generally all) handler interface and
>then register themselves with the underlying parser/filter for the same events
>requested by the overlaying application/filter.
>
>Your proposal would require the filter to instantiate separate objects for each
>set of events it needs to process, though it could simply pass-through the handlers
>for those it does not.

Certainly you need to instantiate an object per handler; however, it need not be too ugly:

    public class MyFilter {
        public final DTDHandler dtdHandler = new DTDHandler() {
            ...
        };

        public final DocumentHandler documentHandler = new DocumentHandler() {
            ...
        };

        ....
    }

would seem to me to be a reasonable way of dealing with this.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From b.laforge at jxml.com Tue Mar 9 13:15:17 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:47 2004
Subject: Java Specification Request for XML
Message-ID: <008b01be6a2e$0592cfe0$c9a8a8c0@thing2>

From: David Brownell
>Re other processes ... I don't think anyone's quite figured
>out how to make the "open source" processes drive established
>software companies. Like many leading companies, Sun is
>taking steps in that direction. But at least for this year,
>that isn't a useful class of processes to measure against.

I suspect that a change to Open Source Software will depend on more than just vendors. Vendors need to be responsive to their customers, many of whom are still not with the new program.

I don't think this process can be driven entirely from the top. It would be risky for a vendor to get too far ahead of its "community".

So while open forums like XML-DEV are closer to the ideal, given the opportunity, I will be glad to participate in Sun's own process.

Bill
From richard at goon.stg.brown.edu Tue Mar 9 14:39:06 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:09:47 2004
Subject: Namespaces and DTDs
References: <01BE6A19.882B5360@grappa.ito.tu-darmstadt.de>
Message-ID: <36E5322A.7DDADDD8@goon.stg.brown.edu>

Ronald Bourret wrote:

> > I have several DTDs with conflicting definitions of certain elements.
> > ...Am I going to have to break down and just rewrite the DTDs to use
> > the qualified names?
>
> If you want to use a namespace-unaware parser, I don't see how you can
> avoid rewriting the DTDs.

Maybe I misunderstand, but as far as I can see, namespaces won't help you, either. Why? Because even if you can refer to, say, your two TITLE elements by different prefixes, you'll still have to declare the prefixed elements in the DTD as if they were atomic element names.

Namespaces, in other words, don't solve your problem. They may make it worse, in fact, because you have to know what prefixes you are going to declare in a given document to be able to rewrite your DTD to work with that document.

There was a furor two or three months ago on this list about namespaces breaking validation. That furor died down when the namespace spec became an official recommendation (a done deal, in other words). Just so you know, though: The issue you raise is just the sort of thing that caused the furor. People were expecting namespaces to help in just your situation. When they found out that namespaces didn't help, many were disappointed, and said so.
The most effective responses I saw were from people who said, in effect, "Namespaces do far less than you want or expect them to." The question in my mind is whether they actually get in the way.

(You won't hear any gripes from me if my take on namespaces turns out to be dead wrong.)

--

Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From simonstl at simonstl.com Tue Mar 9 14:55:56 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:09:47 2004
Subject: Java Specification Request for XML
In-Reply-To: <36E4FF9B.8CA35A47@eng.sun.com>
References:
Message-ID: <4.0.1.19990309092029.00f0f4b0@pop.hesketh.net>

David Brownell wrote:
>> >The Java Community Process is an open, inclusive process and we
>> >look forward to the active participation of all interested parties.

If I just had to take _your_ word for it, David, I'd definitely believe it. Your continued participation on these lists and your contributions to projects like SAX and ModSAX clearly indicate that you, at least, have an open mind when it comes to open source/open process models.

Unfortunately, when I visit Sun's site, and read the documentation surrounding the JCP, I'm decidedly unconvinced. Elliotte may have put Sun too deeply in the process in his description, but there's no getting around the pay to play principle that is deeply enshrined in this so-called open process.
I'm glad to hear you say that it can be waived for the expert group, though it certainly wasn't clear from the Web site. (It looks like it can be waived for the first year only.)

If Sun's approach involved only royalties-after-a-product-ships, I'd be a lot quieter. (I don't, after all, charge for the software I produce.) It's not, though. There are upfront fees ($5000 for non-educational entities, $2000 for non-profit or educational). (See http://developer.java.sun.com/developer/jcp/java_community_process.html for details. Most of the kickers are in the agreement, http://developer.java.sun.com/developer/jcp/JSPA.pdf.)

The JCP may feel like an 'open' process if you're a mammoth, or even if you're a reasonably well-off sabre-toothed tiger, but to us small mammals, it's the same old s***, different day, that we get from standards organizations. We get to run around among the mammoths and sabre-toothed tigers wearing funny lenses that blur our vision and working with tools that may not have been created with our needs in mind.

The price of _joining_ the process (as a partner, where it appears you do have more influence) is even more irritating because Sun is, after all, a vendor. If I really wanted to give Sun Microsystems a sizable check, I'd expect at least a Sparc 5 with a huge monitor to show up in return. Giving Sun $5000 so this poor company can manage a not-so-open process ('Process Cost Sharing') is ridiculous. Given that $5000 pays all my expenses for a few months, the cost to small business and self-employed folks is outrageous.

I'd love to participate in the process as a 'full' member, contributing time (which costs me something too), the standard currency for open source and open process participation, rather than a large sum of money that goes nowhere. I'll participate - as much as I'm allowed - but remember that the JCP is _far_ less open than the current ModSAX discussion, and I think the results of the JSR for XML are going to suffer as a result.
Enough of the populist ranting. We now return to the extremely open ModSAX discussion.

(p.s. It looks like David will be giving a presentation on this JSR at XTech. I'll be there, I assume he'll be there, and anyone else who's around and would like to take a close look at this thing should come by at 2:45 on Wednesday. Oh, and did I mention the price of conferences? Never mind, forget I said that.)

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From glv at vanderburg.org Tue Mar 9 15:32:39 1999
From: glv at vanderburg.org (Glenn Vanderburg)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org>
Message-ID: <36E53DA7.BF547D80@vanderburg.org>

John Cowan wrote:
>
> > public abstract void setFeature (String featureID, boolean state)
> >   throws SAXNotSupportedException;
>
> 2) This method is allowed to throw a SAXNewParserException, which
> encapsulates a replacement parser.

There are two problems with this.

First: let's not use exceptions to report non-error conditions. There are theoretical and practical reasons to restrict the use of Java exceptions to reporting errors. (On a related note, I would like to propose an explicit "boolean featureSupported(String featureID)" query method to make it possible to test for a feature without risking an exception. If anyone would like details of why it's bad to have exceptions as a part of normal control flow, let me know.)
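
The query method proposed in the parenthetical above might be used along these lines. This is a sketch under assumptions, not the ModSAX interface: ToyFeatureParser, isEnabled and enableIfSupported are hypothetical names invented for illustration, and only SAXNotSupportedException is a real SAX class. The feature URIs are the ones named elsewhere in this thread.

```java
import java.util.HashSet;
import java.util.Set;
import org.xml.sax.SAXNotSupportedException;

// Hypothetical parser fragment showing a query method next to setFeature.
class ToyFeatureParser {
    private final Set<String> supported;
    private final Set<String> enabled = new HashSet<>();

    ToyFeatureParser(Set<String> supported) {
        this.supported = new HashSet<>(supported);
    }

    // The proposed query: lets callers test for a feature without risking an exception.
    boolean featureSupported(String featureID) {
        return supported.contains(featureID);
    }

    // The exception is now reserved for genuine misuse (setting an unsupported feature).
    void setFeature(String featureID, boolean state) throws SAXNotSupportedException {
        if (!featureSupported(featureID)) {
            throw new SAXNotSupportedException(featureID);
        }
        if (state) {
            enabled.add(featureID);
        } else {
            enabled.remove(featureID);
        }
    }

    boolean isEnabled(String featureID) {
        return enabled.contains(featureID);
    }

    // Caller-side pattern: ask first, so normal control flow never goes through a catch.
    boolean enableIfSupported(String featureID) {
        if (!featureSupported(featureID)) {
            return false;
        }
        try {
            setFeature(featureID, true);
        } catch (SAXNotSupportedException e) {
            // Cannot happen after the check above; treated as a programming error.
            throw new IllegalStateException(e);
        }
        return true;
    }
}
```

The design choice is simply that "is this available?" becomes an ordinary boolean question, while the exception path signals a real error.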
Second: if an application needs to implement certain features by pushing filters from the bottom, it can encapsulate the entire process on its own, using a composite, and the process never needs to be exposed through the ModSAX API.

(I'm new to this discussion, so forgive me --- but let me know --- if I'm rehashing old debates.)

---glv

From glv at vanderburg.org Tue Mar 9 15:53:09 1999
From: glv at vanderburg.org (Glenn Vanderburg)
Date: Mon Jun 7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
References: <007801be6a29$d6efdd80$c9a8a8c0@thing2>
Message-ID: <36E540FA.61C3F574@vanderburg.org>

Bill la Forge wrote:
>
> From: John Wilson
> >I don't think that it's unreasonable to insist that objects
> >representing a Feature, Handler or Property should either implement
> >a distinct interface or subclass a distinct class. If this is so
> >the Parser can tell what Feature, Handler or Property is being set
> >by enquiring of the type of the object.
>
> Filters often implement more than one (generally all) handler
> interface and then register themselves with the underlying
> parser/filter for the same events requested by the overlaying
> application/filter.

Yes, and as written, John's proposal would require distinct handler objects for each feature, which would be bad. However, with a slight modification, it would work beautifully. Instead of using a string as a feature ID, use a type descriptor (in Java, an instance of java.lang.Class).
Feature handlers would be registered by supplying the Class object that represents the feature being implemented, along with a handler object that is assignable to that type. It seems probable to me that, whatever naming scheme is chosen for features, each feature will have a special interface that handlers must implement; if that's true, and Strings are used to identify features, we will effectively have two names for each feature. And using classes shares one of the good aspects of the URI solution: it piggybacks on the DNS to provide a ready-made collision-free global namespace. The only problem I see with this proposal is that it may not translate well to other languages. One possibility is for other languages to use the name of the corresponding Java interface as a feature name; for example, "org.xml.sax.NamespaceHandler". This may not be ideal, but does not seem too onerous. ---glv xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tug at wilson.co.uk Tue Mar 9 16:06:34 1999 From: tug at wilson.co.uk (John Wilson) Date: Mon Jun 7 17:09:47 2004 Subject: SAX: ModSAX addition, general property query Message-ID: <08e801be6a46$bb711d40$010a0a0a@home.wilson.co.uk> ----- Original Message ----- From: Glenn Vanderburg To: Bill la Forge Cc: John Wilson ; XML Developers' List Sent: 09 March 1999 15:40 Subject: Re: SAX: ModSAX addition, general property query >Bill la Forge wrote: >> >> From: John Wilson >> >I don't think that it's unreasonable to insist that objects >> >representing a Feature, Handler or Property should either implement >> >a distinct interface or subclass a distinct class. 
If this is so >> >the Parser can tell what Feature, Handler or Property is being set >> >by enquiring of the type of the object. >> >> Filters often implement more than one (generally all) handler >> interface and then register themselves with the underlying >> parser/filter for the same events requested by the overlaying >> application/filter. > >Yes, and as written, John's proposal would require distinct handler >objects for each feature, which would be bad. However, with a slight >modification, it would work beautifully. Instead of using a string >as a feature ID, use a type descriptor (in Java, an instance of >java.lang.Class). Feature handlers would be registered by supplying >the Class object that represents the feature being implemented, along >with a handler object that is assignable to that type. This seems to me to be an excellent suggestion;) John Wilson The Wilson Partnership 5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK +44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax) Mailto: tug@wilson.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Tue Mar 9 16:16:49 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:09:47 2004 Subject: ModSax Suggestion References: <005001be6a23$7e574240$c9a8a8c0@thing2> Message-ID: <36E547A3.41E513A@eng.sun.com> > >> Looking forward to progress on the Java XML API. BTW, Dave, > >> are you going to do a "Birds of a Feather" session on XML at this years > >> JavaOne? I think that could be valuable. > > > >I may be signed up for more than that this time... > > > >A BOF on XML -- an XML-DEV BOF! 
-- would be lots of fun. > >Some of the folk here have never met in person. I think > >there will be lots of interesting applications to talk > >about ... and probably some interesting frameworks. > > Simon and I proposed a Coins BOF some time back and > JavaOne accepted it. Might be a good place to meet and > discuss ModSAX, filters, and such. I thought they were doing BOF scheduling on a more typical schedule -- e.g. hold off for a month or two before the conference. Evidently not! I'll encourage someone else to do the legwork on setting up an XML, or XML-DEV, BOF ... I'll gladly show up! It's not looking like something I'll have time to arrange. I'm sure contact information is available via the java.sun.com website. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Tue Mar 9 17:03:20 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:47 2004 Subject: Namespaces and DTDs Message-ID: <01BE6A56.FD03D0D0@grappa.ito.tu-darmstadt.de> Richard L. Goerwitz wrote: > Maybe I misunderstand, but as far as I can see, namespaces won't help > you, either. Why? Because even if you can refer to, say, your two TITLE > elements by different prefixes, you'll still have to declare the prefixed > elements in the DTD as if they were atomic element names. > > Namespaces, in other words, don't solve your problem. They may make it > worse, in fact, because you have to know what prefixes you are going to > declare in a given document to be able to rewrite your DTD to work with > that document. 
> > There was a furor two or three months ago on this list about namespaces > breaking validation. That furor died down when the namespace spec became > an official recommendation (a done deal, in other words). You are correct. In today's environment (namespace-unaware parsers and no way to associate prefixes and URIs in the DTD), you must use the same prefixes in the DTD and the document for validation to work. I didn't state this because it was stated repeatedly during the aforementioned furor, which I sincerely hope this thread won't reignite. -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Mar 9 17:11:41 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:48 2004 Subject: SAX RFD: ModSAX Predefined Features References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org> <36E53DA7.BF547D80@vanderburg.org> Message-ID: <36E55607.332557F1@locke.ccil.org> Glenn Vanderburg wrote: > First: let's not use exceptions to report non-error conditions. There > are theoretical and practical reasons to restrict the use of Java > exceptions to reporting errors. We should take this off-line. I'll simply say: exceptions are suitable for reporting exceptional conditions. Having an object request its own replacement is certainly exceptional. > Second: if an application needs to implement certain features by > pushing filters from the bottom, The idea here is that an application may request a feature which a parser does not itself support, but can be adapted to support by pushing a filter between itself and the application. 
That of course requires that the application now talk to the filter instead. (In principle, the parser could act as an adapter for the filter, but that would complicate the bejesus out of it.) In Smalltalk, the parser could swap object ids with the filter using the become: method, but AFAIK no other OO language supports that.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From James.Anderson at mecomnet.de Tue Mar 9 17:21:20 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:09:48 2004
Subject: Namespaces and DTDs
References: <01BE6A19.882B5360@grappa.ito.tu-darmstadt.de>
Message-ID: <36E55BFE.C5DB6816@mecomnet.de>

? which of the "namespace aware" parsers will permit you to parse and validate a document for which portions of the dtd contain element declarations with ambiguous names - without first modifying the dtd? i've yet to hear a solution to the "ambiguous name" problem for xml-1.0/+ns conforming parsers.

Ronald Bourret wrote:
>
> Elliotte Rusty Harold wrote:
>
> > I have several DTDs with conflicting definitions of certain elements.
> > (e.g one defines a HEAD as a TITLE followed by a META and another
> > defines a HEAD as #PCDATA). I need to use all the DTDs and associated
> > markup languages for a single document.
> >
> > To an extent I can disambiguate them with namespaces. However, is there
> > any way I can do this while still validating against the original DTDs?
> > ...
>
> If you want to use a namespace-unaware parser, I don't see how you can
> avoid rewriting the DTDs.

...l

From david at megginson.com Tue Mar 9 17:22:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <083401be6a19$94195f50$010a0a0a@home.wilson.co.uk>
References: <083401be6a19$94195f50$010a0a0a@home.wilson.co.uk>
Message-ID: <14053.22490.504236.874846@localhost.localdomain>

John Wilson writes:

> I don't think that it's unreasonable to insist that objects representing a
> Feature, Handler or Property should either implement a distinct interface or
> subclass a distinct class. If this is so the Parser can tell what Feature,
> Handler or Property is being set by enquiring of the type of the object. (I
> favour insisting that they subclass distinct classes because (in Java) that
> naturally imposes the restriction that a single object can only represent a
> single Property.)

We wouldn't want to have to rely on discovering the class at runtime, so we'd have to have a method in the interface that reports a string ID anyway -- a lot more work for the same result.

All the best,

David

--
David Megginson david@megginson.com http://www.megginson.com/
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tug at wilson.co.uk Tue Mar 9 18:41:03 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <098d01be6a5c$57d56190$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: David Megginson
To: XML Developers' List
Sent: 08 March 1999 22:30
Subject: Re: SAX: ModSAX addition, general property query

>Tom Harding writes:
> > David Megginson wrote:
> >
> > > As I wrote before, it doesn't much matter whether we use Java property
> > > names incorporating domain names (like
> > > 'org.xml.sax.features.validation') or URIs (like
> > > 'http://xml.org/sax/features/validation'), as long as we have the
> > > ability for people to create new names without fear of collision.
> >
> > I would also urge against using an http: URI since it is not meant
> > that a resource actually be retrieved using the http protocol.
>
>I've been thinking about this issue, and I'm fairly convinced that the
>URI is the right choice.

I really have a problem with using URIs for this. RFC2396 (http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt) section 6 talks about URI Normalisation and equivalence. It says that URI equivalence is defined on a scheme basis. You have chosen the http scheme so we are presumably required to apply the http definition of URI equivalence. This does not seem to me to be a desirable criterion for equivalence.
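To make the concern concrete, here is a small Java sketch, assuming feature IDs are the http URIs quoted above. It uses java.net.URI, a library class that postdates this thread, as one possible stand-in for scheme-aware comparison; the class name, the helper methods, and the upper-cased variant of the ID are illustrative only and are not part of any SAX proposal.

```java
import java.net.URI;

class UriEquivalenceSketch {

    // Plain string comparison: every character must match exactly.
    static boolean sameAsStrings(String a, String b) {
        return a.equals(b);
    }

    // URI-aware comparison: java.net.URI ignores case in the scheme and host.
    static boolean sameAsUris(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        String canonical = "http://xml.org/sax/features/validation";
        String variant = "HTTP://XML.ORG/sax/features/validation";

        System.out.println(sameAsStrings(canonical, variant)); // false
        System.out.println(sameAsUris(canonical, variant));    // true
    }
}
```

Two parsers that pick different comparison rules would disagree about whether the variant names the validation feature, which is why the choice of rule would need to be stated explicitly.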
John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From tug at wilson.co.uk Tue Mar 9 18:41:04 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <098101be6a58$bc3b45e0$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: David Megginson
To: John Wilson
Cc: XML Developers' List
Sent: 09 March 1999 17:19
Subject: Re: SAX: ModSAX addition, general property query

>John Wilson writes:
>
> > I don't think that it's unreasonable to insist that objects representing a
> > Feature, Handler or Property should either implement a distinct interface or
> > subclass a distinct class. If this is so the Parser can tell what Feature,
> > Handler or Property is being set by enquiring of the type of the object. (I
> > favour insisting that they subclass distinct classes because (in Java) that
> > naturally imposes the restriction that a single object can only represent a
> > single Property.)
>
>We wouldn't want to have to rely on discovering the class at runtime,
>so we'd have to have a method in the interface that reports a string
>ID anyway -- a lot more work for the same result.

Testing the type at run time is a trivial operation in Java so I'm not sure why you say that we wouldn't want to rely on discovering the class at run time.
If there were some worry about the performance hit of iterating through all the supported interfaces (which I strongly doubt), the interface would have a method that reported a Class rather than a String. However, Glenn Vanderburg has suggested an amendment to my idea which seems to me to address your concerns.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From jes at kuantech.com Tue Mar 9 19:36:41 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:09:48 2004
Subject: RDF, ID's, XPtrs, and object orientation
Message-ID: <000001be6a63$d7c87de0$5118a8c0@kuantech1.quokka.com>

I am struggling with the following limitation caused by RDF's use of ID attributes: I want to use RDF in a truly object-oriented fashion. It lets me get really close but not quite there. I would like to use the "subPropertyOf" element to indicate overriding. However, since property names are IDs, I can't override by name. I could use XPointer to refer to overridden names (in effect referring to "the property whose name is foo and whose class is bar"), but I can't actually define the bar version of foo and the baz version of foo in the same document. Of course, if I could specify a key composed of multiple attributes, my problems would be solved. I realize I can also avoid the problem by putting each "class" in a separate document, but this causes problems of its own in my particular application.
If anyone has a hint as to how to get around this issue, that would be great, otherwise it's just food for thought. Jeff P.S. I am finding the problem of ID conflicts between "fragments" that need to be created separately and then combined into a single document to be a general one. My approach has been not to use ID attributes, but I don't have a choice if I'm using RDF. I suppose it will work as long as I don't validate, but I really want to validate. ----------------------------------------------------------------- Kuantech, Inc. http://www.kuantech.com Jeffrey E. Sussna, Principal jes@kuantech.com Distributed Content Architectures for Dynamic Online Applications ----------------------------------------------------------------- xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From glv at vanderburg.org Tue Mar 9 19:49:19 1999 From: glv at vanderburg.org (Glenn Vanderburg) Date: Mon Jun 7 17:09:48 2004 Subject: SAX RFD: ModSAX Predefined Features References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org> <36E53DA7.BF547D80@vanderburg.org> <36E55607.332557F1@locke.ccil.org> Message-ID: <36E57A78.87779A2C@vanderburg.org> > We should take this off-line. I'll simply say: exceptions are > suitable for reporting exceptional conditions. Having an object > request its own replacement is certainly exceptional. Well, yes and no. But I'd prefer to go the cleaner route of not allowing the object to request its own replacement. 
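One way to picture that cleaner route: the object the application holds stays the same, and any filter it needs is installed behind it. The sketch below is illustrative only; TextHandler, CoreParser, MergingFilter, and Driver are hypothetical names rather than anything in SAX or ModSAX, and the feature ID is made up.

```java
import java.util.ArrayList;
import java.util.List;

class EncapsulatedFilterSketch {

    // Hypothetical callback, loosely modelled on a characters event.
    interface TextHandler {
        void characters(String text);
    }

    // Hypothetical core parser: reports text in arbitrary four-character chunks.
    static class CoreParser {
        void parse(String text, TextHandler handler) {
            for (int i = 0; i < text.length(); i += 4) {
                handler.characters(text.substring(i, Math.min(text.length(), i + 4)));
            }
        }
    }

    // Hypothetical filter: buffers the chunks and forwards them as one event.
    static class MergingFilter implements TextHandler {
        private final StringBuilder buffer = new StringBuilder();
        private final TextHandler downstream;

        MergingFilter(TextHandler downstream) {
            this.downstream = downstream;
        }

        @Override
        public void characters(String text) {
            buffer.append(text);
        }

        void flush() {
            if (buffer.length() > 0) {
                downstream.characters(buffer.toString());
                buffer.setLength(0);
            }
        }
    }

    // The only object the application holds; it never asks to be replaced.
    static class Driver {
        static final String MERGE_TEXT = "example:merge-text"; // made-up feature ID

        private final CoreParser core = new CoreParser();
        private boolean mergeText;

        boolean setFeature(String featureID, boolean state) {
            if (!MERGE_TEXT.equals(featureID)) {
                return false;
            }
            mergeText = state;
            return true;
        }

        void parse(String text, TextHandler application) {
            if (!mergeText) {
                core.parse(text, application);
                return;
            }
            // The filter is pushed in between core parser and application, out of sight.
            MergingFilter filter = new MergingFilter(application);
            core.parse(text, filter);
            filter.flush();
        }
    }

    // Runs a parse and records every event the application would see.
    static List<String> collect(Driver driver, String text) {
        List<String> events = new ArrayList<>();
        driver.parse(text, events::add);
        return events;
    }

    public static void main(String[] args) {
        Driver driver = new Driver();
        System.out.println(collect(driver, "hello world").size()); // 3
        driver.setFeature(Driver.MERGE_TEXT, true);
        System.out.println(collect(driver, "hello world"));        // [hello world]
    }
}
```

The application calls setFeature and parse on the same Driver before and after the feature is switched on, so nothing about the internal rearrangement leaks out.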
> The idea here is that an application may request a feature which > a parser does not itself support, but can be adapted to support > by pushing a filter between itself and the application. Yes, I understand. > That > of course requires that the application now talk to the filter > instead. (In principle, the parser could act as an adapter for > the filter, but that would complicated the bejesus out of it.) It's not complicated at all --- merely a little tedious. It would be easy to provide a class in the helpers package that would make it almost trivial. My primary objection to the idea is precisely what you mentioned above: that it is an extremely unusual thing to happen. Programmers will be surprised by this behavior. Coupled with the fact that it's very easy to make it all transparent, I think exposing the parser's internal tricks is a bad idea. ---glv xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Mar 9 19:52:14 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:48 2004 Subject: SAX: ModSAX addition, general property query In-Reply-To: <36E540FA.61C3F574@vanderburg.org> References: <007801be6a29$d6efdd80$c9a8a8c0@thing2> <36E540FA.61C3F574@vanderburg.org> Message-ID: <14053.31455.289569.926503@localhost.localdomain> Glenn Vanderburg writes: > It seems probable to me that, whatever naming scheme is chosen for > features, each feature will have a special interface that handlers > must implement This is not the case. 
Some features will require special handlers, some will allow special handlers, and some will simply change the way existing handlers are used. For example, if you enable validation, you request that the parser report additional error states to the existing ErrorHandler; if you enable text-normalisation, you simply ask the parser to guarantee that there will never be two DocumentHandler.characters events in a row.

All the best,

David

--
David Megginson david@megginson.com http://www.megginson.com/

From david at megginson.com Tue Mar 9 19:53:20 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E53DA7.BF547D80@vanderburg.org>
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org> <36E53DA7.BF547D80@vanderburg.org>
Message-ID: <14053.31616.491923.652158@localhost.localdomain>

Glenn Vanderburg writes:

> Second: if an application needs to implement certain features by
> pushing filters from the bottom, it can encapsulate the entire process
> on its own, using a composite, and the process never needs to be
> exposed through the ModSAX API.

This is actually a good point. Since the SAX driver is usually a separate class rather than the parser itself, it would not be difficult for it to encapsulate any needed filters.

All the best,

David

--
David Megginson david@megginson.com http://www.megginson.com/
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Tue Mar 9 20:18:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <098101be6a58$bc3b45e0$010a0a0a@home.wilson.co.uk>
References: <098101be6a58$bc3b45e0$010a0a0a@home.wilson.co.uk>
Message-ID: <14053.33072.300457.335320@localhost.localdomain>

John Wilson writes:

> Testing the type at run time is a trivial operation in Java

... but not in other programming languages.

> so I'm not sure why you say that we wouldn't want to rely on
> discovering the class at run time.

In the end, you're doing the equivalent of testing for a string anyway -- you're just letting the Java class name serve as the unique ID. I don't see the advantage of forcing the users to get the unique ID through a circuitous route.

All the best,

David

--
David Megginson david@megginson.com http://www.megginson.com/
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tug at wilson.co.uk Tue Mar 9 21:24:42 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <09b201be6a73$243c7dc0$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: David Megginson
To: XML Developers' List
Sent: 09 March 1999 20:16
Subject: Re: SAX: ModSAX addition, general property query

>John Wilson writes:
>
> > Testing the type at run time is a trivial operation in Java
>
>... but not in other programming languages.
>
> > so I'm not sure why you say that we wouldn't want to rely on
> > discovering the class at run time.
>
>In the end, you're doing the equivalent of testing for a string anyway
>-- you're just letting the Java class name serve as the unique ID. I
>don't see the advantage of forcing the users to get the unique ID
>through a circuitous route.

You are testing for a value. Testing for a String, a Class or an int is, at that level, equivalent. The issue is: how do you choose the value? It so happens that Java provides a natural way for us to create a unique value. Other languages provide other ways of creating the unique value.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From nikita.ogievetsky at csfb.com Tue Mar 9 21:56:33 1999 From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita) Date: Mon Jun 7 17:09:48 2004 Subject: Namespaces and DTDs Message-ID: <9C998CDFE027D211B61300A0C9CF9AB4424719@SNYC11309> Richard L. Goerwitz wrote: >Ronald Bourret wrote: >> > I have several DTDs with conflicting definitions of certain elements. >> > ...Am I going to have to break down and just rewrite the DTDs to use >> > the qualified names? >> >> If you want to use a namespace-unaware parser, I don't see how you can >> avoid rewriting the DTDs. >Maybe I misunderstand, but as far as I can see, namespaces won't help >you, either. Why? Because even if you can refer to, say, your two TITLE >elements by different prefixes, you'll still have to declare the prefixed >elements in the DTD as if they were atomic element names. >Namespaces, in other words, don't solve your problem. They may make it >worse, in fact, because you have to know what prefixes you are going to >declare in a given document to be able to rewrite your DTD to work with >that document. I have a similar problem: On my web site http://www.cogx.com, I am working on XML driven menu bar (can be a tree, etc) The underlying XML uses reusable structures such as months, quarters of the year, Tax schedules with zillions of tax lines repeated, etc. Instead of having just one XML document for the menu bar, I moved reusable fragments into a separate file and access them from my main XML by or it is also obvious that I should not keep all fragments in one reusable collection, but rather separate them by theme. 
- Why should I send file with tax schedules to a guy interested in Opera performances? So I can have as many reusable collections as I wish: tax related, publications related, theater related, etc... It means I should allow freedom in specifying namespace prefixes and still know what each prefix means! I am achieving this by declaring my namespaces as follows: xmlns:ref="groups:www.cogx.com/xmlbar/ref-menu.xml" the prefix "groups:" tells me that a namespace of reusable fragments was defined Now I can give my prefix any name. When parsing I know that it is a namespace of reusable fragments! Problem here is that element has to be defined with an open model to allow for different namespace prefixes. I also made a proposal that it would be great to reserve "any" prefix for this type of situation. This will save me from using open model, which I do not like, really! > The most effective responses I saw were >from people who said, in effect, "Namespaces do far less than you want >or expect them to." Exactly! And this is why Namespaces let you do much more then you thought you can! Best regards, Nikita O. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Marc.McDonald at Design-Intelligence.com Tue Mar 9 22:49:16 1999 From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com) Date: Mon Jun 7 17:09:48 2004 Subject: Namespaces and DTDs Message-ID: A simple extension to namespaces could have fixed this problem: 1. Allow a DTD to be optionally specified along with the namespace prefix and URI 2. 
When an element is prefixed, parse it using the DTD associated with the namespace and the given prefix as the default. 3. If no DTD is associated with the prefix or not validating, do what is done now (ensure element is well-formed). Your DTDs would not need to be changed, you would just have to indicate which HEAD (for example) is desired in the content and add associated DTD urls to the namespace declarations. Marc B McDonald Principal Software Scientist Design Intelligence, Inc www.design-intelligence.com ---------- From: Ronald Bourret [SMTP:rbourret@ito.tu-darmstadt.de] Sent: Tuesday, March 09, 1999 9:02 AM To: xml-dev@ic.ac.uk Subject: RE: Namespaces and DTDs Richard L. Goerwitz wrote: > Maybe I misunderstand, but as far as I can see, namespaces won't help > you, either. Why? Because even if you can refer to, say, your two TITLE > elements by different prefixes, you'll still have to declare the prefixed > elements in the DTD as if they were atomic element names. > > Namespaces, in other words, don't solve your problem. They may make it > worse, in fact, because you have to know what prefixes you are going to > declare in a given document to be able to rewrite your DTD to work with > that document. > > There was a furor two or three months ago on this list about namespaces > breaking validation. That furor died down when the namespace spec became > an official recommendation (a done deal, in other words). You are correct. In today's environment (namespace-unaware parsers and no way to associate prefixes and URIs in the DTD), you must use the same prefixes in the DTD and the document for validation to work. I didn't state this because it was stated repeatedly during the aforementioned furor, which I sincerely hope this thread won't reignite. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Mar 9 23:06:12 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:48 2004 Subject: RDF, ID's, XPtrs, and object orientation References: <000001be6a63$d7c87de0$5118a8c0@kuantech1.quokka.com> Message-ID: <36E5A929.394204F0@locke.ccil.org> Jeffrey E. Sussna wrote: > My approach has been not to use ID attributes, but I don't have a > choice if I'm using RDF. I suppose it will work as long as I don't > validate, but I really want to validate. Actually, the values of ID and bagID attributes have to be unique within the document, but nothing says that either has to be an XML "ID attribute". (That was so in earlier drafts, but not now.) -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
From david at megginson.com Wed Mar 10 01:12:14 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:48 2004
Subject: ModSAX: Proposed Core Handlers
Message-ID: <14053.50619.147376.869177@localhost.localdomain>

My current proposal for the ModParser interface includes the following
method (ModHandler is an empty interface):

  public abstract void setHandler (String handlerID, ModHandler handler)
    throws SAXNotSupportedException;

I propose the following core handlers, with the understanding that SAX
parsers are not required to support any of them (they are free to
throw a SAXNotSupportedException):

ModSAX Core Handlers
--------------------
(All handler IDs correspond to a specific interface.)

http://xml.org/sax/handlers/lexical
  Receive callbacks for comments, CDATA sections, and (possibly)
  entity references.

http://xml.org/sax/handlers/dtd-decl
  Receive callbacks for element, attribute, and (possibly) parsed
  entity declarations.

http://xml.org/sax/handlers/namespace
  Receive callbacks for the start and end of the scope of each
  namespace declaration.

I'm not certain, but it might make sense to replace the third one with
a read-only parse-time property.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
From david at megginson.com Wed Mar 10 01:17:07 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:49 2004
Subject: ModSAX: Proposed Core Properties
Message-ID: <14053.50863.546824.628181@localhost.localdomain>

My current proposal for the ModParser interface includes the following
methods:

  public abstract void set (String propID, Object value)
    throws SAXNotSupportedException;

  public abstract Object get (String propID)
    throws SAXNotSupportedException;

Properties may be read-write, read-only, or write-only; they may also
be parse-time (may be changed during parsing) or non-parse-time (may
be changed only before a parse or between parses).

ModSAX Core Properties
----------------------
(All properties are associated with a single value type.)

http://xml.org/sax/properties/namespace-sep (write-only)
  Set the separator to be used between the URI part of a name and the
  local part of a name when namespace processing is being performed
  (see the http://xml.org/sax/features/namespaces feature). By
  default, the separator is a single space. This property may not be
  set while a parse is in progress (throws a
  SAXNotSupportedException).

http://xml.org/sax/properties/dom-node (read-only)
  Get the DOM node currently being visited, if the SAX parser is
  iterating over a DOM tree. If the parser recognises and supports
  this property but is not currently visiting a DOM node, it should
  return null (this is a good way to check for availability before the
  parse begins).

http://xml.org/sax/properties/xml-string (read-only)
  Get the literal string of characters associated with the current
  event. If the parser recognises and supports this property but is
  not currently parsing text, it should return null (this is a good
  way to check for availability before the parse begins). I stole
  this idea from Expat.

Remember that no SAX parser will be required to support any of these
-- it simply has to throw a SAXNotSupportedException if it doesn't
know about the property.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Wed Mar 10 01:17:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:49 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <14053.51113.676945.877507@localhost.localdomain>

Here's my revised version of the core feature list, based on recent
discussions:

ModSAX Core Features
--------------------

http://xml.org/sax/features/validation
  Validate (true) or don't validate (false).

http://xml.org/sax/features/external-general-entities
  Expand external general entities (true) or don't expand (false).

http://xml.org/sax/features/external-parameter-entities
  Expand external parameter entities (true) or don't expand (false).

http://xml.org/sax/features/namespaces
  Preprocess namespaces (true) or don't preprocess (false). See also
  the http://xml.org/sax/properties/namespace-sep property.

http://xml.org/sax/features/normalize-text
  Ensure that all consecutive text is returned in a single callback to
  DocumentHandler.characters or DocumentHandler.ignorableWhitespace
  (true) or explicitly do not require it (false).
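
As a rough illustration of how an application might probe the features
listed above, here is a minimal Java sketch. It assumes only the
setFeature(featureID, state) shape from the ModParser proposal. The
names ToyFeatureParser, FeatureNotSupportedException (a stand-in for
SAXNotSupportedException) and FeatureProbe are invented for this
example and are not part of any SAX distribution.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Stand-in for SAXNotSupportedException so the sketch compiles without a SAX jar.
class FeatureNotSupportedException extends Exception {
    FeatureNotSupportedException(String featureID) { super(featureID); }
}

// Hypothetical parser that recognises only the feature IDs it is constructed with.
class ToyFeatureParser {
    private final Set<String> supported = new HashSet<>();
    private final Map<String, Boolean> state = new LinkedHashMap<>();

    ToyFeatureParser(String... supportedIds) {
        for (String id : supportedIds) {
            supported.add(id);
        }
    }

    // Same shape as the proposed ModParser.setFeature: reject unknown IDs.
    void setFeature(String featureID, boolean on) throws FeatureNotSupportedException {
        if (!supported.contains(featureID)) {
            throw new FeatureNotSupportedException(featureID);
        }
        state.put(featureID, on);
    }

    boolean isOn(String featureID) {
        return Boolean.TRUE.equals(state.get(featureID));
    }
}

class FeatureProbe {
    static final String PREFIX = "http://xml.org/sax/features/";

    // Request a core feature by short name; report whether the parser accepted it.
    static boolean request(ToyFeatureParser parser, String name, boolean on) {
        try {
            parser.setFeature(PREFIX + name, on);
            return true;
        } catch (FeatureNotSupportedException e) {
            return false; // caller can fall back, e.g. normalise text itself
        }
    }
}
```

The point of the pattern is that an unsupported feature is an ordinary,
recoverable outcome: the caller asks, and degrades gracefully on refusal.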
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Wed Mar 10 01:21:08 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:49 2004
Subject: ModSAX: Proposed ModParser Interface
Message-ID: <14053.51158.347156.718466@localhost.localdomain>

Here's my current proposed interface for ModParser:

  public interface ModParser extends Parser
  {
    public abstract void setFeature (String featureID, boolean state)
      throws SAXNotSupportedException;

    public abstract void setHandler (String handlerID, ModHandler handler)
      throws SAXNotSupportedException;

    public abstract void set (String propID, Object value)
      throws SAXNotSupportedException;

    public abstract Object get (String propID)
      throws SAXNotSupportedException;
  }

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From andrew at squiz.co.nz Wed Mar 10 01:24:06 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun 7 17:09:49 2004
Subject: Namespaces and DTDs
In-Reply-To: Your message of "Tue, 09 Mar 1999 14:48:18 -0800."
Message-ID: <199903100123.OAA10814@aniwa.sky> How about having the ability to say 'process the children of this element using that dtd'. Attach DTD declarations to elements, not just to documents. It feels like some way is needed to make interpretation of XML subtrees dependent on context, hence not requiring the rewriting of XML imported into a document as a subtree from the context of a different document. (Perhaps I'm being naive. I'm new to this.) Andrew McNaughton > A simple extension to namespaces could have fixed this problem: > 1. Allow a DTD to be optionally specified along with the namespace > prefix and URI > 2. When an element is prefixed, parse it using the DTD associated with > the namespace and the given prefix as the default. > 3. If no DTD is associated with the prefix or not validating, do what > is done now (ensure element is well-formed). > > Your DTDs would not need to be changed, you would just have to > indicate which HEAD (for example) is desired in the content and add > associated DTD urls to the namespace declarations. > > Marc B McDonald > Principal Software Scientist > Design Intelligence, Inc > www.design-intelligence.com > > > ---------- > From: Ronald Bourret [SMTP:rbourret@ito.tu-darmstadt.de] > Sent: Tuesday, March 09, 1999 9:02 AM > To: xml-dev@ic.ac.uk > Subject: RE: Namespaces and DTDs > > Richard L. Goerwitz wrote: > > > Maybe I misunderstand, but as far as I can see, namespaces won't > help > > you, either. Why? Because even if you can refer to, say, your two > TITLE > > elements by different prefixes, you'll still have to declare the > prefixed > > elements in the DTD as if they were atomic element names. > > > > Namespaces, in other words, don't solve your problem. They may make > it > > worse, in fact, because you have to know what prefixes you are going > to > > declare in a given document to be able to rewrite your DTD to work > with > > that document. 
> > > > There was a furor two or three months ago on this list about > namespaces > > breaking validation. That furor died down when the namespace spec > became > > an official recommendation (a done deal, in other words). > > You are correct. In today's environment (namespace-unaware parsers > and no > way to associate prefixes and URIs in the DTD), you must use the same > prefixes in the DTD and the document for validation to work. I didn't > state this because it was stated repeatedly during the aforementioned > furor, which I sincerely hope this thread won't reignite. > > -- Ron Bourret > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > -- ----------- Andrew McNaughton andrew@squiz.co.nz http://www.newsroom.co.nz/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Wed Mar 10 01:30:01 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:09:49 2004 Subject: Architectural Forms Questions References: <36E3F30C.F6D6DB51@mitre.org> Message-ID: <36E5BA4C.7916D8@prescod.net> "Roger L. Costello" wrote: > > - How powerful is the correspondence that you can express with > Architectural Forms? Is it essentially limited to renaming and > omission? You can also map elements to attributes and attributes to elements. > - In addition to using Architectural Forms to express correspondences > that are known a priori, could you use them to document mappings that > are discovered "on-the-fly" by modifying a document or DTD after a > mapping is discovered? Yes, you can do this by modifying DTDs. Caveat: In my experience it is seldom the case that a subtype relationship can be "discovered" after the fact. It works for really loose DTDs like HTML and ICADD, but not for more complex/strict DTDs. This is very similar to the situation in software development. It is very rarely the case that you can "adapt" an existing class to a newly discovered supertype without radically changing the class or breaking existing code. > - It appears to be the case that the correspondence between A and B must > be documented in a way that keeps the mapping tightly coupled to either > A or B. Are there any plans to represent the correspondence so that it > is not tightly coupled to either A or B? You could think of this as the distinction between subtyping and transformation. Subtyping is about an inherent relationship that is discovered in advance. 
Transformation is about imposing a mapping externally, "on the fly." > - Is it a correct interpretation to say that Architectural Forms > represent correspondence by overloading existing language constructs? "Overloading" is a somewhat overloaded term. Let's say "reusing" existing language constructs. > - Given that subtyping and inheritance have been part of the primary XML > "schema" proposals, is it likely that XML Architectural Forms will be > overtaken by advances in the XML schema area? Eventually. In what time frame, I don't know. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Marc.McDonald at Design-Intelligence.com Wed Mar 10 01:31:13 1999 From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com) Date: Mon Jun 7 17:09:49 2004 Subject: Namespaces and DTDs Message-ID: Exactly. By using you would be saying process the HEAD element according to the DTD associated with the namespace prefix 'a' and consider 'a' to be the default namespace for the DTD. If there is no associated DTD, can only check HEAD is well-formed. 
Marc B McDonald Principal Software Scientist Design Intelligence, Inc www.design-intelligence.com ---------- From: Andrew McNaughton [SMTP:andrew@squiz.co.nz] Sent: Wednesday, March 10, 1999 6:23 AM To: Marc McDonald Cc: xml-dev@ic.ac.uk; rbourret@ito.tu-darmstadt.de Subject: Re: Namespaces and DTDs How about having the ability to say 'process the children of this element using that dtd'. Attach DTD declarations to elements, not just to documents. It feels like some way is needed to make interpretation of XML subtrees dependent on context, hence not requiring the rewriting of XML imported into a document as a subtree from the context of a different document. (Perhaps I'm being naive. I'm new to this.) Andrew McNaughton > A simple extension to namespaces could have fixed this problem: > 1. Allow a DTD to be optionally specified along with the namespace > prefix and URI > 2. When an element is prefixed, parse it using the DTD associated with > the namespace and the given prefix as the default. > 3. If no DTD is associated with the prefix or not validating, do what > is done now (ensure element is well-formed). > > Your DTDs would not need to be changed, you would just have to > indicate which HEAD (for example) is desired in the content and add > associated DTD urls to the namespace declarations. > > Marc B McDonald > Principal Software Scientist > Design Intelligence, Inc > www.design-intelligence.com > > > ---------- > From: Ronald Bourret [SMTP:rbourret@ito.tu-darmstadt.de] > Sent: Tuesday, March 09, 1999 9:02 AM > To: xml-dev@ic.ac.uk > Subject: RE: Namespaces and DTDs > > Richard L. Goerwitz wrote: > > > Maybe I misunderstand, but as far as I can see, namespaces won't > help > > you, either. Why? Because even if you can refer to, say, your two > TITLE > > elements by different prefixes, you'll still have to declare the > prefixed > > elements in the DTD as if they were atomic element names. > > > > Namespaces, in other words, don't solve your problem. 
They may make > it > > worse, in fact, because you have to know what prefixes you are going > to > > declare in a given document to be able to rewrite your DTD to work > with > > that document. > > > > There was a furor two or three months ago on this list about > namespaces > > breaking validation. That furor died down when the namespace spec > became > > an official recommendation (a done deal, in other words). > > You are correct. In today's environment (namespace-unaware parsers > and no > way to associate prefixes and URIs in the DTD), you must use the same > prefixes in the DTD and the document for validation to work. I didn't > state this because it was stated repeatedly during the aforementioned > furor, which I sincerely hope this thread won't reignite. > > -- Ron Bourret > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > -- ----------- Andrew McNaughton andrew@squiz.co.nz http://www.newsroom.co.nz/ xml-dev: A list for W3C XML Developers. 
From MikeDacon at aol.com Wed Mar 10 02:40:06 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:49 2004
Subject: ModSAX: Proposed Core Properties
Message-ID:

Hi David,

In a message dated 3/9/99 8:30:31 PM Eastern Standard Time,
david@megginson.com writes:

> http://xml.org/sax/properties/dom-node (read-only)
> Get the DOM node currently being visited, if the SAX parser is
> iterating over a DOM tree. If the parser recognises and supports
> this property but is not currently visiting a DOM node, it should
> return null (this is a good way to check for availability before the
> parse begins).
>

This has made me realize that I was under a misconception about
what the generic get() and set() parser properties would provide in
terms of functionality. What I was really hoping for was:

  org.w3c.dom.Document parse(InputSource is, boolean events)
    throws SAXException;
  org.w3c.dom.Document parse(java.lang.String uri, boolean events)
    throws SAXException;
  /* the events boolean would be to turn on/off event calls. */

Which would allow me to code:

  try
  {
    ModParser mp = ParserFactory.makeModParser();
    boolean supported = true;
    try
    {
      mp.setFeature("http://xml.org/sax/features/dom-result", true);
    } catch (SAXNotSupportedException snse)
    {
      supported = false;
    }

    if (supported)
    {
      Document d = mp.parse("test.xml", false);
      // ... process Document
    }
  } catch (SAXException se)
  {
    // handle it
  }

So, what I'm saying is that I would like to be able to choose whether
to interface to the Parser via events or via a DOM. If you agree with
this, I believe using the return type is more appropriate than getting
a resultant property (as I suggest next).

If for some reason the above is not palatable, the same could be
accomplished under the current scheme if we added a property:

  http://xml.org/sax/properties/dom-document (read-only)

Then I could code:

  try
  {
    ModParser mp = ParserFactory.makeModParser();
    boolean supported = true;
    try
    {
      mp.setFeature("http://xml.org/sax/features/dom-capable", true);
    } catch (SAXNotSupportedException snse)
    {
      supported = false;
    }

    if (supported)
    {
      mp.parse("test.xml");
      Document d = (Document)
        mp.get("http://xml.org/sax/properties/dom-document");
      // ... process Document
    }
  } catch (SAXException se)
  {
    // handle it
  }

Note: both code examples also require an added feature to check for
the desired functionality.

I believe the above is sorely missing from the current API. Does
anyone else see a need for this? If not, why not? But before you say,
"build a layer on top of SAX" -- to me that seems ridiculous when most
of the Parser implementations can produce a dom Document.

Best wishes,

- Mike (mdaconta@aol.com)

From b.laforge at jxml.com Wed Mar 10 04:23:41 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:49 2004
Subject: ModSAX: Proposed Core Properties
Message-ID: <013601be6aac$f96ad580$c9a8a8c0@thing2>

From: MikeDacon@aol.com

>This has made me realize that I was under a misconception about
>what the generic get() and set() parser properties would provide in
>terms of functionality. What I was really hoping for was:
>
>org.w3c.dom.Document parse(InputSource is, boolean events) throws
>SAXException;
>org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws
>SAXException;
>/* the events boolean would be to turn on/off event calls. */

I think you have this capability without the extra parameter, since
you don't get events unless you register a handler to receive them.

Bill

From GAjitK at dbss.com Wed Mar 10 08:38:28 1999
From: GAjitK at dbss.com (George, Ajit Kumar (CTS))
Date: Mon Jun 7 17:09:49 2004
Subject: No subject
Message-ID: <0B9BF5AE8A3ED21196980060B0B54551870EFF@CTSINENTSXUA>

Hi,

I am new to XML and Java. I am trying to display an XML document in a
tree structure using the XML parser classes from IBM xml4j 2.0.0. I am
able to get to the elements, but how do I get the text content out of
the element?

So I do have a NodeList and I am able to iterate through it, but I am
not able to figure out a way to get the content information out of it.

I would appreciate any help with this. I will not be using the
Microsoft parser classes.

regards
Ajit
GAjitK@dbss.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From leich at wiwi.uni-marburg.de Wed Mar 10 09:04:02 1999 From: leich at wiwi.uni-marburg.de (Steffen Leich) Date: Mon Jun 7 17:09:49 2004 Subject: your mail In-Reply-To: <0B9BF5AE8A3ED21196980060B0B54551870EFF@CTSINENTSXUA> Message-ID: On Wed, 10 Mar 1999, George, Ajit Kumar (CTS) wrote: > Hi, > > I am new to the XML and Java. I am trying to display a XML document in a > tree structure using > XML parser classes from IBM xml4j 2.0.0. I am able to get to the elements, > but how do I > get the text content out of the element > > So I do have a NodeList and I am able to iterate through it, but I am not > able to figure out a > way to get the content information out of it. > > I could appreciate any help in this. I will not be using Microsoft parser > classes. > Hi, check out the following URLs: http://www.software.ibm.com/xml/education/buildappl/xml_to_html.html http://www.alphaworks.ibm.com/forum/xmlforjava.nsf/discussion_vert (Discussion of and Links to Tutorials) http://developerlife.com/xmljavatutorial1 Steffen ___________________________________________________ Steffen Leich Phone: +49-6421-283144 leich@wiwi.uni-marburg.de Universitaet Marburg Informations- und Kommunikationsdienste FB 02 xml-dev: A list for W3C XML Developers. 
From rbourret at ito.tu-darmstadt.de Wed Mar 10 09:16:04 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:09:49 2004
Subject: Namespaces and DTDs
Message-ID: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de>

james anderson wrote:

> ? which of the "namespace aware" parsers will permit you to parse validate a
> document for which portions of the dtd contain element declarations with
> ambiguous names - without first modifying the dtd? i've yet to hear a solution
> to the "ambiguous name" problem for xml-1.0/+ns conforming parsers.

Good point -- it was unfair of me to blame the parsers here. It all
seems rather obvious now:

Q. Why were namespaces invented?
A. To disambiguate duplicate names.

Q. I have a DTD with duplicate names. How do I disambiguate them?
A. Use namespaces.

The only inobvious bit is that, because there is no way to declare
namespaces in the DTD, you can't declare different default namespaces
for different parts of the DTD, which would have solved Elliotte's
problem rather neatly.

-- Ron Bourret

xml-dev: A list for W3C XML Developers.
From digitome at iol.ie Wed Mar 10 09:19:07 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 17:09:49 2004
Subject: Architectural Forms Questions
In-Reply-To: <36E5BA4C.7916D8@prescod.net>
References: <36E3F30C.F6D6DB51@mitre.org>
Message-ID: <3.0.6.32.19990310090702.0097ce90@gpo.iol.ie>

>"Roger L. Costello" wrote:
> - Given that subtyping and inheritance have been part of the primary XML
> "schema" proposals, is it likely that XML Architectural Forms will be
> overtaken by advances in the XML schema area?
>

I believe and hope this is true. The mapping that AFs enable is too
limiting in my experience. Case in point: at XML 98 in Chicago the GCA
issued a DTD for paper submissions. I wrote a paper for that
conference using XML. Along comes XML Europe 99 with a variation on
the DTD for paper submissions. Even this mapping between two DTDs from
the same broad organization, in the same ballpark of document types,
cannot be done with AFs. At least not with my cerebral cortex.

From l-arcini at uniandes.edu.co Wed Mar 10 10:02:51 1999
From: l-arcini at uniandes.edu.co (Fabio Arciniegas A.)
Date: Mon Jun 7 17:09:50 2004
Subject: Req:Music DTD(?)
Message-ID: <36E644E5.6E728D40@uniandes.edu.co>

Hello to all,

I'm currently working on an XML-based sequencer, and I would like to
see some music notation DTDs before I start to write my own. I've
searched the web high and low with no luck so far, so I was wondering
whether any of you have any pointers I could use.

Thanks in advance

Fabio

--
Fabio Arciniegas A.
Ingenieria de Sistemas
Uniandes

From reschke at medicaldataservice.de Wed Mar 10 10:23:53 1999
From: reschke at medicaldataservice.de (Julian Reschke)
Date: Mon Jun 7 17:09:50 2004
Subject: XML query engines
Message-ID: <000d01be6ae0$8a0da080$2e00a8c0@julian>

At Sun, 31 Jan 1999 16:53:32 -0800, Tim Bray (tbray@textuality.com) wrote:

>At 08:24 PM 1/31/99 +73900, John Cowan wrote:
>>Assign a sequentially increasing number to each *tag* (start-tag or end-tag)
>>in the document, treating an empty tag as a start-tag followed by an
>>end-tag. Then e1 is a descendant of e2 iff e1.start > e2.start
>>and e1.end < e2.end. Also, e1 is a left sibling of e2 (and e2 is
>>a right sibling of e1) iff e1.end + 1 = e2.start; e1 is the leftmost
>>child of e2 iff e1.start = e2.start + 1. Modeling the child/parent
>>relationship is not so easy, and requires iteration.
>
>This structure has all sorts of advantages; that's how the
>Open Text SGML-savvy search engine of yore used to run. Fast as
>ell, equal access to any & all elements without performance
>penalty.
>
>
>But hard to update.

Is there an easy way to apply this model to a MSXML.DLL DOM object?
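
The numbering scheme quoted above can be sketched independently of any
particular DOM. The Java below is only an illustration and is not an
answer for MSXML: it numbers tags from SAX events using the parser
bundled with the JDK (an assumption; any event source that reports
start and end tags would do). TagSpan and TagNumberer are names made
up for this example, and spans are keyed by element name, so the
sketch assumes each element name occurs only once in the document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// The (start, end) tag numbers of one element, with the quoted relations.
class TagSpan {
    final int start;
    int end;

    TagSpan(int start) { this.start = start; }

    boolean isDescendantOf(TagSpan other) { return start > other.start && end < other.end; }
    boolean isLeftSiblingOf(TagSpan other) { return end + 1 == other.start; }
    boolean isLeftmostChildOf(TagSpan other) { return start == other.start + 1; }
}

// Assigns 1, 2, 3, ... to every start-tag and end-tag in document order.
class TagNumberer extends DefaultHandler {
    private int counter = 0;
    private final Deque<TagSpan> open = new ArrayDeque<>();
    final Map<String, TagSpan> byName = new HashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        TagSpan span = new TagSpan(++counter);
        open.push(span);
        byName.put(qName, span); // assumes unique element names (see lead-in)
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // An empty element still yields both events, so it gets two numbers.
        open.pop().end = ++counter;
    }

    static Map<String, TagSpan> number(String xml) {
        TagNumberer handler = new TagNumberer();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.byName;
    }
}
```

Because the numbers come from a single pass over the tags, they are the
same on every parse of the same document, which is the stability
question raised about the MSXML identifiers.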
Microsoft's documentation (uniqueID Method, elementIndexList Method) is
not very clear about how these IDs are generated, and whether they
remain the same across two separate parser invocations on the same XML
data...

--
Julian Reschke
MedicalData Service GmbH (http://www.medicaldataservice.de)

From James.Anderson at mecomnet.de Wed Mar 10 10:34:43 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:09:50 2004
Subject: Namespaces and DTDs
References:
Message-ID: <36E64E4F.83649621@mecomnet.de>

That "REC-xml-names-19990114" does not provide any means to establish
prefix<->uri bindings for a DTD has long been a point of contention. A
cursory search of the archives will bear this out.

The decision to eliminate the combined prefix/uri/dtd binding (the
original pi form) was, however, correct, as the pi form, at least as
proposed in "WD-xml-names-19980327", would not have been sufficient to
handle such things as a dtd which needs multiple prefix bindings or
the situation where a given prefix<->uri binding is to apply to
multiple schema sources.

While it is true that some mechanism is necessary, a form - as
discussed below - which effected a singular binding would also not
have solved the problem.

"Everyone" would seem to be waiting for "schemas"....

Marc.McDonald@Design-Intelligence.com wrote:
>
> A simple extension to namespaces could have fixed this problem:
> 1. Allow a DTD to be optionally specified along with the namespace
> prefix and URI
> 2.
When an element is prefixed, parse it using the DTD associated with > the namespace and the given prefix as the default. > 3. If no DTD is associated with the prefix or not validating, do what > is done now (ensure element is well-formed). > From david at megginson.com Wed Mar 10 11:30:32 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Properties In-Reply-To: References: Message-ID: <14054.21854.948934.185758@localhost.localdomain> MikeDacon@aol.com writes: > So, what I'm saying is that I would like to be able to choose > whether to interface to the Parser via events or via a DOM. If you > agree with this, I believe using the return type is more > appropriate than getting a resultant property (as I suggest next). This is easy enough to build on top of SAX, but I think that it's probably out of scope for SAX itself. SAX is meant to be a relatively simple, low-level layer that people can build on. > If for some reason the above is not palatable, the same could be > accomplished under the current scheme if we added a > property: > > http://xml.org/sax/properties/dom-document (read-only) The nice thing about ModSAX is that you're free to try this yourself -- just define a property like http://www.aol.com/mdaconta/props/dom-document (or whatever URL you can use based on your AOL account) and let the market decide whether to support it. Perhaps one of the people who has written a higher-level utility package that supports both SAX and DOM would like to use this or something like it.
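As a rough illustration of the "define your own property and build it on top of SAX" idea, here is a sketch in Python. It is not the ModSAX interface under discussion: it leans on the getProperty method that Python's xml.sax readers and XMLFilterBase happen to provide, the class name DomBuildingFilter is invented, and the property URI under example.org is a placeholder. The filter passes events through unchanged while building a minidom tree on the side.

```python
import io
import xml.sax
from xml.dom.minidom import Document
from xml.sax.saxutils import XMLFilterBase

# Hypothetical property name, chosen only for this sketch.
DOM_DOCUMENT_PROPERTY = "http://example.org/props/dom-document"


class DomBuildingFilter(XMLFilterBase):
    """Forwards SAX events downstream and builds a DOM tree as a side effect."""

    def __init__(self, parent):
        super().__init__(parent)
        self._doc = None
        self._stack = []

    def startDocument(self):
        self._doc = Document()
        self._stack = [self._doc]
        super().startDocument()

    def startElement(self, name, attrs):
        element = self._doc.createElement(name)
        for key, value in attrs.items():
            element.setAttribute(key, value)
        self._stack[-1].appendChild(element)
        self._stack.append(element)
        super().startElement(name, attrs)

    def characters(self, content):
        self._stack[-1].appendChild(self._doc.createTextNode(content))
        super().characters(content)

    def endElement(self, name):
        self._stack.pop()
        super().endElement(name)

    def getProperty(self, name):
        # Answer the custom property here; defer everything else to the parent.
        if name == DOM_DOCUMENT_PROPERTY:
            return self._doc
        return super().getProperty(name)


reader = DomBuildingFilter(xml.sax.make_parser())
reader.parse(io.BytesIO(b"<a x='1'><b>hi</b></a>"))
document = reader.getProperty(DOM_DOCUMENT_PROPERTY)
print(document.documentElement.tagName)
```

Because the filter only adds a property and forwards every event, an application that registers its own content handler still gets the ordinary event stream.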
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From larsga at ifi.uio.no Wed Mar 10 12:05:42 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:09:50 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: <36E4C4E6.B51DDFF3@eng.sun.com> References: <14051.3215.196642.22571@localhost.localdomain> <36E4C4E6.B51DDFF3@eng.sun.com> Message-ID: * David Megginson | | http://xml.org/sax/features/normalize-text * David Brownell | | This is a good filter feature, I think. I agree. | Lars suggested a "Catalog" feature. There are different sorts of | catalog, and they need configuration, so the value of this could be | a URI for the catalog, not just a boolean. There should be a catalog parameter as well, but the reason I proposed this as a feature rather than just as a parameter is that SP and xmlproc both allow you to use environment variables to point to a default catalog file, which is rather handy. So it would definitely be useful to be able to tell the parser, go read the default catalog, wherever it is. (Or don't.) Java parsers could use a Java property to achieve the same thing. BTW: I'm surprised that David Megginson hasn't replied to this. David, some kind of confirmation that you've at least seen this would be welcome. (I know majordomo isn't 100% trustworthy, so it might have disappeared on the way.) | Plus, this would seem to be up to the "EntityResolver" to handle | ... yes? Sort of. You could make a parser filter that used an entity resolver to do this in general.
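To make the entity-resolver approach concrete, here is a small sketch in Python. It assumes the catalog is nothing more than an in-memory mapping from public identifiers to local system identifiers; it does not read any real catalog file format or environment variable, and the class name CatalogResolver and the sample identifiers are invented. The optional fallback argument is one possible way for a catalog resolver and a custom one to cooperate, not a settled answer.

```python
from xml.sax.handler import EntityResolver


class CatalogResolver(EntityResolver):
    """Maps public identifiers to local system identifiers."""

    def __init__(self, catalog, fallback=None):
        self.catalog = dict(catalog)   # public id -> replacement system id
        self.fallback = fallback       # optional second EntityResolver

    def resolveEntity(self, publicId, systemId):
        # 1. A catalog entry wins when the public id is known.
        if publicId in self.catalog:
            return self.catalog[publicId]
        # 2. Otherwise give a custom resolver the chance to decide.
        if self.fallback is not None:
            return self.fallback.resolveEntity(publicId, systemId)
        # 3. Otherwise keep the system id from the document.
        return systemId


resolver = CatalogResolver({"-//EXAMPLE//DTD Demo//EN": "file:///dtds/demo.dtd"})
print(resolver.resolveEntity("-//EXAMPLE//DTD Demo//EN", "http://example.org/demo.dtd"))
```

With Python's xml.sax, a resolver like this would be attached with the reader's setEntityResolver method before parsing.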
xmlproc has an internal PubIdResolver interface which it uses for this (and which is also exposed as the EntityResolver when using SAX). | It'd perhaps suggest that one could ask the next filter in the | stream for the resolver it was using ... :-) Hmmm. This is actually potentially troubling, since one would need to specify how a catalog EntityResolver and a custom one should work together when both are specified. --Lars M. From MikeDacon at aol.com Wed Mar 10 12:21:56 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Properties Message-ID: <4449a8bc.36e66305@aol.com> Hi Bill, In a message dated 3/9/99 11:37:51 PM Eastern Standard Time, b.laforge@jxml.com writes: > >org.w3c.dom.Document parse(InputSource is, boolean events) throws > >SAXException; > >org.w3c.dom.Document parse(java.lang.String uri, boolean events) throws > >SAXException; > >/* the events boolean would be to turn on/off event calls. */ > > > I think you have this capability without the extra parameter, since you don't > get events unless you register a handler to receive them. > Since there is already a parse(InputSource) and parse(String) method in the interface, in order to overload it we need a second parameter. The events parameter was the first one that came to mind; there may be a better one. Best wishes, - Mike Mike Daconta (www.gosynergy.com)
From richard at goon.stg.brown.edu Wed Mar 10 13:16:35 1999 From: richard at goon.stg.brown.edu (Richard Goerwitz) Date: Mon Jun 7 17:09:50 2004 Subject: Namespaces and DTDs References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> Message-ID: <36E6704E.A13B3890@goon.stg.brown.edu> Ronald Bourret wrote: > The only inobvious bit is that, because there is no way to declare > namespaces in the DTD, you can't declare different default namespaces > for different parts of the DTD Because the DTD is not namespace aware, all it can deal with are the prefixes you declare (not the URLs associated with them). Since these prefixes are declared in the document content, you end up with a peculiar situation in which the DTD has to be written according to declarations in a given document instance, rather than the reverse. Worse yet, there is no way to be sure that the various documents being validated against a particular DTD use the prefixes correctly, with the correct URLs, unless you make extensive use of attribute defaults - which, ironically, means we now need the DTD (probably an external one, typically with a bunch of parameter entities; so get your validating parser ready). After another year or two of this, with alternate schemas floating around besides DTDs, with architectural forms, with namespaces, and what not - after all of this, I wonder if we'll all, in good conscience, be able to say that anything has been simplified. (Simplicity _was_ one of XML's primary goals back in the dark ages last February.)
In reality, XML is functioning less like a "simplification," and more like a political move intended to facilitate changes that could never have been made to a mature standard like SGML. This is actually a very old story that's been repeated many times over. (Just look at what's happened to LDAP. By the time we get all the PKI and ACL extensions in place, it's really not going to be very L.) In the end, LDAP and XML may end up serving their constituencies better than their predecessors did. Or they may not. Frankly, with regard to XML, the jury is still out. It's not catching on nearly as fast as predicted a year or two ago. And it's taking considerably more work to implement it than anybody ever envisioned. Those of us who have done the work of writing XML processing software, and of making it work, have a right to say this. The emperor may or may not have clothes. -- Richard Goerwitz PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0 For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu From andrew at squiz.co.nz Wed Mar 10 13:36:47 1999 From: andrew at squiz.co.nz (Andrew McNaughton) Date: Mon Jun 7 17:09:50 2004 Subject: Req:Music DTD(?) In-Reply-To: Your message of "Wed, 10 Mar 1999 05:09:41 CDT." <36E644E5.6E728D40@uniandes.edu.co> Message-ID: <199903101333.CAA06775@aniwa.sky> > Hello to all, > I'm currently working on an XML-based sequencer, and I would like to see > some music notation DTDs before I start to write my own. I've searched > the web high and low...
no luck so far, so I was wondering if any of > you guys have any pointers I could use. You need a new search engine. I've recently been using www.google.com with results an order of magnitude better than what I got from altavista (though altavista still has its place for more complex query definitions). Try this url: http://www.googlebot.com/search?q=music+dtd Andrew McNaughton Disclaimer: I have nothing to do with google.com, I'm just impressed by their service -- ----------- Andrew McNaughton andrew@squiz.co.nz http://www.newsroom.co.nz/ From James.Anderson at mecomnet.de Wed Mar 10 14:24:01 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:09:50 2004 Subject: Namespaces and DTDs References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> Message-ID: <36E683F7.429E4B25@mecomnet.de> yes; agreement on all points. mr. harold is not the only one who would have benefitted. the only aspect of which i can comprehend, is the claim, that, being able to bind the prefixes over a dtd would have broken the rule that namespaces should not "change the validity of a given document". which claim is true, but which i believe to be fundamentally misdirected. it's an old argument. Ronald Bourret wrote: > > james anderson wrote: > > > ? which of the "namespace aware" parsers will permit you to parse > validate a > > document for which portions of the dtd contain element declarations with > > ambiguous names - without first modifying the dtd? i've yet to hear a > solution > > to the "ambiguous name" problem for xml-1.0/+ns conforming parsers.
> > Good point -- it was unfair of me to blame the parsers here. It all seems > rather obvious now: > > Q. Why were namespaces invented? > A. To disambiguate duplicate names. > > Q. I have a DTD with duplicate names. How do I disambiguate them? > A. Use namespaces. > > The only inobvious bit is that, because there is no way to declare > namespaces in the DTD, you can't declare different default namespaces for > different parts of the DTD, which would have solved Elliotte's problem > rather neatly. From david at megginson.com Wed Mar 10 14:52:36 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:50 2004 Subject: SAX RFD: ModSAX Predefined Features In-Reply-To: References: <14051.3215.196642.22571@localhost.localdomain> <36E4C4E6.B51DDFF3@eng.sun.com> Message-ID: <14054.34184.693965.347827@localhost.localdomain> Lars Marius Garshol writes: > | Lars suggested a "Catalog" feature. There are different sorts of > | catalog, and they need configuration, so the value of this could be > | a URI for the catalog, not just a boolean. > > There should be a catalog parameter as well, but the reason I proposed > this as a feature rather than just as a parameter is that SP and > xmlproc both allow you to use environment variables to point to a > default catalog file, which is rather handy. > > So it would definitely be useful to be able to tell the parser, go > read the default catalog, wherever it is. (Or don't.) Java parsers > could use a Java property to achieve the same thing. > > BTW: I'm surprised that David Megginson hasn't replied to this.
> David, some kind of confirmation that you've at least seen this > would be welcome. (I know majordomo isn't 100% trustworthy, so it > might have disappeared on the way.) Please don't be surprised -- depending on how new a suggestion is, sometimes I like to sit back and hear different people's opinions for a few hours or a few days before blurting out my own. On this topic, I'm a little uncomfortable putting in a core feature for catalogues when XML catalogue formats haven't settled yet (likewise, I don't include a feature for data typing, though some kind of data typing will undoubtedly arrive before long). It would probably make more sense for the promoters of different catalogue formats to define their own properties and/or features, such as http://www.oasis.org/sax/features/entity-catalog That way, we won't have any unpleasant surprises when a user expects a parser to use one type of catalogue and the parser finds another instead. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From tbray at textuality.com Wed Mar 10 15:07:12 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Features Message-ID: <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> At 08:16 PM 3/9/99 -0500, David Megginson wrote: >Here's my revised version of the core feature list, based on recent >discussions: This seems to be converging nicely. Any chance of losing the ugly "Mod" prefix? -Tim
From david at megginson.com Wed Mar 10 15:09:58 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Features In-Reply-To: <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> References: <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> Message-ID: <14054.35485.843066.25717@localhost.localdomain> Tim Bray writes: > At 08:16 PM 3/9/99 -0500, David Megginson wrote: > >Here's my revised version of the core feature list, based on recent > >discussions: > > This seems to be converging nicely. Any chance of losing the > ugly "Mod" prefix? -Tim Yeah, no one seems to like it but me. Any other suggestions? I don't like Parser2 or things like that, because I want to emphasise that this is an add-on to SAX 1.0 rather than an upgrade. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From mgoulde at psgroup.com Wed Mar 10 16:24:26 1999 From: mgoulde at psgroup.com (Michael Goulde) Date: Mon Jun 7 17:09:50 2004 Subject: Music DTD(?)
Message-ID: <71A71A050B7BD111838300805F579504926776@psgroup.com> Check out: http://www.tcf.nl/3.0/musicml/index.html Michael Goulde Executive Vice President Research and Services Patricia Seybold Group 85 Devonshire St., 5th Floor Boston, MA 02109 Tel: 617 742-5200 Order "Customers.com" by Patricia Seybold with Ronni Marshak today from Amazon.com -----Original Message----- From: Fabio Arciniegas A. [mailto:l-arcini@uniandes.edu.co] Sent: Wednesday, March 10, 1999 5:10 AM To: XML Mailing List Subject: Req:Music DTD(?) Hello to all, I'm currently working on an XML-based sequencer, and I would like to see some music notation DTDs before I start to write my own. I've searched the web high and low... no luck so far, so I was wondering if any of you guys have any pointers I could use. Thanks in advance Fabio -- Fabio Arciniegas A. Ingenieria de Sistemas Uniandes
From James.Anderson at mecomnet.de Wed Mar 10 16:26:54 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:09:50 2004 Subject: Namespaces and DTDs References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E6704E.A13B3890@goon.stg.brown.edu> Message-ID: <36E6A0D0.4C4DA307@mecomnet.de> all of which presumes that you've elevated prefixes to the status of uri's - attribute defaults or not. Richard Goerwitz wrote: > > Ronald Bourret wrote: > > > The only inobvious bit is that, because there is no way to declare > > namespaces in the DTD, you can't declare different default namespaces > > for different parts of the DTD > > Because the DTD is not namespace aware, all it can deal with are the > prefixes you declare (not the URLs associated with them). Since these > prefixes are declared in the document content, you end up with a peculiar > situation in which the DTD has to be written according to declarations > in a given document instance, rather than the reverse. Worse yet, there > is no way to be sure that the various documents being validated against > a particular DTD use the prefixes correctly, with the correct URLs, > unless you make extensive use of attribute defaults - which, ironically, > means we now need the DTD (probably an external one, typically with a > bunch of parameter entities; so get your validating parser ready).
From rbourret at ito.tu-darmstadt.de Wed Mar 10 16:57:12 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:50 2004 Subject: ModSAX: Proposed Core Features Message-ID: <01BE6B1F.58595400@grappa.ito.tu-darmstadt.de> David Megginson writes: > Yeah, no one seems to like it but me. Any other suggestions? I don't > like Parser2 or things like that, because I want to emphasise that > this is an add-on to SAX 1.0 rather than an upgrade. It's a bit long, but how about ExtendedParser? (Actually, I'm rather fond of Parser2 because it gives us a clear path should this be extended in the future.) -- Ron Bourret From lloyd at digitaljam.com Wed Mar 10 16:58:30 1999 From: lloyd at digitaljam.com (Lloyd Harding) Date: Mon Jun 7 17:09:51 2004 Subject: SAX RFD: ModSAX Predefined Features Message-ID: <36E68B39.DF535797@digitaljam.com> Lars Marius Garshol wrote: > > * Bill la Forge > > | So that's why I'm butting in here. I think an open standards process > | is important for individuals and small companies. We need to do what > | we can to keep the ball rolling here. > > We are certainly in heartfelt agreement here.
:) David Brownell wrote: Gee, as a wage-slave working for a big company, I hope that I'm not _too_ excluded from the discussions ... :-) Seriously: my personal model is a lot more akin to the original IETF style "running code and working consensus" model than most existing standards bodies. I'm a lot happier with standards that come from such a process than from ones that involve fat specs that can't be implemented. Writing code is generally more fun than specs -- though an elegant spec is also a work of art! - - Dave Standards processes require effort, and in all cases the effort is primarily provided by individuals from large companies. Small companies do not have the resources to put into standards efforts. Voting members make the difference, and they are typically not small company employees. That is not to say standards bodies do not have methods for non-voting input. They all do. There are as many de facto standards that have failed as there are planned standards that have failed. There are as many de facto standards that have succeeded as there are planned standards that have succeeded. To claim one is better than another without details is not sufficient. Personal perception might be based on the differences in methods for receiving input or differences in the scope or differences in personal preference regarding process. But claiming one is better than the other based on failure/success rate requires more detail regarding definitions of failure/success and analysis of history to be convincing. I believe the issue is not so much which method is best but rather WHEN method A is better than method B. Implementation first versus specification first is similar to deduction versus induction. Both have their places; the question is when. lloyd -- ---------------------------------------------------------------- Lloyd Harding lloyd@infoauto.com ---------------------------------------------------------------- Information Assembly Automation Inc.
http://www.infoauto.com SGML/XML Services for the Publishing and Medical Community Architectural Design, DTD Creation, Editorial System Development ---------------------------------------------------------------- From b.laforge at jxml.com Wed Mar 10 17:21:14 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features Message-ID: <002b01be6b1a$eb1fbcc0$c8a8a8c0@thing1> OK, Dave, you asked for it. As an add on, you have made the SAX parser much more eXtensible. As if we didn't have enough X's... XParser Bill -----Original Message----- From: David Megginson To: XML Developers' List Date: Wednesday, March 10, 1999 12:00 PM Subject: Re: ModSAX: Proposed Core Features >Tim Bray writes: > > > At 08:16 PM 3/9/99 -0500, David Megginson wrote: > > >Here's my revised version of the core feature list, based on recent > > >discussions: > > > > This seems to be converging nicely. Any chance of losing the > > ugly "Mod" prefix? -Tim > >Yeah, no one seems to like it but me. Any other suggestions? I don't >like Parser2 or things like that, because I want to emphasise that >this is an add-on to SAX 1.0 rather than an upgrade. > > >All the best, > > >David > >-- >David Megginson david@megginson.com > http://www.megginson.com/ >
From b.laforge at jxml.com Wed Mar 10 17:26:32 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features Message-ID: <003001be6b1b$c4915360$c8a8a8c0@thing1> On a more serious note, I think we need a new ParserFactory... ModParserFactory? XParserFactory? It should use ParserFactory to create a Parser and then check to see if the new extension is supported. If not, it proceeds to wrap the parser so that it looks like a ModParser. Note that this compatibility wrapper will effectively be a filter. Bill
From Patrice.Bonhomme at loria.fr Wed Mar 10 17:45:10 1999 From: Patrice.Bonhomme at loria.fr (Patrice Bonhomme) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features In-Reply-To: Your message of "Wed, 10 Mar 1999 07:09:59 PST." <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> Message-ID: <199903101744.SAA01077@chimay.loria.fr> tbray@textuality.com said: ] This seems to be converging nicely. Any chance of losing the ugly ] "Mod" prefix? -Tim Why not XSAX for eXtended SAX ? Pat. -- ============================================================== bonhomme@loria.fr | Office : B.228 http://www.loria.fr/~bonhomme | Phone : 03 83 59 30 52 -------------------------------------------------------------- * Serveur Silfide : http://www.loria.fr/projets/Silfide ============================================================== From rja at arpsolutions.demon.co.uk Wed Mar 10 17:50:54 1999 From: rja at arpsolutions.demon.co.uk (Richard Anderson) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features Message-ID: <01b001be6b1e$938bdbc0$c5010180@p197> >Why not XSAX for eXtended SAX ? "E-SAX" would be less confusing.
-----Original Message----- From: Patrice Bonhomme To: XML Developers' List Date: 10 March 1999 17:48 Subject: Re: ModSAX: Proposed Core Features > >tbray@textuality.com said: >] This seems to be converging nicely. Any chance of losing the ugly >] "Mod" prefix? -Tim > >Why not XSAX for eXtended SAX ? > >Pat. > >-- > ============================================================== > bonhomme@loria.fr | Office : B.228 > http://www.loria.fr/~bonhomme | Phone : 03 83 59 30 52 > -------------------------------------------------------------- > * Serveur Silfide : http://www.loria.fr/projets/Silfide > ============================================================== From luke at javagroup.org Wed Mar 10 18:41:53 1999 From: luke at javagroup.org (Luke Gorrie) Date: Mon Jun 7 17:09:51 2004 Subject: Generating typed code from DTDs, why not? Message-ID: Hi all, I'm pretty new to XML, but as I've poked around I've observed what seem to be some strange things.
XML parsers all seem to provide interfaces which ignore the static structure information provided by DTDs and rely on "one fits all" interfaces to elements, in stark contrast to the conventions of statically typed languages. For instance, the first thing I played with in XML was SAX using Python. I was impressed by how easily it worked and how naturally it fit in with a dynamically typed language like Python. Then I had a look at the Java interface and found that it was just the same, which I thought very odd! The natural mapping for SAX onto Java, to get the (significant) benefits of static typing, would be to generate a Visitor interface. The Visitor interface would have a method for "visiting" each type of element in the document, and the argument to this method would be an object which presents the element contents through typed accessor methods. At least, that's how it looks to me. In the case of DOM, again generating typed accessor code would provide these great benefits. People could use a DTD (or similar) as the definition language for their abstract data types, and generate DOM-compliant classes which they can both use "natively" in their language and also manipulate as part of a genuine DOM tree at the same time. It seems like these methods which ignore the wealth of static structure information available will begin to show serious problems if they try to scale to the features proposed in some specifications like SOX, where more fine grained relationships and constraints can be expressed. So, my question is: are there any efforts around working towards creating mappings from DTD or other XML type definition languages to various programming languages (or to other IDLs like OMG's), or is there some reason why this is considered a bad idea?
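To give the Visitor idea some shape, here is a hand-written sketch in Python of the kind of code a generator might emit. Everything in it is hypothetical: the two-element vocabulary (note and rest), the class names, and the dispatch layer are invented for illustration and are not produced from a real DTD. Python's type hints are not checked at run time, so this shows only the structure; the compile-time checking the proposal is after would come from a language like Java.

```python
import xml.sax
from dataclasses import dataclass


# Typed views of two hypothetical element types.
@dataclass
class Note:
    pitch: str
    duration: int


@dataclass
class Rest:
    duration: int


class ScoreVisitor:
    """One visit method per element type; subclasses override what they need."""

    def visit_note(self, note: Note) -> None:
        pass

    def visit_rest(self, rest: Rest) -> None:
        pass


class TypedDispatcher(xml.sax.ContentHandler):
    """Turns generic SAX start events into typed visitor calls."""

    def __init__(self, visitor: ScoreVisitor):
        super().__init__()
        self.visitor = visitor

    def startElement(self, name, attrs):
        if name == "note":
            self.visitor.visit_note(Note(attrs["pitch"], int(attrs["duration"])))
        elif name == "rest":
            self.visitor.visit_rest(Rest(int(attrs["duration"])))


class TotalDuration(ScoreVisitor):
    """Example application code: never touches raw attribute strings."""

    def __init__(self):
        self.total = 0

    def visit_note(self, note: Note) -> None:
        self.total += note.duration

    def visit_rest(self, rest: Rest) -> None:
        self.total += rest.duration


counter = TotalDuration()
document = b'<score><note pitch="C4" duration="4"/><rest duration="2"/></score>'
xml.sax.parseString(document, TypedDispatcher(counter))
print(counter.total)
```

The string-to-integer conversion happens once, in the dispatch layer, which is the part a generator would derive from the type definitions.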
I'm excited by the possibility of using a visual modelling tool (perhaps using an extension of the UML) to model document structure, and from the model be able to generate a DTD, from which to generate classes which give me access to the XML data in a natural way for a programming language. I'm amazed that more people don't seem to share this enthusiasm. What we're doing with vanilla DOM and SAX interfaces seems analogous to using CORBA IDL as documentation, and making all object calls using the dynamic invocation interface! P.S. I was told today that Oracle have recently done something similar to this, which sounds great. I look forward to taking a look, but I can't help but wonder if there's a reason that it took this long - and how much the Oracle product does. If someone could point me to some other products which do similar things, I'd be much obliged. Cheers, Luke From lucio.piccoli at one2one.co.uk Wed Mar 10 19:16:58 1999 From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI) Date: Mon Jun 7 17:09:51 2004 Subject: DocumentHandler with xml4j DOMParser Message-ID: <3601b6f9.100299@smtpgate1.ONE2ONE.CO.UK> Hi all, I am using IBM's xml4j 2.0.3 XML parsers. I am having the following problem: when I set my own document handler with a DOMParser, the handler is never invoked. However, when I use the SAXParser, it is. Why does the DOMParser not invoke the DocumentHandler, yet the SAXParser does? The docs do not throw any light on the problem. Is there a fundamental problem with using a DocumentHandler with a DOMParser?
-lucio --------------------------------------------------------------------- One2One LUCIO.PICCOLI@one2one.co.uk Elstree Tower tel : +44 181 214 3847 Elstree Way Borehamwood fax :+44 181 214 2325 LONDON WD6 1DT __________ http://www.one2one.co.uk _____________ From cadams at cascadecc.com Wed Mar 10 19:24:11 1999 From: cadams at cascadecc.com (Chad Adams) Date: Mon Jun 7 17:09:51 2004 Subject: WIDL Message-ID: <001001be6b2b$7d49d8f0$01010101@development.cascade> Is anybody doing a B2B/WIDL type of application? Will I be able to use regular HTML pages (and maybe CGI/Perl) to push and pull XML from a remote server, and easily parse the XML on both sides, looking for custom request/reply types of data and then acting on it (via JavaScript or applets on the client, and maybe servlets on the server)? Am I dreaming to think that this can give me a lightweight remoting technology without the likes of RMI, CORBA, Weblogic, ObjectSpace, etc.? Chad Adams Payback Training Systems Email: cadams@cascadecc.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From macherius at darmstadt.gmd.de Wed Mar 10 20:21:41 1999 From: macherius at darmstadt.gmd.de (Ingo Macherius) Date: Mon Jun 7 17:09:51 2004 Subject: WIDL In-Reply-To: <001001be6b2b$7d49d8f0$01010101@development.cascade> Message-ID: <199903102020.VAA15937@sonne.darmstadt.gmd.de> Chad Adams wrote at 10 Mar 99, 12:23: > Is anybody doing a B2B/WIDL type of application? There is a lot of research going on with in the area of wrapper generation. Some approaches prefer creating java objects, others directly map to XML. Implementations include: http://db.cis.upenn.edu/W4F/ http://www.cse.ogi.edu/DISC/XWRAP/ http://www.darmstadt.gmd.de/oasys/projects/jedi/jedie.html Just look at the bibliographies to find others. > Am I dreaming to think that this can give me a light weight remoting > technology with out the likes of RMI, CORBA, Weblogic, ObjectSpace etc. Have a look at XML query languages, they are about that (among other things). http://www.w3.org/TandS/QL/QL98/ A good paper to start with is from David Maier, look at sections 2.9 and 2.10 to see his Vision of data communication via XML on the web. http://www.w3.org/TandS/QL/QL98/pp/maier.html And of course there is Microsoft's vision, see http://www.oasis-open.org/cover/bosworthXML98.html Hope that helps. ++im -- Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882 GMD-IPSI German National Research Center for Information Technology mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/ Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tms at ansa.co.uk Wed Mar 10 20:38:05 1999 From: tms at ansa.co.uk (Toby Speight) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX feature naming (was: SAX: ModSAX addition, general ...) References: <18b603b2.36e3e337@aol.com> <14051.59370.316671.640337@localhost.localdomain> <36E44898.CB8E18C4@thinlink.com> <14052.19853.887104.987727@localhost.localdomain> Message-ID: David> David Megginson [I accidentally mailed this to David; it was meant for the list. Sorry, David.] 0> In article <14052.19853.887104.987727@localhost.localdomain>, David 0> wrote: David> I've been thinking about this issue, and I'm fairly convinced David> that the URI is the right choice. I agree with this much. David> Think of the URI a statement of ownership. Assume that my ISP David> is host.net, and that I've been allocated 5MB of web space at David> http://host.net/foo/. Okay, you own that name subspace *at this moment in time*. Who will have the right to create names below that next March? Five years from now? A hundred years from now? Persistent uniqueness of names is the core work of the URN group, and the consensus there is that DNS names are a poor basis for any kind of URN (and what we want is exactly what URNs are for: naming things). If you are saying that the use of URLs as names is just a stopgap until the URN registration stuff is sorted, then I'll accept that, but be aware of the precedent you're setting with the initial "well-known" feature names. 
David> I am the only one who has the right to make a resource available at David> http://host.net/foo/, so I am the one who has the (moral) right to David> construct feature IDs based on http://host.net/foo/. At this instant... David> It is not sufficient simply to use the domain name "host.net", David> because I don't own the domain (someone else could construct David> the same feature ID), and it is not sufficient to use something David> starting with net.host.foo, because I *don't* have the right to David> make something available at, say, ftp://host.net/foo/ -- Nor do you own the host "foo.host.net" In summary, I think URNs are a good fit, but not necessarily other kinds of URI. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jamesr at steptwo.com.au Wed Mar 10 22:08:35 1999 From: jamesr at steptwo.com.au (James Robertson) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features In-Reply-To: <199903101744.SAA01077@chimay.loria.fr> References: Message-ID: <4.1.19990311090304.00c96360@steptwo.com.au> At 03:44 11/03/1999 , Patrice Bonhomme wrote: | | tbray@textuality.com said: | ] This seems to be converging nicely. Any chance of losing the ugly | ] "Mod" prefix? -Tim | | Why not XSAX for eXtended SAX ? Damn, you beat me to it. Although I was thinking SAX eXtended, ie: SAXX This could later become SAXXX or perhaps: 3 SAX J ------------------------- James Robertson Step Two Designs Pty Ltd SGML, XML & HTML Consultancy http://www.steptwo.com.au/ jamesr@steptwo.com.au "Beyond the Idea" ACN 081 019 623 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Marc.McDonald at Design-Intelligence.com Wed Mar 10 22:33:02 1999 From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com) Date: Mon Jun 7 17:09:51 2004 Subject: Namespaces and DTDs Message-ID: For a more complete solution than the option (emphasize option) of a DTD associated with a namespace prefix and URI, I would add the means to declare a namespace, prefix and DTD in a DTD. Marc B McDonald Principal Software Scientist Design Intelligence, Inc www.design-intelligence.com ---------- From: james anderson [SMTP:James.Anderson@mecomnet.de] Sent: Wednesday, March 10, 1999 8:42 AM To: xml-dev@ic.ac.uk Subject: Re: Namespaces and DTDs all of which presumes that you've elevated prefixes to the status of uri's - attribute defaults or not. Richard Goerwitz wrote: > > Ronald Bourret wrote: > > > The only inobvious bit is that, because there is no way to declare > > namespaces in the DTD, you can't declare different default namespaces > > for different parts of the DTD > > Because the DTD is not namespace aware, all it can deal with are the pre- > fixes you declare (not the URLs associated with them). Since these pre- > fixes are declared in the document content, you end up with a peculiar > situation in which the DTD has to be written according to declarations > in a given document instance, rather than the reverse. 
Worse yet, there > is no way to be sure that the various documents being validated against > a particular DTD use the prefixes correctly, with the correct URLs, > unless you make extensive use of attribute defaults - which, ironically, > means we now need the DTD (probably an external one, typically with a > bunch of parameter entities; so get your validating parser ready). From Ed at dega.com Wed Mar 10 22:42:35 1999 From: Ed at dega.com (Ed Howland) Date: Mon Jun 7 17:09:51 2004 Subject: DocumentHandler with xml4j DOMParser Message-ID: <30649320C177D111ADEC00A024E9F297169F8A@exchange-server.dega.com> Hi all, I am using IBM's xml4j 2.0.3 XML parsers. I am having the following problem: when I set my own document handler with a DOMParser, the handler is never invoked. However, when I use the SAXParser, it is. Why does the DOMParser not invoke the DocumentHandler, yet the SAXParser does? The docs do not throw any light on the problem. Is there a fundamental problem with using a DocumentHandler with a DOMParser?
One2One LUCIO.PICCOLI@one2one.co.uk I don't know about the version of your XML4J, but in mine (1.1.9), the documentation states that DocumentHandler is to be used with the SAX Parser to be informed of parsing events. This is logical, since the main difference is that DOM parsers parse the whole document into a resulting DOM tree, and SAX parsers are used for event-based processing. There doesn't appear to be any way to create a DocumentHandler on class com.ibm.xml.parser.Parser, but you can from org.xml.sax.DocumentHandler. Did they change this in your newer version? Ed Ed Howland ed@dega.com http://www.dega.com "As your attorney, I advise you to take some adrenalchrome" -----Original Message----- From: LUCIO PICOLLI [mailto:lucio.piccoli@one2one.co.uk] Sent: Wednesday, March 10, 1999 11:12 AM To: xml-dev@ic.ac.uk Subject: DocumentHandler with xml4j DOMParser From donpark at quake.net Wed Mar 10 22:44:39 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features Message-ID: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main> > > This seems to be converging nicely. Any chance of losing the > > ugly "Mod" prefix? -Tim > >Yeah, no one seems to like it but me. Any other suggestions? I don't >like Parser2 or things like that, because I want to emphasise that >this is an add-on to SAX 1.0 rather than an upgrade. I have been tracking the progress of 'ModSAX' closely as well and it seems the extension is maturing nicely. BTW, it would help great in renaming if you could tell us what 'Mod' in ModParser stands for.
Best, Don Park Docuverse xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Wed Mar 10 23:41:33 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:09:51 2004 Subject: ModSAX: Proposed Core Features References: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main> Message-ID: <36E70238.18FC6360@manhattanproject.com> Don Park wrote: > BTW, it would help great in renaming if you could tell us what 'Mod' in > ModParser stands for. I thought it stood for "Modular" :) Clark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Thu Mar 11 00:06:18 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:09:51 2004 Subject: DOM Impl: Array or Linked List? References: <3601a91c.090299@smtpgate1.ONE2ONE.CO.UK> Message-ID: <36E7080F.7EEF4CEB@manhattanproject.com> I've been struggling with this slightly, and would like your feedback. I'm building a DOM tree. 
For the internal representation, I see two options: A) A linked list for children * Easy inserts in middle of list * Slower non-sequential reads B) An array for children * Harder inserts in middle of list * Faster non-sequential reads Anyway, I was thinking of implementing a compromise, a sparse array with configurable spacing, depending upon the document. Thoughts? Thank you. Clark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 11 00:48:48 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:52 2004 Subject: ModSAX: Proposed Core Features In-Reply-To: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main> References: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main> Message-ID: <14055.4677.914570.392597@localhost.localdomain> Don Park writes: > BTW, it would help great in renaming if you could tell us what > 'Mod' in ModParser stands for. It means that it's not a Rocker. Or else it means 'modular' -- I'm not sure. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jarle.stabell at dokpro.uio.no Thu Mar 11 00:54:36 1999 From: jarle.stabell at dokpro.uio.no (Jarle Stabell) Date: Mon Jun 7 17:09:52 2004 Subject: Namespaces and DTDs Message-ID: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no> Richard Goerwitz wrote: > (Simplicity _was_ one of XML's primary goals back in the dark ages last > February.) It seems to me that the SGML compatibility requirement killed simplicity. (And gave a very confusing and hard-to-learn vocabulary) I'm hoping that ideas like the Layered Model for XML (by Simon St.Laurent) will be able to influence XML in a positive direction, making it simpler to understand, use and implement. Today it's way too hard to "fully" understand XML. Cheers, Jarle Stabell xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 11 01:04:11 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:52 2004 Subject: ModSAX feature naming (was: SAX: ModSAX addition, general ...) 
In-Reply-To: References: <18b603b2.36e3e337@aol.com> <14051.59370.316671.640337@localhost.localdomain> <36E44898.CB8E18C4@thinlink.com> <14052.19853.887104.987727@localhost.localdomain> Message-ID: <14054.56958.252482.1690@localhost.localdomain> [originally sent privately to Toby] Toby Speight writes: > If you are saying that the use of URLs as names is just a stopgap > until the URN registration stuff is sorted, then I'll accept that, > but be aware of the precedent you're setting with the initial > "well-known" feature names. The quality of URNs will depend entirely on the quality of the registration schemes -- URNs really have no inherent advantage over URLs. There are an awful lot of ways that I could construct a unique ID: using my phone number, my latitude and longitude, my Ethernet card's MAC address, the IP address served by Rogers Wave's DHCP server, my driver's license number, my Canadian Social Insurance Number, the ISBN for my book (though I think the publisher would have a moral claim to that), a domain name, or a specific URL. The problem is that you have to balance four factors: 1. ease of access (not everyone can get an ISBN easily); 2. usability (who wants to memorise MAC addresses?); 3. universality (my Canadian SIN is meaningless outside the country); and 4. persistence (the DHCP server might change my IP address in a few hours when my current lease expires). HTTP URLs win pretty close to a 10/10 on (1) and (3), about an 8/10 on (2), and probably a 6/10 or so on (4). A UUID might win on all but (2), depending on how hard it is to obtain one, but that is an inherent property of UUIDs, not of URNs -- and as I understand it, people are actually proposing constructing URNs from domain names among other schemes anyway. Even if UUIDs do turn out to be the best choice, what's the advantage of URNs?
Why not just uuid:123344567773634 All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Thu Mar 11 01:21:24 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:09:52 2004 Subject: DOM Impl: Array or Linked List? Message-ID: <002301be6b5d$5e5434e0$2ee044c6@arcot-main> Docuverse DOM SDK is implemented using the array approach with the last accessed index cached to improve next/prevSibling performance. Resulting implementation is fast for index-based access to child nodes and slightly slower for sibling-based access (only 10% slower than linked-list version). Modification to the tree is fast when appending nodes (i.e. building new tree) but is somewhat slow when inserting new nodes since array contents have to be shifted around. If your XML document has gazillion child nodes per element, performance will suffer quite a bit. You can get around the update problem by applying the Strategy pattern to child array implementation. On insert, check to see if the array is big enough to justify using different type of array implementation (i.e. sparse array). One caveat is that this tends to increase the number of child list array (smart NodeLists and NodeList implementation strategies). There are ways to minimize this problem though. So the bottom line is, you are on the right track. Don Park Docuverse -----Original Message----- From: Clark Evans To: xml-dev@ic.ac.uk Date: Wednesday, March 10, 1999 4:15 PM Subject: DOM Impl: Array or Linked List? 
>I've been struggling with this slightly, and would >like your feedback. I'm building a DOM tree. For >the internal representation, I see two options: > >A) A linked list for children > >* Easy inserts in middle of list >* Slower non-sequential reads > >B) An array for children > >* Harder inserts in middle of list >* Faster non-sequential reads > >Anyway, I was thinking of implementing >a compromise, a sparse array with >configurable spacing, depending upon >the document. > >Thoughts? > >Thank you. > >Clark > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Thu Mar 11 01:21:28 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:09:52 2004 Subject: ModSAX: Proposed Core Features Message-ID: <002401be6b5d$5f34d0e0$2ee044c6@arcot-main> If it is 'modular' then ModularParser makes sense IMO. I believe XParser is being used by FBI for the technology that auto-detects images with adult content. 
Don Park Docuverse -----Original Message----- From: David Megginson To: XML Developers' List Date: Wednesday, March 10, 1999 4:52 PM Subject: Re: ModSAX: Proposed Core Features >Don Park writes: > > > BTW, it would help great in renaming if you could tell us what > > 'Mod' in ModParser stands for. > >It means that it's not a Rocker. > >Or else it means 'modular' -- I'm not sure. > > >All the best, > > >David > >-- >David Megginson david@megginson.com > http://www.megginson.com/ > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Thu Mar 11 02:35:58 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:09:52 2004 Subject: Simplicity (was Re: Namespaces and DTDs) References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no> Message-ID: <36E72BDE.6798F08E@allette.com.au> Jarle Stabell wrote: > It seems to me that the SGML compatibility requirement killed simplicity. > (And gave a very confusing and hard-to-learn vocabulary) Really? I think the requirement for web compatibility made XML more complex than it looked from the outset. This is the advent of the third catchcry for XML. 
First it was "XML is SGML", second was "Use XML because SGML is too hard" and now "XML is very powerful, but can be difficult". Remarkably, we just now seem to be coming to the realisation that it's difficult to solve complex problems. XML seeks to do more than SGML, but it's supposed to be simpler - how can this be so? The only immediate areas of gain would have come from trimming the fat from SGML, but the more X*L I see, the skinnier SGML looks. Yes, it is less powerful, yes it can be more proprietary, yes it is harder to write tools for, no it doesn't solve ten percent of what X*L can do before it even gets out of bed. Yes, I still use it a lot. Ponder that - SGML for simplicity. > I'm hoping that ideas like the Layered Model for XML (by Simon St.Laurent) > will be able to influence XML in a positive direction, making it simpler to > understand, use and implement. Today it's way too hard to "fully" > understand XML. It is unquestionably hard to fully understand - anyone who says that it isn't deserves a gold star - they're smarter than I am. -- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msharp at sybex.com Thu Mar 11 03:04:42 1999 From: msharp at sybex.com (Molly Sharp) Date: Mon Jun 7 17:09:52 2004 Subject: Delivery of XML Message-ID: <88256731.000FD31A.00@sybex.com> Hello, I'm new to the list. I'm in the computer book publishing business, and I'm looking for information about delivering XML content to customers in a secure, copy-protected (encrypted) manner. Does anyone know if there are any companies out there offering secure encryption for XML? I imagine you'd have to create a browser based on IE or Netscape that disabled functions such as view source, copy, and save as --- and that would be the only browser your encrypted XML content could be opened from. Thanks for any information about this, Molly Sharp xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Thu Mar 11 03:10:00 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:52 2004 Subject: A new name for ModSax Message-ID: <146803a9.36e732d1@aol.com> Hi Everyone, Instead of XSAX or XParser (which rely on the overplayed X of extensible), how about ExSAX ExParser Which stands for the same thing. Extensible SAX Extensible Parser Is shorter to type than ModSAX. Avoids the double capital of XSAX and XParser. 
And is pronounced the same way. - Mike (www.gosynergy.com) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Thu Mar 11 03:34:23 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:09:52 2004 Subject: Delivery of XML In-Reply-To: <88256731.000FD31A.00@sybex.com> Message-ID: <000c01be6b6f$309a19e0$d3228018@jabr.ne.mediaone.net> > > Does anyone know if there are any companies out there offering secure > encryption for XML? I imagine you'd have to create a browser > based on IE or > Netscape that disabled functions such as view source, copy, and > save as --- > and that would be the only browser your encrypted XML content could be > opened from. > Would that be SSL with certificates to distinguish clients? Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From MikeDacon at aol.com Thu Mar 11 03:42:30 1999 From: MikeDacon at aol.com (MikeDacon@aol.com) Date: Mon Jun 7 17:09:52 2004 Subject: One more ModSax naming try... Message-ID: <8179a506.36e73846@aol.com> Hi All, Ok, while I like ExSax for the previously mentioned reasons -- I don't like its connotation for all things "Ex" like Ex-girlfriend, Ex-wife, Ex-husband... 
So, one other way to go is the "Add-on" theme that David expressed.

XtraSax
XtraParser

This is a combination of "add-on", "extra" and Xml.

- Mike
(www.gosynergy.com)

From landerse at du.edu Thu Mar 11 05:29:02 1999
From: landerse at du.edu (Buzz Andersen)
Date: Mon Jun 7 17:09:52 2004
Subject: Help w/Docuverse DOM SDK (Please)
Message-ID: <0F8F007FZ0J3BS@du.edu>

I would be eternally grateful if anyone out there who happens to be familiar with the Docuverse DOM SDK could tell me what is wrong with the following code. It generates a "com.docuverse.dom.DOMExceptionImpl" exception when the "appendChild" method of the document is attempted. Here it is:

DOM dom = new com.docuverse.dom.DOM();
dom.setProperty("sax.driver", "com.ibm.xml.parser.SAXDriver");
Document x = dom.createDocument("e1");
Element y = x.createElement("e2");
y.appendChild(root);

I would think this would generate:

Am I mistaken?

Thanks in advance,

Buzz Andersen
www.du.edu/~landerse
landerse@du.edu
From donpark at quake.net Thu Mar 11 05:59:23 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:52 2004
Subject: Help w/Docuverse DOM SDK (Please)
Message-ID: <004a01be6b84$379613b0$2ee044c6@arcot-main>

Buzz,

>DOM dom = new com.docuverse.dom.DOM();
>dom.setProperty("sax.driver", "com.ibm.xml.parser.SAXDriver");
>Document x = dom.createDocument("e1");
>Element y = x.createElement("e2");
>y.appendChild(root);
>
>I would think this would generate:
>
>
>

I don't know what y.appendChild(root) is supposed to be but you have to insert your "e2" element into your document.

// creates a document with "e1" as document element
Document doc = dom.createDocument("e1");

// make sure document root exists
Node e1 = doc.getDocumentElement();
if (e1 == null)
    e1 = doc.appendChild(doc.createElement("e1"));

// create and insert e2 into e1
e1.appendChild(doc.createElement("e2"));

at this point, you will have:

Best,

Don Park
Docuverse

From landerse at du.edu Thu Mar 11 06:41:52 1999
From: landerse at du.edu (Buzz Andersen)
Date: Mon Jun 7 17:09:52 2004
Subject: Help w/Docuverse DOM SDK (Please)
Message-ID: <0F8F0004G3W74A@du.edu>

Whoa...that was a mistranslation from my original code!
It was supposed to read:

x.appendChild(y);

Sorry about the confusion, and thanks much for the advice/code. I've been generating XML for awhile using proprietary parser APIs, but I'm still trying to grok the whole SAX/DOM thing.

Buzz Andersen
www.du.edu/~landerse
landerse@du.edu

----------
>From: Don Park
>To: xml-dev@ic.ac.uk
>Subject: Re: Help w/Docuverse DOM SDK (Please)
>Date: Wed, Mar 10, 1999, 10:58 PM
>
>Buzz,
>
>>DOM dom = new com.docuverse.dom.DOM();
>>dom.setProperty("sax.driver", "com.ibm.xml.parser.SAXDriver");
>>Document x = dom.createDocument("e1");
>>Element y = x.createElement("e2");
>>y.appendChild(root);
>>
>>I would think this would generate:
>>
>>
>>
>
>I don't know what y.appendChild(root) is supposed to be but you have to
>insert your "e2" element into your document.
>
>// creates a document with "e1" as document element
>Document doc = dom.createDocument("e1");
>
>// make sure document root exists
>Node e1 = doc.getDocumentElement();
>if (e1 == null)
>    e1 = doc.appendChild(doc.createElement("e1"));
>
>// create and insert e2 into e1
>e1.appendChild(doc.createElement("e2"));
>
>at this point, you will have:
>
>
>
>Best,
>
>Don Park
>Docuverse
From lucio.piccoli at one2one.co.uk Thu Mar 11 08:22:49 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun 7 17:09:52 2004
Subject: DocumentHandler with xml4j DOMParser
Message-ID: <3601b76a.110299@smtpgate1.ONE2ONE.CO.UK>

>
> Hi all,
> I am using IBM's xml4j 2.0.3 XML parsers. I am having the following
> problem:
> When i set my own document handler with a DOMParser, the handler is
> never invoked. However when i use the SAXParser it is. Why does the
> DOMParser not invoke the DocumentHandler yet the SAXParser does?
> The docs do not throw any light on the problem.
> Is there a fundamental problem with using a DocumentHandler with a
> DOMParser?
>
> One2One LUCIO.PICCOLI@one2one.co.uk
>
>
> I don't know about the version of your XML4J, but in mine (1.1.9), the
> documentation states that DocumentHandler is to be used with the SAX
> Parser to be informed of parsing events. This is logical, since the
> main difference is that DOM parsers parse the whole document into a
> resulting DOM tree, and SAX parsers are used for event based processing.
>
> There doesn't appear to be any way to create a DocumentHandler on class
> com.ibm.xml.parser.Parser, but you can from org.xml.sax.DocumentHandler.
> Did they change this in your newer version?

I am not sure what you mean here. The DocumentHandler i used was an instance of org.xml.sax.DocumentHandler. The setDocumentHandler(DocumentHandler handler) is a method on the org.xml.sax.Parser. Since all the ibm parser classes implement this interface then why doesn't it work?
I viewed the source code to the DOMParser and noticed that in the constructor it calls setDocumentHandler( this ). So it is using itself as the document handler. Is it OK to have more than one DocumentHandler? In fact the bigger question is: is using a DocumentHandler on the DOMParser the correct thing to do when attempting to extract the content?

-lucio

>
> Ed
>
>
> Ed Howland
> ed@dega.com
> http://www.dega.com
> "As your attorney, I advise you to take some adrenalchrome"
>
> -----Original Message-----
> From: LUCIO PICOLLI [mailto:lucio.piccoli@one2one.co.uk]
> Sent: Wednesday, March 10, 1999 11:12 AM
> To: xml-dev@ic.ac.uk
> Subject: DocumentHandler with xml4j DOMParser

From James.Anderson at mecomnet.de Thu Mar 11 09:49:32 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:09:53 2004
Subject: Namespaces and DTDs
References:
Message-ID: <36E7951F.BD56A8E4@mecomnet.de>

the third parameter (the DTD) is ill advised. one will, in any case, need to establish scoping rules for the bindings.
such rules, in combination with xml's existing reference and sequence mechanisms, would render the third parameter either redundant or too restrictive.

Marc.McDonald@Design-Intelligence.com wrote:
>
> For a more complete solution than the option (emphasize option) of a
> DTD associated with a namespace prefix and URI, I would add the means
> to declare a namespace, prefix and DTD in a DTD.

From oren at capella.co.il Thu Mar 11 10:12:24 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <024301be6ba6$dfdc1c00$5402a8c0@oren.capella.co.il>

Bill la Forge wrote:
>On a more serious note,
>
>I think we need a new ParserFactory... ModParserFactory? XParserFactory?
>It should use ParserFactory to create a Parser and then check to see if the new
>extension is supported. If not, it proceeds to wrap the parser so that it looks
>like a ModParser.
>
>Note that this compatibility wrapper will effectively be a filter.

I think you've hit on something important here. The Mod/X/Xtra/E-Sax thread has focused on "how to access extra functionality which is already available within a particular SAX parser implementation". This might be the wrong question to ask. Shouldn't it be "how do I obtain an instance of a SAX parser which provides the features I need", instead?

This is a subtle but important shift of focus. Today one can obtain an instance of a SAX parser by using the ParserFactory.
Now suppose my application needs an order of a namespace aware parser, character normalization on the side, and don't spare the comments, please - how would I go about creating such a thing?

Note that this issue contains the original one; one needs to be able to access the extra features. But it goes beyond it. It might also help to constrain some design choices. Take for example the issue of naming features. Today ParserFactory uses the string "org.xml.sax.parser" as an identifier for the feature "take an input source and convert it to SAX events". The format of this particular string was chosen since it is usable as a key in a properties file. Wouldn't it be reasonable to say that whichever way Mod/X/Xtra/E-ParserFactory works, it will use the same approach - that is, use Java-like package names to identify features, so that it will be possible to provide default implementations using property files? I know this would be hard for the URI camp to swallow :-) but isn't it worth it?

As to the issue itself, the way I see it there is one major question to be decided first. Are the extra features independent of each other? If they aren't, we are in trouble. How do I know that pushing a filter implementing feature X on top of a parser implementing feature Y doesn't break that feature? What if one feature depends on another? Should there be a way to describe the relationship between features? How?

At any rate, the goal should be some registry of "parsers" and "filters" with an appropriate API so that it would be possible to ask for a certain feature set and obtain a "parser" instance. IMVHO as far as this registry is concerned, the basic SAX events interface and the input source interface should be on equal ground with the other features. This could be a flexible framework allowing one to create processing chains such as using DOM as input/output of the chain, making XSL processing a core "feature", and so on.
Has anything similar been done in a different field, so we could reuse the design lessons there? It seems like a pretty generic "stream processing" problem.

Share & Enjoy,

Oren Ben-Kiki

From cbullard at hiwaay.net Thu Mar 11 11:54:40 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:53 2004
Subject: Namespaces and DTDs
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>
Message-ID: <36E7AE27.7DE6@hiwaay.net>

Jarle Stabell wrote:
>
> Richard Goerwitz wrote:
> > (Simplicity _was_ one of XML's primary goals back in the dark ages last
> > February.)
>
> It seems to me that the SGML compatibility requirement killed simplicity.
> (And gave a very confusing and hard-to-learn vocabulary)

Or its inventors have discovered that assuming the mission of an existing mature standard without acknowledging the complexity of that mission leads to the same or worse complexity in the invention.

Darn. Maybe LISP was the right language after all and forty years of computer scientists just didn't "get it".

len
From cbullard at hiwaay.net Thu Mar 11 11:56:30 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
References: <4.1.19990311090304.00c96360@steptwo.com.au>
Message-ID: <36E7AD18.486C@hiwaay.net>

James Robertson wrote:
>
> Although I was thinking SAX eXtended, ie:
>
> SAXX
>
> This could later become
>
> SAXXX
>
> or perhaps:
>
>    3
> SAX

At which point the local firewall chokes again, tosses up the warning message about unacceptable sites and local policies, accounts get flagged, and the whole nine yards of censorial software and American puritanism kicks in.

Call it Sax++. Incrementally better. ;-)

len

From costello at mitre.org Thu Mar 11 12:20:33 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun 7 17:09:53 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de>
Message-ID: <36E7B4B3.999188F5@mitre.org>

Hi Folks,

There has been a lot of discussion on this list group about namespaces and how there is no necessary link between a namespace URI and a schema (DTD). Just as I was accepting that and getting comfortable with it I read the RDF spec...
For those of you unfamiliar with RDF, its mission in life is to enable you to express data about your data; i.e., metadata. You can express things like, "the creator of the BookCatalog is John Doe". "creator" is a piece of metadata about the "resource", BookCatalog. In RDF "creator" is called a "property". Thus, the "property" has the "value" John Doe ... John Doe.

Okay, here's where the rub comes. Let me give you a couple of quotes from the RDF spec (the *'s I have put in and are my way of emphasizing the words that I wish for you to really focus on):

"Property names *must* be associated with a schema. This can be done by qualifying the element names with a namespace prefix to unambiguously *connect* the property definition with the corresponding RDF schema ..."

Earlier in the spec it says:

"Due to RDF's incremental extensibility, agents processing metadata will be able to trace the origins of schemata they are unfamiliar with back to known schemata and perform meaningful actions on metadata they weren't originally designed to process."

Let me tell you how I interpret those two sentences. Suppose that I have written a Web agent and it comes across a Web site that serves up an XML document containing some metadata (expressed using the RDF syntax). Let's suppose that the metadata says, in XMLese, "the creator of the BookCatalog is John Doe". My agent has never seen the property "creator", so it follows the namespace URI to the property schema. From there it finds the superclass of the creator property. If it doesn't recognize that class then it goes to its superclass. It keeps doing this until it finds a class that it understands and then it starts unwinding (presumably by this process it will be able to gain insight into what "creator" is all about. I have no idea how this will happen, but it sounds pretty cool.)

This mechanism of following references until the agent gains "enlightenment" makes sense to me. I like it!
***However*** that presupposes that there is a *guaranteed* association between a namespace URI and a schema. This is totally against what this list group has worked so hard to clarify as NOT being the case.

Somebody help me to understand this. Obviously I am misreading, misinterpreting the RDF spec. Thanks.

/Roger

From b.laforge at jxml.com Thu Mar 11 12:31:05 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:53 2004
Subject: One more ModSax naming try...
Message-ID: <002a01be6bbb$3da914a0$c8a8a8c0@thing1>

From: MikeDacon@aol.com
>So, one other way to go is the "Add-on" theme that David expressed.
>
>XtraSax
>XtraParser
>
>This is a combination of "add-on", "extra" and Xml.

What about open? OpenParser/OpenSAX.

With the new extensions, we are not constrained by the interface--it's quite "open".

Bill

From b.laforge at jxml.com Thu Mar 11 12:39:59 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <002d01be6bbc$a014e460$c8a8a8c0@thing1>

From: Oren Ben-Kiki
>I think you've hit on something important here.
>The Mod/X/Xtra/E-Sax thread
>has focused on "how to access extra functionality which is already available
>within a particular SAX parser implementation". This might be the wrong
>question to ask. Shouldn't it be "how do I obtain an instance of a SAX
>parser which provides the features I need", instead?

It is interesting how small shifts in perspective can have major design implications. I just wanted to make it easy for new ModSAX applications to use older SAX parsers without requiring any extra code in the application.

If ModSAX is to remain low-level, I suspect a registry is out of scope. As for building up a parser with filters to meet a set of requirements automagically, I'd rather give more control to the application to specify what it needs, than try to compose something based on a feature list.

Bill

From rbourret at ito.tu-darmstadt.de Thu Mar 11 13:11:07 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <01BE6BC8.E71E5AB0@grappa.ito.tu-darmstadt.de>

Oren Ben-Kiki wrote:

> Has anything similar [assembling processors based on feature requests]
> been done in a different field, so we could reuse the
> design lessons there? It seems like a pretty generic "stream processing"
> problem.

I think there is an inherent assumption in this question that we are defining individual features that can be implemented by different parties and then randomly assembled to get a useful processor.
While this is potentially a useful thing to do -- UNIX pipes are a good example -- it is not necessarily an easy thing to do, nor is it clear that this is a goal of ExModE-XSAX.

We tried to do a similar thing in OLE DB, where database functionality would be broken down into individual services which could be assembled at will on top of a database driver. (Generally, this would be meaningful only for drivers for non-database sources, as drivers for existing databases already exposed most/all functionality.) The idea never really worked out, but here are some of the issues:

* Are there enough useful features/components to make this worthwhile? For OLE DB, the answer was "probably not". We implemented a scrollable cursor (basically just a result set cache), but other ideas (transactions, security) were not easily implementable as separate layers and were not really meaningful -- anybody could get around them by excluding the layer.

* What are the interfaces between components and how hard are they to implement? If you want to be able to assemble components from different vendors at will, these need to be defined. The success of SAX filters is a red herring here -- it leads one to believe that SAX can function as a useful interface for all XML-related processing features. In fact, this is not the case -- for example, whether or not to retrieve external entities has nothing to do with SAX. Thus, other interfaces would need to be defined to be able to assemble processors from third-party components.

(I think this is one thing that led us astray in OLE DB. The usefulness of a scrollable cursor engine that spoke OLE DB at both ends led us to believe that the same could be done with other database features. In fact, OLE DB was less well suited or completely unsuited for other operations. In addition, it was expensive to implement.)

* How independent are the features?
Is it meaningful to ask for one thing but not another, such as wanting validation without namespaces (maybe) or parsing external entities (no)? Again, I think the orthogonality of some features is a red herring leading one to believe all features are orthogonal.

* Are performance penalties too high to separate features into separate components? For example, suppose several features need to process XML documents as trees. While it might make sense to write a single processor for these features and toggle them within the processor, the performance hit of implementing them as separate, chained processors would be too high: each would have to build a tree, process it, and then stream it back out as SAX.

* Are there order dependencies between components? For example, if you want validation and namespace processing as separate components, you had better do namespace processing first. An open question is who knows about order and how is it advertised.

* Who assembles the components -- the application, the processor, or a third party? The advantage of a processor or third party (such as a factory) assembling components is that you need the assembly logic in only a few places. The disadvantage is that applications that know about a new feature cannot use that feature until the assembly logic in the processor/factory is updated. It is probably best to have a mechanism that allows both processors and applications to assemble components.

My personal feeling is that assembling XML processors completely on the fly is a pipe (if you will excuse the pun) dream. The world is simply not orthogonal enough to make this possible. Furthermore, there are too many performance gains to be had by tight integration of functionality to ever convince people to build things entirely as components with public interfaces.

-- Ron Bourret
From oren at capella.co.il Thu Mar 11 13:30:48 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <02a501be6bc2$9da75810$5402a8c0@oren.capella.co.il>

Bill la Forge wrote:
>If ModSAX is to remain low-level, I suspect a registry is out of scope. As for building
>up a parser with filters to meet a set of requirements automagically, I'd rather give
>more control to the application to specify what it needs, than try to compose
>something based on a feature list.

A registry might be outside the scope of ModSAX (but see below). Even if it is, I feel that we should take care that ModSAX design choices won't make such a registry unnecessarily difficult. It might also be that a "registry" is the wrong way to go; John Cowan, for example, suggested a mechanism to allow a parser to automatically push a filter between itself and the application. I'm certain there are other reasonable approaches. All I'm saying is that before we decide on ModSAX, some thought should be given to this issue.

To get the ball rolling, how about the following low level solution, which would allow smarter high level solutions later on:

class ModSAXRegistry {
    static void setClassFeatures(String className, String[] featureNames);
    static String[] getClassFeatures(String className);
    static Enumeration getFeatureClasses(String featureName);
    static Object newInstance(String className);
}

The idea being that it would be easy to get a list of classes which provide any requested feature, and check which features are implemented by a particular class.
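
A minimal sketch of how such a registry could be filled in, assuming a plain in-memory map is enough. The four method names come from the declaration above; the method bodies, the generic types, and the checked exception on newInstance are illustrative assumptions and not part of any ModSAX draft.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory version of the proposed registry. */
final class ModSAXRegistry {
    // class name -> feature names it claims to provide, in registration order
    private static final Map<String, String[]> FEATURES = new LinkedHashMap<>();

    private ModSAXRegistry() {}

    /** Records (or replaces) the feature list advertised by a class. */
    static synchronized void setClassFeatures(String className, String[] featureNames) {
        FEATURES.put(className, featureNames.clone());
    }

    /** Returns the features registered for a class, or an empty array if unknown. */
    static synchronized String[] getClassFeatures(String className) {
        String[] features = FEATURES.get(className);
        return features == null ? new String[0] : features.clone();
    }

    /** Lists every registered class that advertises the given feature. */
    static synchronized Enumeration<String> getFeatureClasses(String featureName) {
        List<String> classes = new ArrayList<>();
        for (Map.Entry<String, String[]> entry : FEATURES.entrySet()) {
            for (String feature : entry.getValue()) {
                if (feature.equals(featureName)) {
                    classes.add(entry.getKey());
                    break;
                }
            }
        }
        return Collections.enumeration(classes);
    }

    /** Instantiates a class by name through its no-argument constructor. */
    static Object newInstance(String className) throws ReflectiveOperationException {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }
}
```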
This should be trivial to implement; static code could do the registration automatically, or it could be loaded from property files, environment variables, or whatever. We already have one standard feature: "org.xml.sax.parser", to which we should probably add "org.xml.sax.filter".

The question of how to build a parser implementing a particular feature would be left open. In general the application would query the registry, use whatever algorithm it likes to decide on which classes to use, instantiate them and go on as per the current ModSAX interface. Once enough experience is gained using this, we could decide to add some methods which implement popular algorithms.

Compatibility with the current state: It should be trivial to implement ParserFactory above the registry. As for property files, the following scheme is safe and upward compatible with today's practice of providing the SAX parser name in "org.xml.sax.parser":

org.xml.sax.class.=,,... =

The whole thing is as lightweight and low-level as you can get.

Share & Enjoy,

Oren Ben-Kiki

From MikeDacon at aol.com Thu Mar 11 14:21:43 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:53 2004
Subject: One more ModSax naming try...
Message-ID: <5f01bac8.36e7bcfd@aol.com>

Hi Bill,

In a message dated 3/11/99 7:27:22 AM Eastern Standard Time, b.laforge@jxml.com writes:

> What about open? OpenParser/OpenSAX.
> With the new extensions, we are not constrained by the interface--it's
> quite "open".

I like OpenParser/OpenSAX!!
Besides the open/extensible link, it gives a nod to open source which is appealing.

- Mike
(www.gosynergy.com)

From dkirsch at quintcom.com Thu Mar 11 14:44:15 1999
From: dkirsch at quintcom.com (dkirsch@quintcom.com)
Date: Mon Jun 7 17:09:53 2004
Subject: Delivery of XML
Message-ID: <88256731.00509FF2.00@mercury.quintcom.com>

Molly,

I understand that IBM will make a presentation at the IETF meeting next week for just this type of support. I'll see if I can get you a contact for that while I'm here at the XTECH conference today.

Cheers,
David K.

"Molly Sharp" on 03/10/99 07:00:57 PM
Please respond to "Molly Sharp"
To: SGML-L@RELAY.URZ.UNI-HEIDELBERG.DE, xml-dev@ic.ac.uk
cc: (bcc: David Kirsch/QCI)
Subject: Delivery of XML

Hello,

I'm new to the list. I'm in the computer book publishing business, and I'm looking for information about delivering XML content to customers in a secure, copy-protected (encrypted) manner.

Does anyone know if there are any companies out there offering secure encryption for XML? I imagine you'd have to create a browser based on IE or Netscape that disabled functions such as view source, copy, and save as --- and that would be the only browser your encrypted XML content could be opened from.

Thanks for any information about this,
Molly Sharp
From keshlam at us.ibm.com Thu Mar 11 15:06:37 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun 7 17:09:53 2004
Subject: DOM Impl: Array or Linked List?
Message-ID: <85256731.0052D629.00@D51MTA03.pok.ibm.com>

As a contrasting point, my com.ibm.domimpl operates on the linked-list approach. I considered changing that, but decided that for the applications I anticipated folks to be writing in Java, integer indexing was going to be relatively rare compared to next and previous, and performing the additional work to maintain the indices didn't feel like it was going to be a net gain.

I'm firmly convinced that there's no such thing as one best way to implement the DOM. There are too many issues to trade off which will make an implementation better at one thing than another. The fastest DOM may need more storage space for the model; the smallest model may require more code; the smallest code may be slower.

Also, don't forget that the DOM is strictly an API, which can be wrapped around any model that can contain a document; there may be DOMs which are really just thin access layers for databases, for example.

Pick, or write, the DOM that suits your intended application(s).
Hammers make poor screwdrivers, and vice versa.

From cadams at cascadecc.com Thu Mar 11 15:22:26 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun 7 17:09:53 2004
Subject: Java DOM Parsers
Message-ID: <000001be6bd2$e0059900$01010101@development.cascade>

What companies supply Java DOM APIs and other XML API tools? Any suggestions on which to go with?

Thanks
Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com

From richard at goon.stg.brown.edu Thu Mar 11 15:41:18 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:09:53 2004
Subject: Namespaces and DTDs
References: <36E7951F.BD56A8E4@mecomnet.de>
Message-ID: <36E7E379.D3BE5204@goon.stg.brown.edu>

James Anderson wrote (with regard to declaring namespaces in the DTD):

> one will, in any case, need to establish scoping rules for the bindings

That's a very insightful comment, and right on target about DTDs.

But back to an earlier point a poster made about SGML-conformance (DTDs, etc.) being the thing that is killing XML: If it weren't for the promise of backwards compatibility with SGML/HTML, XML could not have gathered the initial following that it did.
(Don't get me wrong; our shop is still largely an SGML shop. I'll be very sad if XML loses these connections. But I think that's where we are headed. Many people who are entering the XML community have never heard of SGML, and resent being encumbered by it.)

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From oren at capella.co.il Thu Mar 11 15:56:53 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:09:53 2004
Subject: Fw: ModSAX: Proposed Core Features
Message-ID: <02ee01be6bd6$a66a24a0$5402a8c0@oren.capella.co.il>

I asked:
>> Has anything similar [assembling processors based on feature requests]
>> been done in a different field, so we could reuse the
>> design lessons there? It seems like a pretty generic "stream processing"
>> problem.

Ronald Bourret wrote:
>I think there is an inherent assumption in this question that we are
>defining individual features that can be implemented by different parties
>and then randomly assembled to get a useful processor. While this is
>potentially a useful thing to do -- UNIX pipes are a good example -- it is
>not necessarily an easy thing to do, nor is it clear that this is a goal of
>ExModE-XSAX.

Well, at least the idea warrants some serious thought.

>We tried to do a similar thing in OLE DB, where database functionality
>would be broken down into individual services which could be assembled at
>will on top of a database driver.
>(Generally, this would be meaningful
>only for drivers for non-database sources, as drivers for existing
>databases already exposed most/all functionality.) The idea never really
>worked out, but here are some of the issues:
>
>* Are there enough useful features/components to make this worthwhile?

Good question. For SAX I'd say "probably yes". Here's a list of features (courtesy of David Megginson):

> http://xml.org/sax/features/validation
> Validate (true) or don't validate (false).
> http://xml.org/sax/features/external-general-entities
> Expand external general entities (true) or don't expand (false).
> http://xml.org/sax/features/external-parameter-entities
> Expand external parameter entities (true) or don't expand (false).
> http://xml.org/sax/features/namespaces
> Preprocess namespaces (true) or don't preprocess (false). See also
> the http://xml.org/sax/properties/namespace-sep property.
> http://xml.org/sax/features/normalize-text
> Ensure that all consecutive text is returned in a single callback to
> DocumentHandler.characters or DocumentHandler.ignorableWhitespace
> (true) or explicitly do not require it (false).

I'd like to see "http://xml.org/sax/features/xsl-transformation" as well. Anyway, all of the above seem to fall nicely into the pipeline framework.

>* What are the interfaces between components and how hard are they to
>implement?

Basically the SAX callbacks, probably extended so that the full document data is available (comments and so on). This seems pretty much a done deal.

>* How independent are the features?
>* Are there order dependencies between components?

This is a problem, as I've already pointed out. Take "normalize-text", for example. The effects of such a filter might be lost if it is followed by any of the entity expansion filters (say), not to mention an XSL one. However, most of the other features seem relatively independent. I'd say this isn't a fatal problem. It definitely doesn't affect the API I suggested.
>* Are performance penalties too high to separate features into separate
>components?

Unknown; I guess this depends on the feature and the implementation. But then, allowing one to build a system by combining filters doesn't mean one has to do so. Even inefficient pipelines are still very useful for ad-hoc processing, for prototyping systems, and so on. From the list of features above, I'd say that most won't suffer a serious penalty.

>* Who assembles the components -- the application, the processor, or a
>third party?

What I'm suggesting is we currently answer "for now, the application", and provide a simple, lightweight, low-level API which allows it to do so. More complex solutions could evolve later on. This seems to be in the SAX spirit.

>My personal feeling is that assembling XML processors completely on the fly
>is a pipe (if you will excuse the pun) dream. The world is simply not
>orthogonal enough to make this possible. Furthermore, there are too many
>performance gains to be had by tight integration of functionality to ever
>convince people to build things entirely as components with public
>interfaces.

Simon St.Laurent has made a good case for layering XML functionality - see http://www.simonstl.com/articles/layering/layered.htm. The list of features above seems to validate his claims.

My feeling is that pipelining is a valid approach. This is because there are quite a few features which fit this model, and each application needs its own special subset of them. If this weren't the case, we'd be designing SAX2.0 with a fixed set of features instead of ModSAX.

Have fun,
Oren Ben-Kiki
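
The order dependency between pipeline stages discussed in this message can be sketched with Python's standard `xml.sax` filter base class. This is only an illustration under stated assumptions, not the ModSAX design: `NormalizeText` stands in for the proposed normalize-text feature, while `SplitWords`, `Recorder`, and `run` are hypothetical names invented for the sketch.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase


class NormalizeText(XMLFilterBase):
    """Coalesce consecutive characters() callbacks into a single one."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._buf = []

    def _flush(self):
        if self._buf:
            super().characters("".join(self._buf))
            self._buf = []

    def characters(self, content):
        self._buf.append(content)  # hold text until the next element event

    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        super().endElement(name)


class SplitWords(XMLFilterBase):
    """Hypothetical stage: emits one characters() callback per word."""

    def characters(self, content):
        for word in content.split():
            super().characters(word)


class Recorder(xml.sax.ContentHandler):
    """Application-side handler that records each text callback it sees."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def run(stages, text):
    """Wrap a parser in the given stages (innermost first) and parse text."""
    reader = xml.sax.make_parser()
    for stage in stages:
        reader = stage(reader)
    recorder = Recorder()
    reader.setContentHandler(recorder)
    reader.parse(io.BytesIO(text.encode("utf-8")))
    return recorder.chunks


doc = "<a>one two three</a>"
print(run([NormalizeText, SplitWords], doc))  # ['one', 'two', 'three']
print(run([SplitWords, NormalizeText], doc))  # ['onetwothree']
```

The same two stages give different results depending on which one sits closer to the application, so a normalization guarantee holds only if the normalizing stage is the last one in the chain.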

From david at megginson.com Thu Mar 11 16:06:16 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:53 2004
Subject: Oedipus XML (was Re: Namespaces and DTDs)
In-Reply-To: <36E7AE27.7DE6@hiwaay.net>
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no> <36E7AE27.7DE6@hiwaay.net>
Message-ID: <14055.59156.593634.998329@localhost.localdomain>

len bullard writes:

> Jarle Stabell wrote:
> >
> > Richard Goerwitz wrote:
> > > (Simplicity _was_ one of XML's primary goals back in the dark ages last
> > > February.)
> >
> > It seems to me that the SGML compatibility requirement killed simplicity.
> > (And gave a very confusing and hard-to-learn vocabulary)
>
> Or its inventors have discovered that assuming the mission of an
> existing mature standard without acknowledging the complexity of
> that mission leads to the same or worse complexity in the
> invention.

XML has introduced some nasty new complexities, but many of those relate to providing proper Unicode support, and SGML would have had to deal with them anyway. (There were, of course, a couple of mistakes that added to the complexity, especially relating to entities and external subsets.)

Speaking as both a parser writer and an application writer, I am comfortable writing that XML is significantly simpler to support in enterprise-level implementations than full SGML, and that I have not actually yet really missed any of the SGML features excluded from XML.

To be fair, I am talking only about the core specs -- I am comparing ISO 8879 to the XML 1.0 REC, and am leaving out the peripheral standards on both sides.
A comparison of HyTime to XLink, XPointer, and Namespaces, of DSSSL to XSL, or of Topic Maps to RDF would be an interesting but separate exercise.

All the best,
David

--
David Megginson david@megginson.com
http://www.megginson.com/

From b.laforge at jxml.com Thu Mar 11 16:37:20 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:53 2004
Subject: Namespaces and DTDs
Message-ID: <002001be6bdd$cf44f560$46026982@thing1.camb.opengroup.org>

From: len bullard
>Darn. Maybe LISP was the right language after all and forty years
>of computer scientists just didn't "get it".

Lisp and XML have a few things in common, like being easy to determine if they are well formed. Frankly, I think XML will be better in the long run because it can be validated against various schema.

Bill

From david at megginson.com Thu Mar 11 16:40:52 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:54 2004
Subject: One more ModSax naming try...
In-Reply-To: <002a01be6bbb$3da914a0$c8a8a8c0@thing1>
References: <002a01be6bbb$3da914a0$c8a8a8c0@thing1>
Message-ID: <14055.61833.969345.509241@localhost.localdomain>

Bill la Forge writes:

> What about open?
> OpenParser/OpenSAX.
> With the new extensions, we are not constrained by the interface--it's quite "open".

Not bad, but we weren't really closed to begin with.

All the best,
David

--
David Megginson david@megginson.com
http://www.megginson.com/

From richard at goon.stg.brown.edu Thu Mar 11 16:41:45 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:09:54 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E7B4B3.999188F5@mitre.org>
Message-ID: <36E7F1DB.608598CC@goon.stg.brown.edu>

"Roger L. Costello" wrote, re elements like "creator" (which may not be defined by a given DTD, but which must occur in a document instance that is using RDF):

> My agent has never seen the property "creator", so it follows the
> namespace URI to the property schema...

Okay, so your agent is reading the document. It runs into an element in another RDF namespace. You want to use that namespace's URI component to read in additional schema information.

Two problems: 1) namespace URIs don't necessarily point to schemas, and 2) if they did, you'd be extending the schema mechanism in a way that's incompatible with DTDs, as they're normally defined and understood.

I don't know if it's possible, from an implementation standpoint, to add the DTD after you've already started parsing the document. And if you could, whether doing so would be reasonable.

Surely this sort of problem has been discussed in the SGML community.
Can someone who has hashed all the details out already perhaps post with some commentary?

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From Bruce.Duffy at westgroup.com Thu Mar 11 16:47:20 1999
From: Bruce.Duffy at westgroup.com (Duffy, Bruce)
Date: Mon Jun 7 17:09:54 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <7BA102761CAED111B27E00805FBB72333FAE4C@arrowhead.int.westgroup.com>

Hi folks,

One feature I'd really like to see is a Locator.getByteOffset() method. Obviously this feature would have to be optional, since not all XML inputs are indexable files. James Clark's non-SAX API for XP implements this method for startElement(), but not for the characters() callback, which unfortunately is exactly what I need it for. I could hack XP or another parser, but I'd much rather work within the context of SAX.

One name for such a feature is:

http://xml.org/sax/features/locator.byteOffsets
(true) means getByteOffset() is supported for startElement, endElement, and character callbacks.
(false) means it is not supported for those callbacks.

Alternatively, if there's some reason why this feature is a Bad Idea, I'd like to know why!

Thanks,
Bruce Duffy
West Group
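
To make the request concrete, here is a minimal sketch of byte offsets reported for character-data callbacks. It does not use SAX or XP; it assumes Python's `xml.parsers.expat` binding, whose parser object exposes a `CurrentByteIndex` attribute during callbacks. The function name `text_byte_offsets` is hypothetical.

```python
import xml.parsers.expat


def text_byte_offsets(data: bytes):
    """Return (byte_offset, text) pairs, one per character-data callback."""
    parser = xml.parsers.expat.ParserCreate()
    found = []

    def on_text(text):
        # During a callback, CurrentByteIndex is the byte position in the
        # input at which the event being reported starts.
        found.append((parser.CurrentByteIndex, text))

    parser.CharacterDataHandler = on_text
    parser.Parse(data, True)  # True marks this as the final chunk of input
    return found


doc = b"<doc><p>hello</p><p>world</p></doc>"
for offset, text in text_byte_offsets(doc):
    # Slicing the original bytes at the reported offset recovers the text.
    print(offset, text, doc[offset:offset + len(text.encode("utf-8"))])
```

As the message notes, offsets like these are only meaningful when the input is an indexable byte sequence, which is one reason such a feature would have to be optional.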

From creitzel at mediaone.net Thu Mar 11 16:58:56 1999
From: creitzel at mediaone.net (Charles Reitzel)
Date: Mon Jun 7 17:09:54 2004
Subject: Namespaces and DTDs
Message-ID: <199903111654.LAA04302@chmls06.mediaone.net>

Ah, my favorite thing to hate about XML.

Seriously, though. I have yet to hear of a single real application that needs element level prefix declarations. Not one! The PI was just fine for 99.99% of applications. The 0.01% should simply not use XML (or may need an additional layer, such as AF or a schema processor). Element-declared namespaces are a solution in search of a problem. Unfortunately, namespaces have effectively killed DTD validation.

My wish list for namespaces is as follows:

1) The prefix should be set by document author, *not* the DTD author.
2) The FPI should be set by the DTD author.
3) Prefixes should have document scope.
4) Namespaces should be part of XML proper and *not* an add on.
5) Element names should be resolved in the namespace of the nearest ancestor.

Until most of these conditions are met, I predict the demise of DTDs. It may be too late already...

Best regards,
Charles Reitzel

>From: james anderson
>Date: Wed, 10 Mar 1999 11:49:51 +0100
>Subject: Re: Namespaces and DTDs
>
>That "REC-xml-names-19990114" does not provide any means to establish
>prefix<->uri bindings for a DTD has long been a point of contention. A cursory
>search of the archives will bear this out.
>The decision to eliminate the
>combined prefix/uri/dtd binding (the original pi form) was, however, correct,
>as the pi form, at least as proposed in "WD-xml-names-19980327", would not
>have been sufficient to handle such things as a dtd which needs multiple
>prefix bindings or the situation where a given prefix<->uri binding is to
>apply to multiple schema sources.
>
>While it is true that some mechanism is necessary, a form - as discussed below
>- which effected a singular binding would also not have solved the problem.
>"Everyone" would seem to be waiting for "schemas"....
>
>Marc.McDonald@Design-Intelligence.com wrote:
>>
>> A simple extension to namespaces could have fixed this problem:
>> 1. Allow a DTD to be optionally specified along with the namespace
>> prefix and URI
>> 2. When an element is prefixed, parse it using the DTD associated with
>> the namespace and the given prefix as the default.
>> 3. If no DTD is associated with the prefix or not validating, do what
>> is done now (ensure element is well-formed).

From jborden at mediaone.net Thu Mar 11 17:16:18 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:09:54 2004
Subject: Namespaces and DTDs
Message-ID: <00dc01be6be1$c4587e20$0b2e249b@fileroom.Synapse>

Bill la Forge wrote:

>From: len bullard
>>Darn. Maybe LISP was the right language after all and forty years
>>of computer scientists just didn't "get it".
>
>
>Lisp and XML have a few things in common, like being easy to
>determine if they are well formed.
>Frankly, I think XML will be
>better in the long run because it can be validated against various
>schema.
>

LISP defines a serialization format for lists and atoms (s-expressions) which employs '(' and ')' in an analogous fashion to XML being a serialization format for trees. LISP also defines a set of rules by which lists are eval'd as functions with arguments. Aside from syntactic issues, '<' and '>' could be used as s-expression delimiters without significant change to the LISP interpreter (aside from the parsing routine).

In order to properly compare LISP with XML, then, we would need to propose a set of rules whereby *x-expressions* were evaluated. The closest we have today is XSL which is not currently a fair comparison to LISP (e.g. try writing a compiler or word processor in XSL :-))

Jonathan Borden
http://jabr.ne.mediaone.net

From rbourret at ito.tu-darmstadt.de Thu Mar 11 18:03:04 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:09:54 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <01BE6BF1.B0427BB0@grappa.ito.tu-darmstadt.de>

Oren Ben-Kiki wrote:

> >* What are the interfaces between components and how hard are they to
> >implement?
>
> Basically the SAX callbacks, probably extended so that the full document
> data is available (comments and so on). This seems pretty much a done deal.

and also wrote:

> >* Who assembles the components -- the application, the processor, or a
> >third party?
>
> What I'm suggesting is we currently answer "for now, the application", and
> provide a simple, lightweight, low-level API which allows it to do so. More
> complex solutions could evolve later on. This seems to be in the SAX spirit.

If the application assembles the components and the interface between them is SAX, what do we need that SAX filters don't already give us? In other words, does anything need to be done to OpenSAX (best name so far) to support this besides adding the ParserFilter interface?

The other question that occurs to me is how useful/common it is to dynamically assemble a processor at run time. That is, are there really applications (outside of test environments) that allow the user to designate their parser at run time (or even installation time) and therefore need to cover any possible deficiencies in the chosen parser? What is gained by allowing the user to choose the parser?

Note that this is a very different situation from, say, using different ODBC drivers. In the case of ODBC drivers, you are choosing a different source of data (type of database) and application writers have a strong incentive to support multiple databases through ODBC. In the case of XML, the source of data is always the same XML document and the choice of parser becomes a trade-off between speed, reliability, feature-set, etc. Since the application writer knows the feature set ahead of time, why not just hard-code the required parser and SAX filters and be done with it?

(Yes, I know that "hard-code" is a bad word and I shudder as I write it, but I really am curious if anybody out there has a real-world application that allows users to change parsers and what the benefits of this are besides the ability to say, "Oh, look. I'm using a different parser.")

In this view, the utility of SAX is not the ability to change parsers at run time, but to change them over time as reliability, speed, size, etc. of the parsers change.
It also means that application writers can learn a single interface (SAX) and then choose parsers as they are appropriate to the application without having to learn different interfaces for different parsers.

The ability to request features in OpenSAX allows the application to request processor behavior, which is slightly different from assembling a suitable parser. For example, if I have an application that doesn't need validation, but the parser I want to use does validation by default, I would like to be able to turn that off.

Just to be clear, I'm not necessarily against assembling processors based on a feature set. I just believe that it is far more complex than it appears at first glance and am not convinced that it's worth the trouble.

-- Ron Bourret

From b.laforge at jxml.com Thu Mar 11 18:03:35 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:54 2004
Subject: Opening SAX for better filter support
Message-ID: <007701be6be9$f9bfc840$46026982@thing1.camb.opengroup.org>

A fixed API has lots of advantages in terms of service/user. Each can be implemented to the API without being bound to the other. And if you do need a non-standard feature, you isolate the code that has such a dependency. Overall, a very manageable situation unless you move too far out of scope of the API.

Introduce middleware and everything changes. Now you want an open API that permits unanticipated interactions between the service/user without needing to completely bypass the middleware.
With the advent of SAX filters, we have now moved to having a need for a more open API, and David's proposal seems to fit that need precisely.

Consider a complex of stacked and nested filters wrapping a parser. This composition is something which might be best done separately from the application itself, but the application may still need to access various parts. Indeed, a good design would keep as much of the application as possible independent of any particular structure, as the structure may need to change if we change parsers or introduce more appropriate filters.

Think of this complex of parser and filters as some kind of aggregate that is best treated as a gray box by the application--the application may need to identify and interact with various parts of the aggregate, but doesn't know the overall structure.

The new get and set methods are exactly what we need. We can present a named object to the aggregate and, by routing the request through the aggregate, the component which knows what to do with that object can process it. Conversely, we can request a reference to a component or result by name and the appropriate component is able to respond.

Now while none of this may be terribly efficient, it doesn't need to be -- these are calls that are made for configuration or to access results. So it should work and work beautifully.

Bill
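
The routing of named get requests through a stack of filters might look roughly like the following. This is a sketch using Python's `xml.sax` filter base class rather than the Java interfaces under discussion; the property URI and the class names `ElementCounter` and `PassThrough` are hypothetical, invented for illustration.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase

# Hypothetical property name, made up for this sketch.
ELEMENT_COUNT = "http://example.org/sax/properties/element-count"


class ElementCounter(XMLFilterBase):
    """Filter that answers one named property and forwards all others."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1
        super().startElement(name, attrs)

    def getProperty(self, name):
        if name == ELEMENT_COUNT:
            return self.count
        return super().getProperty(name)  # not ours: route the request inward


class PassThrough(XMLFilterBase):
    """Filter that knows nothing about the property and only forwards."""


def build_aggregate():
    # The application treats the result as a gray box; it never needs to
    # know how the filters are nested around the parser.
    return PassThrough(ElementCounter(xml.sax.make_parser()))


reader = build_aggregate()
reader.parse(io.BytesIO(b"<a><b/><b/></a>"))
print(reader.getProperty(ELEMENT_COUNT))  # 3
```

The application asks the outermost object for a result by name, and the request travels inward until a component recognizes it. A name that no component recognizes reaches the parser, which rejects it with `SAXNotRecognizedException`.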

From b.laforge at jxml.com Thu Mar 11 18:29:48 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:54 2004
Subject: Namespaces and DTDs
Message-ID: <00c501be6bed$96b33260$46026982@thing1.camb.opengroup.org>

From: Jonathan Borden
> The closest we have today is XSL which is not currently a fair
>comparison to LISP (e.g. try writing a compiler or word processor in XSL
>:-))

I like to use XML to do compositions of components, which encompasses the declarative rather than the procedural aspects of programming.

What I like is that a schema can then validate a composition, allowing clients to send a composition to a server to construct an agent, but without the security problems that you would otherwise have.

Bill

From cowan at locke.ccil.org Thu Mar 11 18:36:24 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:54 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E7B4B3.999188F5@mitre.org>
Message-ID: <36E80E9C.FC04B18@locke.ccil.org>

Roger L. Costello wrote:

> Okay, here's where the rub comes.
> Let me give you a couple of quotes
> from the RDF spec (the *'s I have put in and are my way of emphasizing
> the words that I wish for you to really focus on): "Property names
> *must* be associated with a schema. This can be done by qualifying the
> element names with a namespace prefix to unambiguously *connect* the
> property definition with the corresponding RDF schema ..."

Watch the modal verbs! Property names *must* be associated with a schema, but this can (i.e. *may*) be done by making the URI to which the namespace prefix is bound the actual URI of the schema document. There may be other ways to do it.

Besides, RDF is free to set tighter limits on the URIs used to identify namespaces than XML in general. XML-based standards can always set extra requirements, like the SMIL requirement (clause 5.1) that there be no internal DTD subset in SMIL documents.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Thu Mar 11 18:53:59 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:54 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E7B4B3.999188F5@mitre.org> <36E7F1DB.608598CC@goon.stg.brown.edu>
Message-ID: <36E812D1.CE73AEE7@locke.ccil.org>

Richard L. Goerwitz wrote:

> Okay, so your agent is reading the document. It runs into an element
> in another RDF namespace.
> You want to use that namespace's URI component
> to read in additional schema information.
>
> Two problems: 1) namespace URIs don't necessarily point to schemas, and
> 2) if they did, you'd be extending the schema mechanism in a way that's
> incompatible with DTDs, as they're normally defined and understood.

RDF namespace declarations *may* (and even perhaps should) point to RDF schemas, which are not XML schemas at all. They declare RDF classes and properties, not XML elements and attributes. Both RDF statements and RDF schemas are normally represented in XML, but other representations (graphical) also exist.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From david at megginson.com Thu Mar 11 19:27:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:54 2004
Subject: Fw: ModSAX: Proposed Core Features
In-Reply-To: <02ee01be6bd6$a66a24a0$5402a8c0@oren.capella.co.il>
References: <02ee01be6bd6$a66a24a0$5402a8c0@oren.capella.co.il>
Message-ID: <14056.6147.757421.124783@localhost.localdomain>

Oren Ben-Kiki writes:

> I'd like to see "http://xml.org/sax/features/xsl-transformation" as
> well. Anyway, all of the above seem to fall nicely into the
> pipeline framework.

How about "http://capella.co.il/~oren/sax/features/xsl-transformation" (or whatever is suitable for your web rights)?
All the best,
David

--
David Megginson david@megginson.com
http://www.megginson.com/

From martind at netfolder.com Thu Mar 11 19:28:45 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:09:54 2004
Subject: FW: Namespaces and DTDs
Message-ID:

Hi

I am using these simple rules of thumb:

a) an XML DTD is useful for XML editors, not for XML renderers
b) most XML renderers (XSL, CSS or DSSSL) won't do document validation
c) an XML interpreter (something other than rendition) does not need a DTD

If I need a DTD at the receiving end, then I am no longer in the XML world but in the SGML world, because the receiving end needs a validating parser. Several SGML parsers, SP for instance, can parse XML's simplified DTDs. The only simplification I gained is the -- or -O thing called omittag. Therefore, because I have to include a DTD for validation, it is better then to use an SGML format.

However, on the Web, to reduce complexity, I should not assume that the receiving end has a validating parser. Thus, because my XML document has been validated with my XML editor or by any other validation program, the receiving end makes the reasonable assumption that if the document is an XML document it is "well formed" and valid. It's a lot simpler that way.

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Mar 11 19:31:36 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:54 2004 Subject: Namespaces and DTDs In-Reply-To: <199903111654.LAA04302@chmls06.mediaone.net> References: <199903111654.LAA04302@chmls06.mediaone.net> Message-ID: <14056.6319.57210.877490@localhost.localdomain> Charles Reitzel writes: > Seriously, though. I have yet to hear of a single real application > that needs element level prefix declarations. Not one! I'll paraphrase the use case as follows (I'll leave the source anonymous): A server wants to construct a large XML document as the response to a client request, and it does so by handing off the work to several parallel processes and then concatenating the results into a single document. If each of the processes can declare its own namespaces, then it is not necessary to establish complicated negotiation channels between the top-level process and the child processes to obtain the correct namespace declarations. Before everyone rushes out to shoot holes in this use case, I'd like to note that I still have callouses on my trigger finger from doing so myself. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From martind at netfolder.com Thu Mar 11 19:57:40 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:09:54 2004 Subject: Namespaces and DTDs In-Reply-To: <002001be6bdd$cf44f560$46026982@thing1.camb.opengroup.org> Message-ID: HI Bill, Lisp and XML have a few things in common, like being easy to determine if they are well formed. Frankly, I think XML will be better in the long run because it can be validated against various schema. I am not sure of that. a) a Lisp document could be made SGML compliant because SGML can let you define begin and end tag's delimiters (Ex: dsssl). b) if the previous proposition is true, then you can also change the delimiters and keep the structural coherency. c) You could also enforce that a begin and end tag conform to the well formed constraint. d) a XML document is a hierarchy and a hisrarchy could be mapped with list constructs. In fact, as soon as you map lisp to SGML and then to XML, you notice immediately the similarities. There is formal transformation possible from one structure to the other. In mathematical term would coud talk of "topological" transformation from one to the other. Their structure are similar enough to transform one into the other. Conclusion: we should not take what Jonathan said so lightly and do some homework fisrt. This said, I agree that XML could potentially be more succesful than lisp or SGML or (fill here less than popular good ideas) but this is for other reasons than technical reasons. For instance, this could be very popular because the web is popular and XML benefit form the aura effect. 
Also because important software manufacturers are behind it and put
compliant products on the market. Also because people don't want to miss
the next big Web success, etc. This has nothing to do with technical
virtues but more with marketing virtues. But surely not because XML is
better than Lisp because it can be validated against different schemas.

a) XML has the advantage, because of its strict syntax (compared to SGML
OMITTAG), that a receiver does not need to validate the structure to
interpret the XML document. In fact, there is a high probability that
interpreters will "hard code" in some way what to do for each element,
without the need for a DTD (except for style languages, which will "hard
code" tree manipulation and the formatting object model).
b) If a DTD is necessary, why not use SGML, except for a marketing
advantage?
c) Another usage of XML is to separate the content from the rendition.
In this case, most browsers' style engines won't contain a validating
parser, and therefore the validation mechanism is irrelevant.

Conclusion: XML will be better simply because it has marketing momentum,
not because of its technical merits, period. The whole difference
between SGML and XML is that the receiver does not necessarily need
validation to interpret the document (because of the "well formed"
constraint). But from the marketing point of view it has a huge
advantage. New domain languages could be created, and big software
manufacturers could regain some control by creating a domain language
and letting the numbers create a de facto standard. In fact, HTML, by
being a standard domain language, is more of a threat to big
manufacturers than XML is. So, if XML is to be more popular, this is
surely for marketing reasons :-)

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

From martind at netfolder.com Thu Mar 11 20:27:45 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:09:55 2004
Subject: RDF not conforming to the Namespace spec?
In-Reply-To: <36E7F1DB.608598CC@goon.stg.brown.edu>
Message-ID:

Hi,

> Okay, so your agent is reading the document. It runs into an element
> in another RDF namespace. You want to use that namespace's URI component
> to read in additional schema information.
>
> Two problems: 1) namespace URIs don't necessarily point to schemas, and
> 2) if they did, you'd be extending the schema mechanism in a way that's
> incompatible with DTDs, as they're normally defined and understood.
>
> I don't know if it's possible, from an implementation standpoint, to add
> the DTD after you've already started parsing the document. And if you
> could, whether doing so would be reasonable. Surely this sort of problem
> has been discussed in the SGML community. Can someone who has hashed all
> the details out already perhaps post with some commentary?

You're right on this. For RDF, the validation mechanism is not on the
namespace. From the receiver's point of view, the namespace mechanism is
to be seen as a way to prevent name collisions in the same document
space (including documents linked to the document). A simple parser
could then process the complete markup name as a whole word (e.g.
MySameSpace:MymarkupName). It could occur, however, that two namespaces
collide (i.e. two namespaces have the same namespace ID and the same
markup name). In this case, the parser need not take any chances: it can
replace the namespace ID by the URI (if the URI is unique) and be sure
that the element name is now unique (i.e. :MyMarkupName). The whole
point is to be sure that we do not have name collisions in the document
namespace (I mean here the document's complete set of names).

For RDF, the property list is defined by a schema. RDF is like directory
service schemas: you have to define a record or property set with a
schema. You also define entity relationships with the schema. The parser
does not have to use a DTD as a validation mechanism, just the trick of
replacing the namespace ID by the URI if we want to reduce name
collisions to near-zero probability. However, this is not a validation
mechanism; it is a namespace collision resolution mechanism like those
used in languages such as C++. (Practically, you replace the namespace
ID by the URI to create a unique element name, no more, no less:
MyNSID:MyElementName becomes http://www.netfolder.com/:MyElementName.
There is now a very low probability that a linked document would contain
the same named element.) This is for the parsing side.

Now for the interpretation side: an RDF interpreter (that uses an XML
parser) has to know the object's property set to do something with it.
This something could be to build a "frame" for this object. A frame, to
recall, is like a record. This frame could be strongly typed by a schema
that says what the frame is allowed to contain and what relationships it
has with other frames. A schema is not a DTD, because the validation is
not at the syntax level but at the interpretation level.

Let's take an example: we want to import data into a directory service
and, to do so, we use RDF. To be sure that the XML parser won't have any
name collisions we could use namespaces; otherwise, if the document
namespace is controlled, the usage of namespaces is superfluous.
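The substitution of the namespace ID by its URI can be sketched in a few lines. This is only an illustration and not part of the original exchange: the two documents, the prefix `p`, and the example.org URIs are invented, and Python's standard `xml.etree.ElementTree` is used simply because it happens to report every element name in expanded `{URI}local-name` form.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that reuse the same prefix "p" and the same
# local names, but bind the prefix to different namespace URIs.
PEOPLE = (
    '<p:record xmlns:p="http://example.org/people">'
    '<p:name>Albert Einstein</p:name>'
    '</p:record>'
)
PRINTERS = (
    '<p:record xmlns:p="http://example.org/printers">'
    '<p:name>Third floor laser</p:name>'
    '</p:record>'
)


def expanded_names(xml_text):
    """Return each element name with its prefix replaced by the URI."""
    root = ET.fromstring(xml_text)
    # ElementTree stores tags as "{namespace-uri}local-name".
    return [element.tag for element in root.iter()]


people_names = expanded_names(PEOPLE)
printer_names = expanded_names(PRINTERS)

print(people_names)
print(printer_names)
# The prefixed names (p:record, p:name) are identical in both documents,
# but the expanded names have nothing in common.
print(set(people_names).isdisjoint(printer_names))
```

Note that nothing here validates anything: the parser only needs the declarations in the document itself to make the names unique, which is the distinction drawn above between collision resolution and validation.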
Thus, let us have a directory record for a user on a network.

Albert Einstein etc....

The XML parser has enough to do its job, but the RDF interpreter now
needs to know the "frame" schema or object category constraints. Thus,
the RDF interpreter can ask the XML parser to parse the XML-based schema
document to learn the "frame" constraints. After the parsing is done, it
can compare each frame property with the schema to know whether the
"frame" is valid or not. It could also add a new schema to the directory
service if the object category is new to the directory.

Conclusion: the schema stuff is useful for the interpreter, not the
syntax parser, which in this case is the XML parser. We have to keep in
mind that XML is for the syntax, and other mechanisms may have to be
provided to the syntax parser's client: the interpreter. An RDF
interpreter then uses the XML parser to convert a) the RDF document and
b) the schemas into a structure that can be manipulated, and then
"interprets" what to do, for instance importing data into a directory
service.

An XML document is like a sleeping beauty without an interpreter :-)

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

From richard at goon.stg.brown.edu Thu Mar 11 20:33:41 1999
From: richard at goon.stg.brown.edu (Richard L.
Goerwitz)
Date: Mon Jun 7 17:09:55 2004
Subject: Namespaces and DTDs
References: <199903111654.LAA04302@chmls06.mediaone.net> <14056.6319.57210.877490@localhost.localdomain>
Message-ID: <36E8285F.DEB5CE9@goon.stg.brown.edu>

David Megginson wrote, re why namespaces are needed:

> I'll paraphrase the use case as follows (I'll leave the source
> anonymous):
>
> A server wants to construct a large XML document as the response to
> a client request, and it does so by handing off the work to several
> parallel processes and then concatenating the results into a single
> document. If each of the processes can declare its own namespaces,
> then it is not necessary to establish complicated negotiation
> channels between the top-level process and the child processes to
> obtain the correct namespace declarations.

Maybe it's just me, but this sort of statement would have more
credibility if there were more evidence of widespread practical
application of this technique. Most successful standards are based, in
large part, on experience and wisdom people gain from actually doing a
thing. A lot.

Am I missing something here?

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From crism at oreilly.com Thu Mar 11 20:34:30 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun 7 17:09:55 2004
Subject: Namespaces and DTDs
In-Reply-To:
Message-ID: <199903112033.PAA24971@ruby.ora.com>

[Didier PH Martin]
> a) a Lisp document could be made SGML compliant because SGML can let
> you define begin and end tag's delimiters (Ex: dsssl).

I think there's a little confusion here about DSSSL. DSSSL
stylesheets are SGML documents, but they usually use angle-brackets:

(default (make sequence))

The parentheses are only character data.

I don't think that Lisp could be made SGML compliant; the delimiters
could be redefined, but as Steve DeRose notes in _The SGML FAQ Book_,
there are some limits to the flexibility of the redefinitions, since
some delimiter roles are overloaded. Also, Lisp doesn't have the
equivalent of start-tag close, and you can only omit tagc if the next
character is stago or etago (ISO 8879:1986, clause 7.4.1.2), which it
wouldn't be when you get to the leaves of a structure.

-Chris
--
http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>

From jtauber at jtauber.com Thu Mar 11 22:03:18 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:55 2004
Subject: Java DOM Parsers
Message-ID: <008901be6c09$465da400$0300000a@othniel.cygnus.uwa.edu.au>

>What companies supply java DOM API's and other xml api tools? Any
>suggestions on which to go with?

I'll leave others to suggest which to go with. But for a list, see:

http://www.xmlsoftware.com/utilities/
http://www.xmlsoftware.com/parsers/

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia
Full-day XML Tutorial @ WWW8 : http://www8.org/
Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net

From Marc.McDonald at Design-Intelligence.com Thu Mar 11 22:20:05 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun 7 17:09:55 2004
Subject: Namespaces and DTDs
Message-ID:

So make a namespace declaration a PI and add a "not using this
namespace anymore" PI. Then use simple occurrence scoping:

Process result:

....
The process gets to define prefixes that override any previous
definition; the old definition (if any) is restored by XMLENDNS. No
problem with concatenation.

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com

----------
From: David Megginson [SMTP:david@megginson.com]
Sent: Thursday, March 11, 1999 11:30 AM
To: XML Developers' List
Subject: Re: Namespaces and DTDs

Charles Reitzel writes:

> Seriously, though. I have yet to hear of a single real application
> that needs element level prefix declarations. Not one!

I'll paraphrase the use case as follows (I'll leave the source
anonymous):

A server wants to construct a large XML document as the response to
a client request, and it does so by handing off the work to several
parallel processes and then concatenating the results into a single
document. If each of the processes can declare its own namespaces,
then it is not necessary to establish complicated negotiation
channels between the top-level process and the child processes to
obtain the correct namespace declarations.

Before everyone rushes out to shoot holes in this use case, I'd like
to note that I still have callouses on my trigger finger from doing so
myself.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From jborden at mediaone.net Fri Mar 12 00:44:34 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:09:55 2004
Subject: DocumentHandler with xml4j DOMParser
In-Reply-To: <3601b76a.110299@smtpgate1.ONE2ONE.CO.UK>
Message-ID: <000301be6c20$a6e177e0$d3228018@jabr.ne.mediaone.net>

Perhaps the confusion is this: a DocumentHandler is a SAX concept, not
a DOM concept. The DOMParser contains a DocumentHandler that builds a
DOM tree from the source document. If you are working with the DOM,
then you will parse the document and then access its members through
the DOM interfaces. If you would rather process using an event-based
interface, then use SAX directly, i.e. the SAXParser.

Jonathan Borden
http://jabr.ne.mediaone.net

From andrew at squiz.co.nz Fri Mar 12 02:42:04 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun 7 17:09:55 2004
Subject: Namespaces and DTDs
In-Reply-To: Your message of "Thu, 11 Mar 1999 14:19:06 -0800."
Message-ID: <199903120241.PAA10692@aniwa.sky>

Documents resulting from queries run on the concatenated document would
tend to cause problems, as query results don't generally return the
context of the XML elements returned.
This problem also applies to queries run across multiple documents
unless their DTDs are identical, which perhaps suggests that an answer
to this problem has to come from the query languages.

Andrew McNaughton

Marc.McDonald@Design-Intelligence.com wrote:
> So make a namespace declaration a PI and add an "not using this
> namespace anymore" PI. Then use simple occurrence scoping:
>
> Process result:
>
> ....
>
>
> Process gets to define the prefixes that override any previous
> definition, old definition (if any) restored by XMLENDNS. No problem
> with concatenation.
>
>
> Marc B McDonald
> Principal Software Scientist
> Design Intelligence, Inc
> www.design-intelligence.com
>
>
> ----------
> From: David Megginson [SMTP:david@megginson.com]
> Sent: Thursday, March 11, 1999 11:30 AM
> To: XML Developers' List
> Subject: Re: Namespaces and DTDs
>
> Charles Reitzel writes:
>
> > Seriously, though. I have yet to hear of a single real application
> > that needs element level prefix declarations. Not one!
>
> I'll paraphrase the use case as follows (I'll leave the source
> anonymous):
>
> A server wants to construct a large XML document as the response to
> a client request, and it does so by handing off the work to several
> parallel processes and then concatenating the results into a single
> document. If each of the processes can declare its own namespaces,
> then it is not necessary to establish complicated negotiation
> channels between the top-level process and the child processes to
> obtain the correct namespace declarations.
>
> Before everyone rushes out to shoot holes in this use case, I'd like
> to note that I still have callouses on my trigger finger from doing
> so myself.
>
>
> All the best,
>
>
> David
>
> --
> David Megginson david@megginson.com
> http://www.megginson.com/

--
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/

From cbullard at hiwaay.net Fri Mar 12 05:13:49 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:55 2004
Subject: Oedipus XML (TIe Your Mother Down)
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no> <36E7AE27.7DE6@hiwaay.net> <14055.59156.593634.998329@localhost.localdomain>
Message-ID: <36E8A1E8.1B@hiwaay.net>

David Megginson wrote:
>
> Speaking as both a parser writer and an application writer, I am
> comfortable writing that XML is significantly simpler to support in
> enterprise-level implementations than full SGML, and that I have not
> actually yet really missed any of the SGML features excluded from
> XML.

I agree with this.
As an application writer who only has to parse parts of it, and then
in the context of using a relational system with XML editing, it looks
very much the same to me as the simple features of SGML that I've
always used. In effect, much of the nastier bits of SGML I did not use
before. So, it looks much the same. It is a lot of fun to tie the
treeviews, browser objects, tables, dialogs, combo boxes, etc. together
into a generalized knowledge management system. Cheap too. ;-)

> To be fair, I am talking only about the core specs -- I am comparing
> ISO 8879 to the XML 1.0 REC, and am leaving out the peripheral
> standards on both sides. A comparison of HyTime to XLink, XPointer,
> and Namespaces, of DSSSL to XSL, or of Topic Maps to RDF would be an
> interesting but separate exercise.

Here I don't disagree, but in my work, the concepts of HyTime, DSSSL,
the RDF Dublin Core, and namespaces influence my work. Learning to
think beyond the DTD to the information properties of the metalanguage
proves to be very useful, and that is not something I did before.

As activities like X3D ramp up, I find I am applying more and more of
the wall-to-wall markup concepts from the middle years of SGML, and they
work in the XML infrastructure of tools. This is actually quite
delightful.

len

From cbullard at hiwaay.net Fri Mar 12 05:19:29 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:55 2004
Subject: Namespaces and DTDs
References: <002001be6bdd$cf44f560$46026982@thing1.camb.opengroup.org>
Message-ID: <36E8A33A.43BB@hiwaay.net>

Bill la Forge wrote:
>
> From: len bullard
> >Darn. Maybe LISP was the right language after all and forty years
> >of computer scientists just didn't "get it".
>
> Lisp and XML have a few things in common, like being easy to
> determine if they are well formed. Frankly, I think XML will be
> better in the long run because it can be validated against various
> schema.

As much as I resisted it in the early working groups for various
reasons, I find myself agreeing with the position that it is good to
have formal definitions for both well-formed and validated information.
I had worked in that mode in the IDE/AS, IADS and GE systems, but the
notion wasn't formally expressed.

I like ISO 8879 DTDs mainly because they are, for me, much easier to
read and to parse in my head. As I implement more with relational
systems and use the tables to store the property sets of both schemata,
properties of schemata as well as instances, I think I have more
insight now into why people want multiple schema types, even without
the obvious extensions such as inheritance.

nodes is nodes is nodes.

len

From cbullard at hiwaay.net Fri Mar 12 05:26:40 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:55 2004
Subject: FW: Namespaces and DTDs
References:
Message-ID: <36E8A4EB.375F@hiwaay.net>

Didier PH Martin wrote:
>
> Hi
>
> I am using these simple rule of thumb:
>
> a) a XML DTD is useful for XML editors not for XML renderers
> b) Most XML renderers (XSL, CSS or DSSSL won't do document validation)
> c) a XML interpreter do not need a DTD (something else than rendition)
>
> If I need a DTD at the receiving end, then I am now no longer in the XML
> world but in the SGML world because the receiving end needs a validating
> parser. Several SGML parser like for instance SP can parse XML simplified
> DTD. The only simplification I gained is the "- -" or "- O" thing called
> OMITTAG. Therefore, because I have to include a DTD for validation,
> better use then a SGML format.
>
> However, on the Web, to reduce complexity, I should not assume that the
> receiving end has a validating parser. Thus, because my XML document has
> been validated with my XML editor or by any other validation program. The
> receiving end makes the reasonable assumption that if the document is a
> XML document it is "well formed" and valid.

That's mostly true because web documents don't stick around. In cases
where information is moving across multiple processes or sits in some
long-term archive, it is very handy to be able to validate it on the
receiving end.
This will become more apparent to the XML community when they get to do
the sort of work the SGML community did a decade after the first SGML
applications fielded instances. Things change. Finding those changes
quickly is the key to cheap rehosting. In my experience, if DTDs die,
someone gets to reinvent them and it will be painful.

Otherwise, yes, the DTD is much more useful in the editor in the
initial part of the information lifecycle.

len

From rbourret at ito.tu-darmstadt.de Fri Mar 12 08:29:31 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:09:55 2004
Subject: Namespaces and DTDs
Message-ID: <01BE6C6A.B2052F50@grappa.ito.tu-darmstadt.de>

Didier PH Martin wrote:

> a) a XML DTD is useful for XML editors not for XML renderers
> b) Most XML renderers (XSL, CSS or DSSSL won't do document validation)
> c) a XML interpreter do not need a DTD (something else than rendition)

(c) is not always true because DTDs are used for more things than just
validation. For example, DTDs are used to define internal general
entities, attribute defaults, and attribute types. (The latter is
important, for example, if a processor expects to build links based on
ID/IDREF attributes or process according to notations.)

-- Ron Bourret

From lucio.piccoli at one2one.co.uk Fri Mar 12 08:44:10 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun 7 17:09:55 2004
Subject: XML DATA
Message-ID: <3601bf97.120299@smtpgate1.ONE2ONE.CO.UK>

Hi all,

I would like to know the status of the XML Data proposal and its
take-up in the XML community. Currently the only XML Data parser that I
have found is from MS. Does anyone else plan on supporting XML Data in
the future? If not, what is the alternative to XML Data?

adios
-lucio

---------------------------------------------------------------------
One2One                 LUCIO.PICCOLI@one2one.co.uk
Elstree Tower           tel : +44 181 214 3847
Elstree Way Borehamwood fax : +44 181 214 2325
LONDON WD6 1DT
__________ http://www.one2one.co.uk _____________

From rbourret at ito.tu-darmstadt.de Fri Mar 12 09:16:47 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:09:55 2004
Subject: XML DATA
Message-ID: <01BE6C71.4DE32A70@grappa.ito.tu-darmstadt.de>

LUCIO PICOLLI wrote:

> I would like to know the status of XML Data proposal and it's take up in
> the XML community. Currently the only XML data parser that i found is
> from MS. Does anyone else plan on supporting XML Data in the future?
> If not, what is the alternative to XML DATA ? Outside of the Microsoft parser, XML Data is probably dead. There are three other schema proposals (SOX, DCD, and DDML) and the W3C is currently working on their own. XML Data is significant in that it seems to be the only schema language that is publicly supported by a parser. You can find the various schema language specs at: SOX: http://www.w3.org/TR/NOTE-SOX/ DCD: http://www.w3.org/TR/NOTE-dcd DDML: http://www.w3.org/TR/NOTE-ddml XML-Data: http://www.w3.org/TR/1998/NOTE-XML-data/ W3C XML Schema requirements: http://www.w3.org/TR/NOTE-xml-schema-req and an overview of the various schema languages at: http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/ index.htm OR http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/ XMLSchemas.ppt -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Fri Mar 12 09:19:58 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:09:55 2004 Subject: Fw: ModSAX: Proposed Core Features Message-ID: <00f501be6c68$bd213020$5402a8c0@oren.capella.co.il> Ronald Bourret wrote: >If the application assembles the components and the interface between them >is SAX, what do we need that SAX filters don't already give us? In other >words, does anything need to be done to OpenSAX (best name so far) to >support this besides adding the ParserFilter interface? Yes. One needs to _locate_ the necessary filters. Hence the registry, the query-for-a-feature, etc. 
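
A minimal sketch of the kind of registry meant here, in Java. Everything in it is an assumption made for illustration: the class name FilterRegistry and the methods register and locate are hypothetical and belong to neither SAX nor any ModSAX draft, and "com.example.XslFilter" is a made-up class name. Only "org.xml.sax.parser" (a feature name used later in this message) and com.microstar.xml.SAXDriver (the AElfred driver) come from the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry: maps a feature name to the class names of the
// components that claim to provide it. Not part of SAX or ModSAX.
class FilterRegistry {
    private final Map<String, List<String>> providers = new HashMap<>();

    // Record that the named class provides the given feature.
    public void register(String feature, String className) {
        providers.computeIfAbsent(feature, k -> new ArrayList<>()).add(className);
    }

    // Return the first class registered for the feature, or null if none is.
    public String locate(String feature) {
        List<String> names = providers.get(feature);
        return (names == null || names.isEmpty()) ? null : names.get(0);
    }

    public static void main(String[] args) {
        FilterRegistry registry = new FilterRegistry();
        registry.register("org.xml.sax.parser", "com.microstar.xml.SAXDriver");
        registry.register("xsl-transformation", "com.example.XslFilter"); // made-up names

        // The application asks for a feature instead of naming a class.
        System.out.println(registry.locate("org.xml.sax.parser"));
        System.out.println(registry.locate("org.xml.sax.visitor")); // nothing registered
    }
}
```

The point of the sketch is only that the application names the feature it needs and the platform decides which class answers; how the registry gets populated (properties file, installer, built-in defaults) is left open.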
>The other question that occurs to me is how useful/common it is to
>dynamically assemble a processor at run time. That is, are there really
>applications (outside of test environments) that allow the user to
>designate their parser at run time (or even installation time) and
>therefore need to cover any possible deficiencies in the chosen parser?
> What is gained by allowing the user to choose the parser?

If there aren't, why bother with ModSAX at all? If I know exactly which class is used, I also know exactly which features it provides, right? The whole point of ModSAX is that this isn't the case.

Think of it like this: XML support is not the same on all platforms. Sometimes there's a built-in SAX parser. It may or may not support some features. Sometimes there's an XSL processor. And so on. I'm talking about platforms existing today, or "real soon now" - IE5, server packages, etc. I want to write code which is _reasonably_ portable to such platforms.

I accept the remark that a full-scale solution is beyond the scope of ModSAX. What I suggested is an interface in the spirit of SAX (I hope) - lightweight, simple, low-level, which allows future layering of higher-level solutions.

>Note that this is a very different situation from, say, using different
>ODBC drivers. In the case of ODBC drivers, you are choosing a different
>source of data (type of database) and application writers have a strong
>incentive to support multiple databases through ODBC. In the case of XML,
>the source of data is always the same XML document and the choice of parser
>becomes a trade-off between speed, reliability, feature-set, etc.

On the contrary, I see it as being very similar to using ODBC drivers. ODBC drivers vary in their capabilities, and therefore have a mechanism for querying for particular features. So do XML components. There might be any number of ODBC drivers available in a particular system. Same for XML components.
And you typically have a pretty good idea of which ODBC driver you are going to use. Same for XML components. The last point doesn't invalidate the first two. BTW, have you ever tried to write a non trivial program which would work with any ODBC driver? I have. You have to at least negotiate its capabilities, find a match for your needs, and then the problems start - it doesn't like this join syntax, it can't do this particular form of query... You end up writing an adapter class which knows the particular nastiness of the particular driver. Of course this is due to SQL being such a weak standard; XML should be better in this regard - if we insist on well-defining features, that is. >Since the application writer knows the feature set ahead of time, why not >just hard-code the required parser and SAX filters and be done with it? > (Yes, I know that "hard-code" is a bad word and I shudder as a write it, >but I really am curious if anybody out there has a real-world application >that allows users to change parsers and what the benefits of this are >besides the ability to say, "Oh, look. I'm using a different parser.") Mine. I run on both IE5 ("hey, look, there's a built in XSL processor") and IE4 ("oh well, let's use XT"), not to mention some server platforms I'm considering. I'm also tentatively considering other XML features - namespaces and embedding. I doubt I'm unique in this regard. And as XML support starts crawling into popular platforms (examples abound), this would become more and more common. At least we hope so :-) >In this view, the utility of SAX is not the ability to change parsers at >run time, but to change them over time as reliability, speed, size, etc. of >the parsers change. It also means that application writers can learn a >single interface (SAX) and then choose parsers as they are appropriate to >the application without having to learn different interfaces for different >parsers. That's one view and a valid one. 
It shouldn't prevent the other one. >The ability to request features in OpenSAX allows the application to >request processor behavior, which is slightly different from assembling a >suitable parser. For example, if I have an application that doesn't need >validation, but I the parser I want to use does validation by default, I >would like to be able to turn that off. Right. I didn't suggest that the original question ("which features are supported") isn't important. What I suggested is that the second question ("how do I find a filter/parser which does X") is also important. If it wasn't, why do we have a ParserFactory class in SAX? BTW, I'm not happy with this "parser" fixation. SAX is an interface which allows processing an XML tree. I don't see why the special case ("input: text; output: SAX events") is any different then "input: DOM; output: SAX events", for example. That's why "org.xml.sax.parser" is just another "feature" in the API I suggested. "org.xml.sax.visitor" and "org.xml.sax.builder" would be on equal grounds. IMVHO, converting DOM to SAX and back is something which we will have to deal with. >Just to be clear, I'm not necessarily against assembling processors based >on a feature set. I just believe that it is far more complex than it >appears at first glance and am not convinced that it's worth the trouble. I think I've answered the complexity issue - the API I've suggested is anything but. It merely provides the basic building blocks. The application may be as complex or as simple as you want. Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Fri Mar 12 09:28:26 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:09:55 2004 Subject: Fw: ModSAX: Proposed Core Features Message-ID: <00fa01be6c69$e7a44160$5402a8c0@oren.capella.co.il> David Megginson wrote: >I wrote: > > I'd like to see "http://xml.org/sax/features/xsl-transformation" as > > well. Anyway, all of the above seem to fall nicely into the > > pipeline framework. > >How about "http://capella.co.il/~oren/sax/features/xsl-transformation" >(or whatever is suitable for your web rights)? I kind of doubt that any XSL processors would register themselves under this name :-) I don't think that implementations of standard W3C features should be under private names. Come to think of it, if we go the URI way (which I'm not happy with since it can't be used as a property name), the "right" URI is a pointer to the relevant W3C standard. Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From James.Anderson at mecomnet.de Fri Mar 12 09:51:35 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:09:56 2004
Subject: FW: Namespaces and DTDs
References:
Message-ID: <36E8E733.7F84EA5E@mecomnet.de>

Didier PH Martin wrote:
>
> I am using these simple rule of thumb:
>
> a) a XML DTD is useful for XML editors not for XML renderers

if one presumes this, then one loses the ability to use attribute defaults and, thereby, for example, the chance to use "architectural" techniques.

> b) Most XML renderers (XSL, CSS or DSSSL won't do document validation)
> c) a XML interpreter do not need a DTD (something else than rendition)
>
> If I need a DTD at the receiving end, then I am now no longer in the XML
> world but in the SGML world because the receiving end needs a validating
> parser.

these techniques do not presume validation, just the availability of attribute declarations.

> Several SGML parser like for instance SP can parse XML simplifyed
> DTD. The only simplification I gained is the -- or -0 think called omitags.
> Therefore, because I have to include a DTD for validation, better use then a
> SGML format.
> xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Fri Mar 12 10:01:05 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:09:56 2004 Subject: Namespaces and DTDs References: <199903120241.PAA10692@aniwa.sky> Message-ID: <36E8E947.5B6EF6B7@mecomnet.de> of all things, the "context problems" would be minimal, so long as the decoding process maps the prefixed identifiers to universal identifiers. (which as best i can surmis all the "standard" parsers do.) the application wouldn't care where they came from and any reserialization would be responsible to get its own declarations in order. [i'm not arguing for this declaration form, just noting that it doesn't make the problem any more complex.] Andrew McNaughton wrote: > > Documents resulting from queries run on the concatenated document would tend > to cause problems, as query results don't generally return the context of the > XML elements returned. This problem also applies to queries run across > multiple documents unless their DTD's are identical, which perhaps suggests > that an answer to this problem has to come from the query languages. > > Andrew McNaughton > > Marc.McDonald@Design-Intelligence.com wrote: > > So make a namespace declaration a PI and add an "not using this > > namespace anymore" PI. Then use simple occurrence scoping: > > > > Process result: > > > > .... > > > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lucio.piccoli at one2one.co.uk Fri Mar 12 10:53:41 1999 From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI) Date: Mon Jun 7 17:09:56 2004 Subject: XML DATA Message-ID: <3601c191.120299@smtpgate1.ONE2ONE.CO.UK> Thanks for info below Ronald. I am very interested in the SOX schema. I guess i'll might as asked a similar question as before. What is the take up of SOX in the XML community? What do other people use to mapping native data type in XML? -lucio > > I would like to know the status of XML Data proposal and > it's take up in > > the XML community. Currently the only XML data parser that > i found is > > from MS. Does anyone else plan on supporting XML Data in the future? > > If not, what is the alternative to XML DATA ? > > Outside of the Microsoft parser, XML Data is probably dead. > There are > three other schema proposals (SOX, DCD, and DDML) and the W3C > is currently > working on their own. XML Data is significant in that it > seems to be the > only schema language that is publicly supported by a parser. > > You can find the various schema language specs at: > > SOX: http://www.w3.org/TR/NOTE-SOX/ > DCD: http://www.w3.org/TR/NOTE-dcd > DDML: http://www.w3.org/TR/NOTE-ddml > XML-Data: http://www.w3.org/TR/1998/NOTE-XML-data/ > W3C XML Schema requirements: http://www.w3.org/TR/NOTE-xml-schema-req > > and an overview of the various schema languages at: > http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/ index.htm OR http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/ XMLSchemas.ppt -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Mar 12 11:38:00 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:56 2004 Subject: Oedipus XML (TIe Your Mother Down) In-Reply-To: <36E8A1E8.1B@hiwaay.net> References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no> <36E7AE27.7DE6@hiwaay.net> <14055.59156.593634.998329@localhost.localdomain> <36E8A1E8.1B@hiwaay.net> Message-ID: <14056.64392.235652.11594@localhost.localdomain> len bullard writes: > > To be fair, I am talking only about the core specs -- I am comparing > > ISO 8879 to the XML 1.0 REC, and am leaving out the peripheral > > standards on both sides. A comparison of HyTime to XLink, XPointer, > > and Namespaces, of DSSSL to XSL, or of Topic Maps to RDF would be an > > interesting but separate exercise. > > Here I don't disagree, but in my work, the concepts of HyTime, DSSSL, > the RDF Dublin Core, and namespaces influence my work. Learning to > think beyond the DTD to the information properties of the metalanguage > proves to be very useful and that is not something I did before. 
> As activities like X3D ramp up, I find I am applying more and more > of the wall-to-wall markup concepts from the middle years of SGML > and they work in the XML infrastructure of tools. This is actually > quite delightful. Precisely. The important point, though, is that none of these peripheral standards is hard-coded to SGML or XML. Although there are some minor lexical differences, in general you could apply Namespaces to SGML or HyTime to XML, you could use XSL to format an SGML document or DSSSL to format an XML document, etc. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rudman at idetix.com Fri Mar 12 14:02:05 1999 From: rudman at idetix.com (Dan Rudman) Date: Mon Jun 7 17:09:56 2004 Subject: Basic Question Message-ID: <000701be6c90$8e31ee80$49e9fdce@diablo.idetix.com> I apologize for the basic question in advance :) With the wealth of XML libraries available, I am more and more inclined to make use of these libraries to help me create, parse, and utilize my own tag markup language to be embedded within an HTML document. My understanding of XML at this point is that it must be well-formed or a fatal error occurs. If this is the case, how can I deal with the fact that most HTML documents are NOT well-formed and that most HTML design tools do not enforce, require, or even sometimes support, well-formedness in a document? Things would be rosy if I didn't have to rely on HTML, but my application requires it. Thanks. -- Dan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990312/525211ab/attachment.htm

From david at megginson.com Fri Mar 12 14:06:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:56 2004
Subject: Mod??SAX: Revised Proposed Core Features 1999-03-12
Message-ID: <14057.7534.612787.424789@localhost.localdomain>

Here's a revised list of the proposed core features for Mod??SAX. I've added one new feature -- http://xml.org/sax/features/use-locator -- which will explicitly request the parser to supply or not to supply a Locator through the DocumentHandler.setDocumentLocator callback. There are two possible advantages to including this feature:

1. If the application wants a locator, it can tell before beginning the parse whether the parser can supply one.

2. If the application does not want a locator, the SAX parser/driver might be able to operate more efficiently if it doesn't have to maintain the Locator information.

What does everyone else think? In any case, here's the revised core feature list (I've also added extra wording to make it clear that the external DTD subset counts as an external parameter entity):

ModSAX Core Features
--------------------
$Id: features.txt,v 1.1 1999/03/12 13:57:54 david Exp $

http://xml.org/sax/features/validation
  Validate (true) or don't validate (false).

http://xml.org/sax/features/external-general-entities
  Expand external general entities (true) or don't expand (false).

http://xml.org/sax/features/external-parameter-entities
  Expand external parameter entities including the external DTD subset
  (true) or don't expand (false).

http://xml.org/sax/features/namespaces
  Preprocess namespaces (true) or don't preprocess (false). See also
  the http://xml.org/sax/properties/namespace-sep property.

http://xml.org/sax/features/normalize-text
  Ensure that all consecutive text is returned in a single callback to
  DocumentHandler.characters or DocumentHandler.ignorableWhitespace
  (true) or explicitly do not require it (false).

http://xml.org/sax/features/use-locator
  Provide a Locator using the DocumentHandler.setDocumentLocator
  callback (true), or explicitly do not provide one (false).

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From keshlam at us.ibm.com Fri Mar 12 14:27:48 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun 7 17:09:56 2004
Subject: One more ModSax naming try...
Message-ID: <85256732.004F61BA.00@D51MTA03.pok.ibm.com>

For what it's worth, the approach the DOM has been looking at is that Level 2 classes which are subclasses/refinements of things that were present in Level 1 will be named by adding 2 as a suffix (Document2 et al). Simple, effective, extensible and indicates which version of the spec it refers to.

Hence: SAX2?

______________________________________
Joe Kesselman / IBM Research
Unless stated otherwise, all opinions are solely those of the author.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Fri Mar 12 14:45:24 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:56 2004
Subject: Mod??SAX: Feature Matrix
Message-ID: <14057.8681.213869.498655@localhost.localdomain>

It would be interesting to put together a feature matrix representing current practice among SAX parsers/drivers, at least in the Java world. Assuming that I simply wrapped the existing drivers with a Mod??Parser adapter, what features would and would not be supported?

From my fuzzy recollection, here's what AElfred supported just about a year ago before I gave it up:

                                 true     false
------------------------------------------------------------------------
validation                       no       yes
external-general-entities        yes      no
external-parameter-entities      yes      no
namespaces                       no       yes
normalize-text                   yes      no
use-locator                      yes      no

(Wherever there's a 'no' answer, the driver should throw a SAXNotSupportedException.) Actually, it would probably be safe to accept false for 'normalize-text' as well.
If I were to wrap the AElfred driver, then, I'd do something like this (there's likely some kind of a static initialisation trap here, but it should be good enough as an unreliable example):

  import java.util.Hashtable;
  // The two exception classes are assumed to live in org.xml.sax
  // alongside the proposed ModParser interface.
  import org.xml.sax.SAXNotRecognizedException;
  import org.xml.sax.SAXNotSupportedException;

  public class AElfredModParser extends com.microstar.xml.SAXDriver
    implements org.xml.sax.ModParser
  {
    // Maps each feature URI to the state(s) this driver can honour.
    private static Hashtable featureTable = new Hashtable();
    private static final Object TRUE = new Object();       // only true is supported
    private static final Object FALSE = new Object();      // only false is supported
    private static final Object TRUEFALSE = new Object();  // either state is accepted
    private static final String FEATURE_NS = "http://xml.org/sax/features/";

    static {
      featureTable.put(FEATURE_NS + "validation", FALSE);
      featureTable.put(FEATURE_NS + "external-general-entities", TRUE);
      featureTable.put(FEATURE_NS + "external-parameter-entities", TRUE);
      // FALSE to agree with the feature matrix: AElfred does no
      // namespace preprocessing.
      featureTable.put(FEATURE_NS + "namespaces", FALSE);
      featureTable.put(FEATURE_NS + "normalize-text", TRUEFALSE);
      featureTable.put(FEATURE_NS + "use-locator", TRUE);
    }

    public void setFeature (String featureID, boolean state)
      throws SAXNotRecognizedException, SAXNotSupportedException
    {
      Object allowedState = featureTable.get(featureID);
      if (allowedState == null) {
        // Unknown feature name.
        throw new SAXNotRecognizedException();
      } else if ((state && allowedState == FALSE) ||
                 (!state && allowedState == TRUE)) {
        // Known feature, but not in the requested state.
        throw new SAXNotSupportedException();
      }
    }

    // etc. for setHandler, set, and get
  }

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Mar 12 15:51:59 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:09:56 2004 Subject: Namespaces and DTDs References: Message-ID: <36E94F45.46D0E403@prescod.net> Didier PH Martin wrote: > > a) a Lisp document could be made SGML compliant because SGML can let you > define begin and end tag's delimiters (Ex: dsssl). I let this claim pass a couple of times because I didn't consider it important but now I feel the need to scratch that itch. DSSSL does not actually use parentheses as tags. If you use nsgmls to look at the SGML structure of a DSSSL document you will find that all of DSSSL's structure is actually in omitted tags. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Fri Mar 12 16:01:22 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:56 2004 Subject: ModSAX: Proposed Core Features In-Reply-To: <14053.51113.676945.877507@localhost.localdomain> Message-ID: <199903121543.KAA13555@hesketh.net> After a few months of intense busy-ness (and business), plus a trip to XTech that I'm still recovering from, I'm finally catching up to ModSAX. Printed it all out, looked it all over, and mostly I'm very pleased. I think it'll make implementing my Layered Model proposal as a a Layered Parser much easier overall. One key thing is missing at this stage, and that's a feature. At 08:16 PM 3/9/99 -0500, David Megginson wrote: >Here's my revised version of the core feature list, based on recent >discussions: > > >ModSAX Core Features >-------------------- > >http://xml.org/sax/features/validation > Validate (true) or don't validate (false). > >http://xml.org/sax/features/external-general-entities > Expand external general entities (true) or don't expand (false). > >http://xml.org/sax/features/external-parameter-entities > Expand external parameter entities (true) or don't expand (false). > >http://xml.org/sax/features/namespaces > Preprocess namespaces (true) or don't preprocess (false). See also > the http://xml.org/sax/properties/namespace-sep property. > >http://xml.org/sax/features/normalize-text > Ensure that all consecutive text is returned in a single callback to > DocumentHandler.characters or DocumentHandler.ignorableWhitespace > (true) or explicitly do not require it (false). 
> We need: http://xml.org/sax/features/external-subset Requires the parser to load the external subset of the DTD and process it. (External parameter entities remain a separate issue referenced by a separate feature.) This is critically important for attribute defaulting, which makes things like XLink much much simpler. At one point I switched parsers, only to find that my attribute values in the external subset had all disappeared. I promptly jumped back. The spec (5.1) allows non-validating parsers to skip the external subset; I'd very much like to have a way to tell the parser not to skip it, or at least know that they are in fact being skipped. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Fri Mar 12 16:45:04 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:09:56 2004 Subject: Basic Question Message-ID: <009601be6ca6$cbe05440$0300000a@othniel.cygnus.uwa.edu.au> -----Original Message----- From: Dan Rudman >With the wealth of XML libraries available, I am more and more inclined to >make use of these libraries to help me create, parse, and utilize my own tag >markup language to be embedded within an HTML document. My understanding of >XML at this point is that it must be well-formed or a fatal error occurs. Yes, this is correct. >If this is the case, how can I deal with the fact that most HTML documents >are NOT well-formed and that most HTML design tools do not enforce, require, >or even sometimes support, well-formedness in a document? 
You might try Tidy as the initial step. Tidy can take bad HTML and spit out XML that could then be parsed by any XML parser. See http://www.w3.org/People/Raggett/tidy/ Hope this helps. James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Full-day XML Tutorial @ WWW8 : http://www8.org/ Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Fri Mar 12 16:51:41 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:09:56 2004 Subject: Basic Question Message-ID: <008201be6ca8$3b56c380$32acdccf@ix.netcom.com> Dan, use XHTML which is well formed xml Frank ----- Original Message ----- From: Dan Rudman To: 'XML-DEV' Sent: Friday, March 12, 1999 8:59 AM Subject: Basic Question >I apologize for the basic question in advance :) > > >With the wealth of XML libraries available, I am more and more inclined to >make use of these libraries to help me create, parse, and utilize my own tag >markup language to be embedded within an HTML document. My understanding of >XML at this point is that it must be well-formed or a fatal error occurs. >If this is the case, how can I deal with the fact that most HTML documents >are NOT well-formed and that most HTML design tools do not enforce, require, >or even sometimes support, well-formedness in a document? > >Things would be rosy if I didn't have to rely on HTML, but my application >requires it. > >Thanks. > >-- Dan > > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Fri Mar 12 16:56:18 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:09:56 2004 Subject: Mod??SAX: Revised Proposed Core Features 1999-03-12 In-Reply-To: <14057.7534.612787.424789@localhost.localdomain> References: <14057.7534.612787.424789@localhost.localdomain> Message-ID: * David Megginson | | I've added one new feature -- http://xml.org/sax/features/use-locator | [...] | What does everyone else think? Good one! I'm in favour. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Mar 12 17:13:12 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:56 2004 Subject: ModSAX: Proposed Core Features References: <199903121543.KAA13555@hesketh.net> Message-ID: <36E94AF4.8A154C60@locke.ccil.org> Simon St.Laurent wrote: > Requires the parser to load the external subset of the DTD and process > it. (External parameter entities remain a separate issue referenced by a > separate feature.) I don't see why it should be. I think that parsers will either process just the internal subset, or will load all external DTD parts, including both the external subset and the external parameter entities. 
(Ignoring the internal subset is *not* an option, of course, except for DPH parsers.)

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From elharo at metalab.unc.edu Fri Mar 12 17:40:28 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:09:57 2004
Subject: empty tags and the XML 1.0 spec
Message-ID: <36E97AED.3F44F7D1@metalab.unc.edu>

From the XML spec, section 3.1:

"Empty-element tags may be used for any element which has no content, whether or not it is declared using the keyword EMPTY. For interoperability, the empty-element tag must be used, and can only be used, for elements which are declared EMPTY."

1. The "can only be used" part of the second sentence seems to contradict the first sentence.

2. "the empty-element tag must be used...for elements which are declared EMPTY" seems to contradict the assertion that and are the same thing.

Is there any way out of this conundrum?

--
Elliotte Rusty Harold
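
A minimal sketch of the equivalence in question, assuming Python's standard xml.sax module (not something the posters used). The class EventRecorder, the helper sax_events, and the element names doc and item are all hypothetical, since the markup in the original message did not survive in the archive. It records the SAX events for an empty-element tag and for a start-tag immediately followed by an end-tag, then compares the two:

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Collect element and character events in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


def sax_events(markup: str):
    """Parse the markup and return the list of recorded events."""
    recorder = EventRecorder()
    xml.sax.parseString(markup.encode("utf-8"), recorder)
    return recorder.events


# Hypothetical element names; only the tag forms differ between the two documents.
empty_tag = sax_events("<doc><item/></doc>")
tag_pair = sax_events("<doc><item></item></doc>")

print(empty_tag == tag_pair)  # True: the application cannot tell the two forms apart
```

This only illustrates the first quoted sentence, that the two forms carry the same information. The "for interoperability" sentence is non-binding guidance about which form to write, so a parser reporting identical events for both does not by itself contradict it.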

From indiketr at churchill.co.uk Fri Mar 12 17:48:21 1999
From: indiketr at churchill.co.uk (Rajeeva Indiketiya)
Date: Mon Jun 7 17:09:57 2004
Subject: unsubscribe xml-dev
In-Reply-To: <003b01be6705$d5e059a0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID:

unsubscribe xml-dev

From david at megginson.com Fri Mar 12 18:03:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:57 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <199903121543.KAA13555@hesketh.net>
References: <14053.51113.676945.877507@localhost.localdomain> <199903121543.KAA13555@hesketh.net>
Message-ID: <14057.21849.286930.749213@localhost.localdomain>

Simon St.Laurent writes:

> We need:
>
> http://xml.org/sax/features/external-subset

I agree that this functionality is required. The question is whether there is a strong case for making inclusion of the external DTD subset separately configurable from the inclusion of external parameter entities in general. I'd suggest not.
Consider the following:

]>

and

]>

Except for the extra "%doc" entry in the entity name table, these two document type declarations look to me to be exactly equivalent; as a matter of fact, I've always considered the second to be simply a short-hand for the first.

James Clark made a convincing case for separating the inclusion of external general entities from the inclusion of external parameter entities. Can anyone make a convincing case for separating the inclusion of external parameter entities from the inclusion of the external DTD subset?

Thanks, and all the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Fri Mar 12 18:09:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:57 2004
Subject: Basic Question
In-Reply-To: <009601be6ca6$cbe05440$0300000a@othniel.cygnus.uwa.edu.au>
References: <009601be6ca6$cbe05440$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <14057.22147.177213.132967@localhost.localdomain>

Dan Rudman writes: [on XML's well-formedness constraints]

> If this is the case, how can I deal with the fact that most HTML
> documents are NOT well-formed and that most HTML design tools do
> not enforce, require, or even sometimes support, well-formedness in
> a document?

You'd best keep the two separate. Try including the following in the HTML:

Now, the HTML can stay as it is, and the XML can be properly well-formed. This approach is already best practice for including CSS stylesheets (using ) and ECMA scripts (using