tag is the title (the major exception is the NAA "adex" DTD for classifieds, where <ad-slug> is the closest thing to a title).

HTML has some conventions for search-engine metadata (title, description, keywords, robots). With XML, the administrator needs to map each DTD to these elements -- more work, more chance for error. And if the data is not there, not in a parsable format, or in a separate metadata file, the search engine is handicapped.

I expect to ship our next release pre-configured for NITF, but I sure would like to see some common practice beyond <title>. Mostly, our customers would appreciate it, and the people doing searches would get better results.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tms at ansa.co.uk Mon Apr 12 19:19:49 1999
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun 7 17:11:16 2004
Subject: Parsing unparsed external entities (was: Standalone documents ...)
In-Reply-To: John Cowan's message of "Mon, 12 Apr 1999 12:46:52 -0400"
References: <37120556.597B67FC@iol.ie> <3712237C.18BE9009@locke.ccil.org>
Message-ID: <usoa5u5ab.fsf_-_@lanber.ansa.co.uk>

John> John Cowan <URL:mailto:cowan@locke.ccil.org>

0> In article <3712237C.18BE9009@locke.ccil.org>,
0> John wrote:

John> Then you need an application framework capable of recursively
John> parsing unparsed entities using XML notation, which AFAIK does
John> not yet exist.

This is what (sgml-parse) is in DSSSL for.
Beware, though, that if the XML concrete syntax is not your default[1], you need to make sure the system identifier includes it. In Jade, this is done with something like (string-append "<osfile>xml.decl" filename), where filename is the system identifier of the external entity - I don't trust myself to get the syntax right for finding that! (but it can be done)

[1] i.e. if you write your command lines as [tool] xml.decl mydoc.xml

If you're not tied to XML, you might want to use SGML and SUBDOC instead (but I'm not sure how that's supported in the tools).

From jelks at jelks.nu Mon Apr 12 20:01:58 1999
From: jelks at jelks.nu (Jelks Cabaniss)
Date: Mon Jun 7 17:11:16 2004
Subject: SUMMARY: XML Validation Issues (was: several threads)
In-Reply-To: <370A9AE7.BCE60202@w3.org>
Message-ID: <NBBBICMNIPCICMKJECCBKEMMCKAA.jelks@jelks.nu>

Chris Lilley wrote:

> > > My feeling is that there are three classes of implementation, that
> > > should all have names:
> > >
> > > minimal well-formed - never tries to follow external entities
> > > full well-formed - always tries to follow external entities
> > > full validating - always tries to follow external entities and validates
> > Agreed.

...

> > > and it should be possible to always derive what class of implementation
> > > a particular instance requires.
> You don't comment on that sentence, so does it mean you agree?

Yes. But see below.
> > If there is to be a way to *force* validity by specifying it in the document
> > instance, the only way I can see is by amending the spec with
> > something like (as I believe you yourself suggested in passing)
> > valid="yes" in the declaration.
>
> Right. With a default of "no", of course. So, this would make the
> assertion that the document was valid and that assertions could be
> tested and perhaps refuted, by a validating parser. In the case of
> valid="no" or perhaps, valid="wf", a validating parser would do what -
> declare the document invalid?

Agree, yes, it's invalid (so why check it)?

> Automatically use a non-validating mode, even if it was normally
> validating?

> Next question, should there be (in other words, is this something that
> should be in the document instance).

Yes. But how to do it? If XML 1.1 has a "valid='yes'|'no'" in the declaration, XML 1.1 documents may break when running under an XML 1.0 parser, since the XML 1.0 BNF clearly states what can and can't be in the declaration.

Maybe a PI could be formalized similar to the way the stylesheet linking is being done:

    <?xml-assert implementation="valid"?>

(could also be "minimal" or "full" for the well-formed only options you mentioned).

/Jelks
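
A minimal sketch of how a processor might read such a PI, assuming Python's standard xml.parsers.expat module. The xml-assert target and its implementation pseudo-attribute are taken from the proposal in this message and are not part of any specification; the helper name asserted_class and the fallback to "minimal" are made up for illustration.

```python
import re
import xml.parsers.expat

# Pseudo-attributes inside a PI are plain text to the parser, so they are
# picked apart by hand (the same convention the stylesheet-linking PI uses).
PSEUDO_ATTR = re.compile(r'''([\w.-]+)\s*=\s*("[^"]*"|'[^']*')''')

def asserted_class(document, default="minimal"):
    """Return the implementation class a document asks for.

    Looks for the hypothetical <?xml-assert implementation="..."?> PI and
    falls back to `default` (an assumption of this sketch) when it is absent.
    """
    found = []

    def on_pi(target, data):
        if target == "xml-assert":
            attrs = {k: v[1:-1] for k, v in PSEUDO_ATTR.findall(data)}
            if "implementation" in attrs:
                found.append(attrs["implementation"])

    parser = xml.parsers.expat.ParserCreate()
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(document, True)
    return found[0] if found else default

doc = '<?xml version="1.0"?>\n<?xml-assert implementation="valid"?>\n<test/>'
print(asserted_class(doc))
print(asserted_class("<test/>"))
```

Because a PI is legal in any XML 1.0 prolog, an XML 1.0 parser that knows nothing about xml-assert simply passes it through, which is the compatibility point made above.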

From Mark.Birbeck at iedigital.net Mon Apr 12 22:13:59 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:11:16 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054B25@SOHOS002>

Paul wrote:
> Excelon implements a DOM interface to XML documents, not to
> arbitrary data objects. That doesn't make it a bad product,
> but I don't think it is the product Mark is describing. I imagine that
> the (imaginary) product that Mark is describing would allow you
> to specify your objects in IDL, manipulate them as ordinary
> object/method/property Java or C++ objects and get a DOM interface
> to them "for free" when you want it. I don't think that that product
> exists.

That's true, Paul, thanks. I was also imagining that:

- when I want the last node from a tree that contains 100,000 nodes that the whole 'document' would not be read into memory.
- that I could access the tree as if it was a complete DOM with all the caching and so on being done for me.
- that if I perform an XSL-type query I will get the nodes I want, regardless of whether they are in memory or not.

I have implemented a very crude version of this. I use the IE5 DOM and with this I retrieve documents from our database using URLs that are a scaled down version of XQL (I can't say I like XPointer). For example:

    http://[server]/documents/article[@author='Mark']/article.xml

would retrieve all 'article' objects with an author attribute of 'Mark', that are children of a node of type 'documents'.
This would then be returned to the caller as an XML document, but with a stylesheet PI pointing to 'stylesheets/article.xsl'. (Replacing .xml with .htm would yield the same results but the XML and XSL would be combined for you on the server.)

The problem with this is that I have to convert this request to a query on the objects in the hierarchical database in order to populate my DOM. Of course, once in the DOM I can export it as XML or transform it if necessary, so the database does look from the outside like it is one great big XML document.

But although I am quite happy with this so far, I can see that you would have to code this up for every type of database, and really it should be a job for the DOM. It really needs a layer like the layer above the database-specific layers in ODBC; it would sit just below the DOM. This layer would obviously need to understand schemas, so it wouldn't be a trivial task to implement.

Anyway, my original question was 'is anyone doing anything like this?' and I think the answer is 'nowhere near yet!'

Regards,

Mark

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net
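
A rough sketch of the URL-to-query mapping described in this message, assuming Python's xml.etree.ElementTree and its limited XPath subset as a stand-in for the scaled-down XQL. The URL layout follows the example above; the host name example.org, the sample data, and the function names are hypothetical, and an in-memory tree takes the place of the hierarchical database.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, unquote

# Stand-in for the database: one in-memory tree (made-up sample data).
STORE = ET.fromstring(
    "<documents>"
    "<article author='Mark'><title>DOM over a database</title></article>"
    "<article author='Paul'><title>Writable DOMs</title></article>"
    "<article author='Mark'><title>URLs as queries</title></article>"
    "</documents>"
)

def query_from_url(url):
    """Split .../documents/article[@author='Mark']/article.xml into
    (root name, ElementTree path, requested output name)."""
    parts = unquote(urlsplit(url).path).strip("/").split("/")
    root_name, output = parts[0], parts[-1]
    path = "/".join(parts[1:-1])
    return root_name, path, output

def fetch(url):
    """Return the elements selected by the query embedded in `url`."""
    root_name, path, _output = query_from_url(url)
    if root_name != STORE.tag:
        return []
    # findall() is relative to the root element, so 'article[...]'
    # selects matching children of <documents>.
    return STORE.findall(path)

URL = "http://example.org/documents/article[@author='Mark']/article.xml"
hits = fetch(URL)
for article in hits:
    print(article.findtext("title"))
```

The sketch only covers the easy half: the query runs against a tree that is already in memory, whereas the message is about translating the same path into a native database query first.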

From martind at netfolder.com Mon Apr 12 22:49:11 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:16 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
In-Reply-To: <A26F84C9D8EDD111A102006097C4CD0D054B25@SOHOS002>
Message-ID: <000501be8525$a9605720$311fcdcd@total.net>

Hi Mark,

<Comment>
I have implemented a very crude version of this. I use the IE5 DOM and with this I retrieve documents from our database using URLs that are a scaled down version of XQL (I can't say I like XPointer). For example:

    http://[server]/documents/article[@author='Mark']/article.xml

would retrieve all 'article' objects with an author attribute of 'Mark', that are children of a node of type 'documents'. This would then be returned to the caller as an XML document, but with a stylesheet PI pointing to 'stylesheets/article.xsl'. (Replacing .xml with .htm would yield the same results but the XML and XSL would be combined for you on the server.)

The problem with this is that I have to convert this request to a query on the objects in the hierarchical database in order to populate my DOM. Of course, once in the DOM I can export it as XML or transform it if necessary, so the database does look from the outside like it is one great big XML document.

But although I am quite happy with this so far, I can see that you would have to code this up for every type of database, and really it should be a job for the DOM. It really needs a layer like the layer above the database-specific layers in ODBC; it would sit just below the DOM.
This layer would obviously need to understand schemas, so it wouldn't be a trivial task to implement.

Anyway, my original question was 'is anyone doing anything like this?' and I think the answer is 'nowhere near yet!'
</Comment>

<reply>
This is an interesting request. Do you want us to explore your need a bit further?

a) If you got a DOM interface on an RDB, would this be useful?
b) If you had an ODB with a DOM interface, where the ODB just keeps some virtual memory pages in memory (i.e. the whole DOM is not in memory at once, only some pages are), would this be useful?

Thanks Mark for your collaboration

Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

From paul at prescod.net Tue Apr 13 00:13:24 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:16 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
References: <000001be848d$19371cd0$1b19da18@ne.mediaone.net>
Message-ID: <37117285.689AFEF1@prescod.net>

Jonathan Borden wrote:
>
> Actually, if you take my XMOP project which serializes COM/IDL described
> and Java objects into a DOM interface, and bolt it onto eXcelon, this is
> pretty much exactly what this would do. XMOP uses either Java reflection or
> COM typelibraries (which are compiled IDL and are close but not quite full
> fidelity to MIDL itself), and serializes the object into either a DOM or an
> XML stream.

I think that all such approaches fall apart quickly when you want the DOM to be writable.
And if the DOM is *not* writable then I see it as only an optimization for generating XML and then building a DOM for that XML.

And even so there are big efficiency issues. If the system doesn't do an XQL->SQL conversion then searching for anything will be hideously slow because you won't be using the native query optimizer.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

By lumping computers and televisions together, as if they exerted a single malign influence, pessimists have tried to argue that the electronic revolution spells the end of the sort of literate culture that began with Gutenberg's press. On several counts, that now seems the reverse of the truth.
http://www.economist.com/editorial/freeforall/19-12-98/index_xm0015.html

From gtn at eps.inso.com Tue Apr 13 00:44:53 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun 7 17:11:16 2004
Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376)
In-Reply-To: <370F9F35.2A0E63E0@w3.org>
Message-ID: <001b01be8536$66df9710$0100007f@eps.inso.com>

> Can you post some URIs? Are you willing to share them? I would trust
> your servlets to be doing the right thing.

I can probably release these. I'll check. I also have a few other bits of code that I'm trying to release.

> > I still dislike the encoding information in the PI....
>
> (it isn't, in theory, a PI although it looks exactly like one) I am of
> quite the opposite point of view - I think that it finally
> gives authors the ability to correctly label their documents.

Right.
My opinion though is that it does the right thing in the wrong place.

> The same is true of any label. The encoding declaration in the XML
> declaration at least always travels with the document, which is always
> handy for ensuring metadata doesn't get lost.

Right. The problem is really one of *metadata* not *data*, that is precisely my point. The *.mim proposal provided an *explicit* separation of the two. In retrospect, I must say that *.mim is also woefully insufficient... but that we still need, in some form, a way of encoding, and transporting, in an interoperable manner, the information (metadata) that is needed by *processors* of the data.

> But if you are transcoding, you have to fix it anyway - so?

Right, but

a) You have to fix it by parsing a piece of arbitrary syntax, which proxies etc. will most likely not do, for performance reasons.
b) The XML declaration is part of the *document* as specified by the XML 1.0 recommendation, changing the XML declaration changes the *document*, which is a Bad Thing(tm).

From pgrosso at arbortext.com Tue Apr 13 01:01:21 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun 7 17:11:16 2004
Subject: Last Call for the XML Fragment Interchange Rec
Message-ID: <3.0.32.19990412180037.00f308e4@pophost.arbortext.com>

* Document to review: http://www.w3.org/TR/WD-xml-fragment-19990412
* Last call ends: 1999 April 23
* Send comments to: mailto:www-xml-fragment-comments@w3.org

The XML Fragment WG [1] has just published its Final Working Draft of the XML Fragment Interchange Recommendation [2].
A Last Call period starts now and runs until April 23. Its abstract reads:

    The XML standard supports logical documents composed of possibly several entities. It may be desirable to view or edit one or more of the entities or parts of entities while having no interest, need, or ability to view or edit the entire document. The problem, then, is how to provide to a recipient of such a fragment the appropriate information about the context that fragment had in the larger document that is not available to the recipient.

    The XML Fragment WG is chartered with defining a way to send fragments of an XML document--regardless of whether the fragments are predetermined entities or not--without having to send all of the containing document up to the part in question. This document defines Version 1.0 of the [eventual] W3C Recommendation that addresses this issue.

Comments are solicited from all W3C WGs and the public at this time. As indicated in the document, comments should be sent to [3], (a publicly archived list). Comments received by 1999 April 23 will be considered for the Proposed Recommendation version.

All comments from W3C working groups and from recognized liaison groups will be considered in light of the XML Fragment Requirements Document [4]. In particular, basic scope issues and design decisions will be reconsidered only when grave and previously unrecognized flaws are uncovered. Requests for enhancement will typically be deferred for later versions of the specification under development unless the enhancement is uncontroversial and its incorporation would not materially delay production of the specification.

Paul Grosso      XML Fragment WG Chair
Daniel Veillard  W3C Staff Contact

[1] http://www.w3.org/XML/Activity.html#fragment-wg
[2] http://www.w3.org/TR/WD-xml-fragment-19990412
[3] mailto:www-xml-fragment-comments@w3.org
[4] http://www.w3.org/TR/NOTE-XML-FRAG-REQ

From db at eng.sun.com Tue Apr 13 02:57:09 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:16 2004
Subject: problem with IE5
References: <21172.199904121418@doyle.cogsci.ed.ac.uk>
Message-ID: <371295E9.B36ABC78@Eng.Sun.COM>

Richard Tobin wrote:
>
> > Looks to me like:
>
> > (b) IE5 however REQUIRES conformance to the namespace spec,
> > and thus rejects some well formed XML 1.0 documents,
> > such as Richard's original;
>
> In what way does my document (below) not conform to the namespace
> recommendation?

My goof ... some other examples I tried get rejected however, including ":some:long:xml_1.0:names". I'm seeking an accurate description of the syntax that IE5 supports, and "XML 1.0" doesn't seem to be it ... neither does "XML 1.0 but requiring XML namespaces".

> It contains no qualified names in the body, and
> prefixes in the DTD are not required (and not able) to be declared.

Prefixes in the DTD can be declared, but in this case they weren't ... more to the point, they didn't need to be!!

> > > <?xml version="1.0"?>
> > > <!DOCTYPE test [
> > > <!ELEMENT test ANY>
> > > <!ELEMENT foo:bar ANY>
> > > ]>
> > > <test/>

- Dave
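
The gap between "well-formed XML 1.0" and "conforms to the namespace spec" can be seen with any parser that offers both modes. The sketch below assumes Python's standard xml.parsers.expat module; it illustrates the distinction only and says nothing about how the IE5 parser behaves. The helper name accepts and the urn:example namespace name are made up for illustration.

```python
import xml.parsers.expat

def accepts(document, namespaces):
    """Return True if expat parses `document` without raising an error.

    With namespaces=True the parser is created in namespace-aware mode,
    so a prefix that is never declared becomes an error.
    """
    sep = " " if namespaces else None
    parser = xml.parsers.expat.ParserCreate(namespace_separator=sep)
    try:
        parser.Parse(document, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False

undeclared = "<foo:bar/>"                           # colon is just a name character
declared = '<foo:bar xmlns:foo="urn:example"/>'     # prefix bound to a namespace

for doc in (undeclared, declared):
    print(doc, accepts(doc, namespaces=False), accepts(doc, namespaces=True))
```

A plain XML 1.0 parse accepts both documents; only the namespace-aware parse rejects the undeclared prefix.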

From alank at iol.ie Tue Apr 13 03:02:59 1999
From: alank at iol.ie (Alan Kennedy)
Date: Mon Jun 7 17:11:16 2004
Subject: Standalone documents as external parsed entities.
References: <37120556.597B67FC@iol.ie> <00ce01be8504$3cbdfd00$0300000a@cygnus.uwa.edu.au>
Message-ID: <371299F0.E53840F2@iol.ie>

James Tauber wrote:
>
> There are ways around it. But for starters, be careful with the term
> "standalone" as it means something quite specific in XML (and something
> different from what I'm guessing you mean by it).

Thanks James.

I actually had already started down the path of option number two that you suggested, i.e. using "wrapper" documents, with DTDs, to refer to external entities, w/o DTDs, that contain the actual document. I need a DTD on these documents because I need to constrain their structure. I was hoping there was a better way, since this doubles the number of documents I have to manage, but it appears there isn't.

I consider this to be a shortcoming of XML, in that it is not "orthogonal", i.e. I have to write my documents in one of two different ways, depending on how they're going to be used.

A better solution, I believe, would be to take a more "object-oriented" approach, i.e. that each document is responsible for its own validity, through the use of its own DTD. This would require a parser that could handle recursively nested documents, each with their own DTD. Although I could adopt such a non-standard solution here in my own environment, and produce HTML for publication, I couldn't publish the XML/XSL, since the documents would be non-standard and unreadable by anyone else.
I keep hearing that XML is a "data" language, as opposed to a "document" language, but I think that this is one case where XML breaks widely accepted data modelling norms, i.e. type encapsulation.

Thanks all,

Alan.

P.S. James, after I sent that mail, I realised that my documents are not actually standalone (in XML terms), since they refer to an external DTD, so I used the term incorrectly. But, then I realised they could actually be standalone, by making all of the necessary declarations in the internal subset, and the problem would still be the same.

From david at megginson.com Tue Apr 13 03:59:00 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:16 2004
Subject: Megginson and XMLNews
In-Reply-To: <3.0.5.32.19990412100407.00bfd4a0@corp>
References: <370FB644.6F92D093@w3.org> <3.0.5.32.19990407163917.00bd2a60@corp> <14095.52616.339600.100498@localhost.localdomain> <3.0.5.32.19990412100407.00bfd4a0@corp>
Message-ID: <14098.40792.709506.370200@localhost.localdomain>

Walter Underwood writes:

> I expect to ship our next release pre-configured for NITF,

That's wonderful.

> but I sure would like to see some common practice beyond <title>.
> Mostly, our customers would appreciate it, and the people doing
> searches would get better results.

Actually, I think that you need something a little more robust -- otherwise, we'll end up with a hodge-podge of rules for what element names people can and cannot use. I would not want to forbid someone from using something like this:

<?xml version="1.0"?>
<person>
<title>Dr.

From b.laforge at jxml.com Thu Apr 1 00:07:38 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:10:53 2004
Subject: XML-to-Java, and Java-to-XML
Message-ID: <003501be7bc3$a2c967a0$c8a8a8c0@thing1>

From: Raghunandan Havaldar
>'lookup' - provides lookup of XML nodes in a DOM-based
> tree.

MDSAX supports id/idref lookup within a document by looking for bean properties on a mapped component which have the same name as the idref attribute. The value assigned to the property is the mapped component which had the id attribute.

There is also a facility for resolving a reference to a mapped component in a second document, again assigning that value to a property of a component from the first document, with full support for circular references.

Bill

From jborden at mediaone.net Thu Apr 1 00:14:13 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:10:53 2004
Subject: XML query language
Message-ID: <01f901be7bc2$bf6aec40$0b2e249b@fileroom.Synapse>

Ingo Macherius wrote:
> An XQL query may return numbers, strings, Date objects or even user
> defined data types, which are not nodes in the DOM sense, but
> objects.

Actually a NodeList as defined by the DOM is composed of Nodes. Nodes are typed, one of which is TEXT_NODE. A text node can hold a number, string, date etc (numbers and dates have no defined meaning in XML 1.0).

Jonathan Borden
http://jabr.ne.mediaone.net
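
A small sketch of the point about text nodes, assuming Python's xml.dom.minidom as the DOM implementation. The element names and the int/date conventions are invented for illustration; the DOM itself only hands back strings.

```python
from datetime import date
from xml.dom import minidom, Node

doc = minidom.parseString("<order><qty>42</qty><due>1999-04-01</due></order>")

for name in ("qty", "due"):
    text = doc.getElementsByTagName(name)[0].firstChild
    # Both values arrive as TEXT_NODE children holding plain strings.
    print(name, text.nodeType == Node.TEXT_NODE, repr(text.data))

# Any typing is the application's job (hypothetical conventions here).
qty = int(doc.getElementsByTagName("qty")[0].firstChild.data)
due = date.fromisoformat(doc.getElementsByTagName("due")[0].firstChild.data)
print(qty + 1, due.year)
```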

From rhavaldar at str.com Thu Apr 1 00:57:27 1999
From: rhavaldar at str.com (Raghunandan Havaldar)
Date: Mon Jun 7 17:10:53 2004
Subject: Linking elements in a document
Message-ID: <013c01be7bc9$c4b21280$612a96d0@raghu.STR_MILW>

Hi,

Is there a 'better' way to link elements in an XML document?
- XLL is one option.

I looked up the 1.0 specs, and got the feeling that using ID and IDREFS in the DTDs is one simple method to do this. I was wondering if I could do:

. . . LINK to user1 LINK to user2

Am looking for an easy method to do this - so that I can access the users through a DOM tree by using the 'link'. Any possibilities?

thanks
raghu

Raghu Havaldar
Consultant
rhavaldar@str.com

From thbzcrt at worldnet.fr Thu Apr 1 01:17:22 1999
From: thbzcrt at worldnet.fr (Thierry Bezecourt)
Date: Mon Jun 7 17:10:54 2004
Subject: Fw: XML query language and another OS/XML suggestion
In-Reply-To: "Stephen D. Williams"'s message of "Tue, 30 Mar 1999 13:56:31 -0500"
References: <00e101be7abf$f9bb5aa0$5402a8c0@oren.capella.co.il> <37011E5F.A22DF23B@lig.net>
Message-ID:

"Stephen D.
Williams" writes:

> A note on the /dev/proc/xml mention: I've been thinking for a while
> that EVERY data/meta-data interface to a typical OS (such as
> Linux/Unix) should have an XML form. Maybe add or override -X or
> --XML to all commands where it could possibly make sense. ps,
> netstat, lsof, ifconfig, df, egrep, ls, etc. are all good
> candidates. Add simple tree/value extraction to bash and you'd have
> more portability for a lot of things.

You may be interested in the LinuXML project. They already have an XML version of ls (I have not tried it).

http://www.ozemail.com.au/~birchb/linuxml/linuxml.htm

However, subscribing to their mailing list is not easy. I tried, and gave up.

Thierry Bezecourt

From xml at 0000000.com Thu Apr 1 02:33:28 1999
From: xml at 0000000.com (xml)
Date: Mon Jun 7 17:10:54 2004
Subject: Fw: XML query language and another OS/XML suggestion
Message-ID: <199904010158.RAA08800@0000000.com>

A version of ls that outputs xml would be useful for some tasks. However, it appears that the xmls project you mentioned is just the standard BSD unix tools distribution.

An xml encoder/decoder en miniature would be really helpful for this kind of task. (A parser with some understanding of the context around it). I'm working on such a beast now. I spent some time looking at other source code distributions for parsing XML and none of them seemed suitable. So I'll let y'all know how mine turns out.

Thomas
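
A minimal sketch of what an ls with XML output could look like, assuming Python's os.scandir and xml.etree.ElementTree. The element and attribute names (directory, entry, name, type, size) are invented for this sketch and are not taken from LinuXML or any other project.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def ls_xml(path):
    """Return a listing of `path` as an XML string (hypothetical vocabulary)."""
    root = ET.Element("directory", path=path)
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        kind = "dir" if entry.is_dir() else "file"
        ET.SubElement(root, "entry", name=entry.name, type=kind,
                      size=str(entry.stat().st_size))
    return ET.tostring(root, encoding="unicode")

# Build a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.txt"), "w") as fh:
        fh.write("hello")
    os.mkdir(os.path.join(tmp, "sub"))
    listing = ls_xml(tmp)
    print(listing)
```

Once the listing is XML, the "simple tree/value extraction" mentioned above is an ordinary path query, for example ET.fromstring(listing).findall("entry[@type='dir']").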

From simonstl at simonstl.com Thu Apr 1 05:47:07 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:10:54 2004
Subject: Fwd: RE: Why Doesn't IE5 use the DTD to Validate?
Message-ID: <199904010346.WAA04960@hesketh.net>

This was on XSL. I think its implications may be more interesting on this list. Personally, I'm appalled, but I guess I shouldn't expect anything different.

>From: Jonathan Marsh
>To: "'xsl-list@mulberrytech.com'"
>Subject: RE: Why Doesn't IE5 use the DTD to Validate?
>Date: Wed, 31 Mar 1999 14:02:31 -0800
>X-Mailer: Internet Mail Service (5.5.2524.0)
>Sender: owner-xsl-list@mulberrytech.com
>Reply-To: xsl-list@mulberrytech.com
>
>This is as designed, not a bug.
>
>The IE5 XML parser is a validating parser, with two properties set through
>DOM extensions to control DTD handling:
> - validateOnParse determines whether validation errors are presented to the
>user.
> - resolveExternals determines whether the DTD or XML Schema is loaded and
>datatypes, default values, etc. are honored.
>
>The values of these properties when browsing directly to XML documents is
>validateOnParse=false and resolveExternals=true.
>
>When browsing XML documents on the Web, surfacing validation errors is of
>little apparent value. I would not expect publishers to author both a DTD
>or XML Schema and documents that don't conform to that DTD/Schema. So the
>vast majority will not generate validation errors.
>For those that declare a
>DTD and are invalid, it is no better to give the user a validation error
>instead of displaying the document, in fact the validation error could
>prevent the user from viewing an otherwise perfectly good document. Also
>the performance penalty for validation is significant and should not be
>imposed on end-users without good reason.
>
>The only scenario we could come up with where validation is useful when
>browsing XML documents is when the browser is used as a development tool,
>allowing easy checking of well-formedness and validation for a document in
>progress. This scenario can be accomplished by a number of alternative
>mechanisms without impacting the browsing experience - a simple tool that
>validates an XML document could be written in a few lines of JavaScript, see
>http://msdn.microsoft.com/downloads/samples/internet/xml/xml_validator/default.asp
>for an example.
>
>We considered several mechanisms for allowing developers to "turn on"
>validation errors but did not find a clean solution that could be
>implemented in time for the IE5 release.
>
>- Jonathan Marsh
>
>> -----Original Message-----
>> From: Sall, Ken [mailto:ksall@cen.com]
>> Sent: Wednesday, March 31, 1999 6:37 AM
>> To: 'xsl-list@mulberrytech.com'
>> Subject: RE: Why Doesn't IE5 use the DTD to Validate?
>>
>>
>> Thanks, Stephen.
>> I've added an example that illustrates your point that IE5 detects DTD
>> syntax errors.
>>
>> http://members.home.com/kensall/tests/collection1bugsdtd.xml
>> http://members.home.com/kensall/tests/collection1bugs.dtd
>>
>> However, if anyone from Microsoft can explain why IE5 doesn't actually use
>> the DTD to validate the document (the way that IE5 Beta 2 did), I'd
>> appreciate it. This problem will be published in an article shortly (in the
>> larger context of positive things you can do with IE5 with XML/XSL) and it
>> would be great to state correctly what Microsoft plans w.r.t. DTD
>> processing.
>> >> TIA >> - Ken Sall ksall@cen.com, kensall@home.com >> - Century Computing, Inc. http://www.cen.com/ >> - NG-HTML: Next Generation HTML http://www.cen.com/ng-html/ >> - XML at Web Developers Virtual Lib >> http://WDVL.com/Authoring/Languages/XML/ >> - MW3: Motif on the World Wide Web http://www.cen.com/mw3/ >> >> > -----Original Message----- >> > From: Stephen Ransom [mailto:sransom@objectmastery.com] >> > Sent: Wednesday, March 31, 1999 1:52 AM >> > To: xsl-list@mulberrytech.com >> > Subject: Re: Why Doesn't IE5 use the DTD to Validate? >> > >> > >> > > It doesn't appear that IE5 (March 18th release) uses the >> > DTD to validate >> > > XML, as did the IE5 Beta 2 release. Has anyone been able to >> > make IE5 detect >> > > when a doc doesn't follow the rules of the DTD that it references? >> > >> > I agree that IE5 appears to "lose" the errors in a well >> > formed but invalid XML >> > document (ie one written in proper XML but which fails to >> meet its DTD >> > definition). >> > >> > I note however that IE5 is aware of the DTD even though it >> > will pass through a >> > failing XML document. This can be shown by adding a line of >> > XXXX's into the DTD >> > itself (thus breaking the DTD's well-formedness). IE5 will >> > give you an error >> > message identifying the XXXX's as incorrect. >> > >> > Stephen >> > >> > >> > >> > XSL-List info and archive: >> http://www.mulberrytech.com/xsl/xsl-list >> > >> >> >> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list >> > > > XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list > Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Thu Apr 1 07:31:57 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:10:54 2004 Subject: Why Doesn't IE5 use the DTD to Validate? References: <000301be7bf1$54d91440$31aa97cc@server.total.net> Message-ID: <3703049D.4AE3A3CD@allette.com.au> Simon St Laurent moved this discussion to this list from XSL, so I followed. Didier PH Martin wrote: > you are right it parse the DTD from a syntactic point of view but do not > enforce the structural integrity of the document. It is faster to just parse > the DTD syntax than to enforce structural integrity. To clarify, it's faster to process the DTD but not the instance than it is to process both, but it may be a marginal difference. Imagine using the DocBook DTD for the following instance: Baby Snakes. Surely validation of the instance is a fairly minor issue, after the processor has had to plow through a large DTD? I'm not suggesting that XML documents should be validated client side, I'm questioning the wisdom of looking at the DTD at all if you're not putting it to any use. Surely the overhead of looking at the DTD outweighs the benefit (none) obtained in rendering the above example? > It takes the principle > that it will try to render the document even if a structural error is > present. So, rendition takes over integrity of the structure. This is > because the browser main purpose is to render. No it doesn't, it determines that it will try to render the document regardless of whether it contains errors. I agree that for a browser, rendition should take precedence over structure. 
> However, when the same parser is used in a different context, structural
> integrity may become a main constraint.

Agreed - that's when I'd have the processor examine the DTD.

--
Regards,

Marcus Carr                      email: mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www: http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein

From jborden at mediaone.net Thu Apr 1 07:50:51 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:10:54 2004
Subject: Why Doesn't IE5 use the DTD to Validate?
In-Reply-To: <3703049D.4AE3A3CD@allette.com.au>
Message-ID: <000001be7c02$ac0cc750$1b19da18@ne.mediaone.net>

Marcus Carr wrote:

> I'm not suggesting that XML documents should be validated client side,
> I'm questioning the wisdom of looking at the DTD at all if you're not
> putting it to any use. Surely the overhead of looking at the DTD
> outweighs the benefit (none) obtained in rendering the above example?

I also agree that browser validation is probably only helpful to DTD writers as opposed to end-users, and I have no problem with leaving it off by default. The reason to parse the DTD is that enternal entities and default attributes are something which are very well needed client side...
if entities were left unexpanded by default this would change the 'meaning' of the document itself, something which end users might be interested in :-))

Jonathan Borden
http://jabr.ne.mediaone.net

From uche.ogbuji at fourthought.com Thu Apr 1 08:43:22 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:10:54 2004
Subject: Programming practice
In-Reply-To: Your message of "Mon, 29 Mar 1999 17:12:28 CST." <87256743.0079FE91.00@d53mta03h.boulder.ibm.com>
Message-ID: <199802260258.TAA00767@malatesta.local>

> >The good programming practice of replacing "magic numbers" with descriptive
> >constants is even older than the structured programming movement, and any
> >programmer who writes
>
> But that's not really the point I don't think. The point isn't "if you are
> as macho a programmer as me you don't need any help".

This is a pretty silly representation of what I wrote.

> The point is that we work in a commercial environment and every single
> semantic that can be expressed in the code itself, so that the compiler
> can tell you when you break them, is a Very Goode Thinge.

It is, of course, a question of degree. A little help from the compiler is useful, but the compiler cannot hold a programmer's hand and make him adopt every common-sense good practice. I happen to believe that interface constants are simple enough to "get right" that it is unnecessary to introduce complexity and slow performance with such schemes as singleton object representations.
> It does no good at all to have a named constant if you can accidentally
> pass that named constant to 150 other things for which it's not intended and
> the compiler cannot catch it. It's a fundamental lacking in Java that makes
> me shudder to think that people actually want to do serious work in it.

I don't see the disaster you are pointing out:

    module Spam {
        interface Egg {
            // Ways an egg can be prepared
            const unsigned long SUNNY_SIDE_UP = 1;
            const unsigned long SCRAMBLED = 2;
            const unsigned long POACHED = 3;
            void process(in unsigned long processType);
        };
        interface Foo {
            // Unrelated constants that happen to share the same values
            const unsigned long A = 1;
            const unsigned long B = 2;
            const unsigned long C = 3;
            void bar(in unsigned long param);
        };
    };

So as I write the code, I simply use the proper constants for the proper interface.

    Spam.Egg.process(Spam.Egg.SCRAMBLED);

and later

    Spam.Foo.bar(Spam.Foo.B);

No quantum chromodynamics there. Now why would I ever use a constant that was meant for the Egg interface in the context of Foo, even though Foo happens to have a constant of the same value? I wouldn't mix things up in even the above simple example, so it boggles my mind to think that anyone in their right mind would commit such folly 150 times.

Is this the "machismo" to which you allude? I call it basic training, and no compiler or mechanism can protect a project from the lack of same.

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Apr 1 12:56:51 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:54 2004 Subject: DTDs are just for validation (Re: Why Doesn't IE5 use the DTD to Validate?) In-Reply-To: <3703049D.4AE3A3CD@allette.com.au> References: <000301be7bf1$54d91440$31aa97cc@server.total.net> <3703049D.4AE3A3CD@allette.com.au> Message-ID: <14083.20322.923520.236861@localhost.localdomain> Marcus Carr writes: > To clarify, it's faster to process the DTD but not the instance > than it is to process both, but it may be a marginal > difference. Imagine using the DocBook DTD for the following > instance: > > > > Baby Snakes. > > Surely validation of the instance is a fairly minor issue, after > the processor has had to plow through a large DTD? > I'm not suggesting that XML documents should be > validated client side, I'm questioning the wisdom > of looking at the DTD at all if you're not putting it to any use. > Surely the overhead of looking at the DTD outweighs the benefit > (none) obtained in rendering the above example? There *is* a potentially nasty problem lurking here: the DTD may contain default values for attributes as well as validation information. In the SGML version of DocBook, there is not a problem, but what if the new version of DocBook had something like this? I suspect that industry practice will be always to run XML through a normaliser before publishing, so that the attribute default values get plugged right into the instance. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bill at wadley.org Thu Apr 1 14:12:48 1999 From: bill at wadley.org (Bill Wadley) Date: Mon Jun 7 17:10:54 2004 Subject: narrowing a DTD Message-ID: Hello! EXECUTIVE SUMMARY: I would like to take a currently existing DTD (e.g. Scalable Vector Graphics, SVG), _overlay_ some different semantics onto it, and create a subset of the SVG DTD. DISCUSSION: I would like to create a SVG document that is 100% SVG with no new ELEMENTS, ENTITIES or anything. However, I'm creating this document from some other data structure that does not map 1:1 to SVG. I would like this *mapping* to be a *standard* in it's own right for use in a particular community. Essentially, I would like an SVG-like DTD, a pure subset with different semantics, that would be used to create the SVG-overlay document. The full SVG DTD can still be used to *extend* the SVG-overlay DTD, but when we go to parse the SVG-overlay document, we can ignore the extended SVG and just go for the SVG-overlay stuff knowing that the creator of the document used the SVG-overlay DTD and that the ELEMENTS and things mean what we think they mean. On the other hand, we can pass the SVG document and the full SVG DTD to an SVG drawing program, and it will have no problems just drawing the picture as usual. So that is what I want; question is, how do I do it? My immediate response is to take the SVG DTD and start removing things, but that's not enough. We need to use some of the CDATA attributes to hold specific information in a specific format; this would imply that we need to "override" declarations in the original DTD somehow. 
It may also be necessary to increase the restrictions on ELEMENT declarations. Nothing needs to be added or relaxed. (If this weren't true, I would say all is lost, but this is a narrowing, not a broadening.)

I'm trying to be concise, but if I'm not being clear, I'll be happy to try again. Any light shed on this problem will be greatly appreciated.

Thanks! B-)

--
Bill Wadley     |GAT/d-(++) s++: a C++++ UL++++$ P++++$ L+++>++++ E- W+++$ |
bill@wadley.org |N+++ w-- O-- M-- PS+ PE Y++ PGP++ t++ 5++ X- R+ tv b++++ D++|
bill.wadley.org |G++ e* h--- r+++ y? bill.wadley.org/PGP_KEY.html |
"The dinosaurs became extinct because they didn't have a space program."
-Larry Niven

From simonstl at simonstl.com Thu Apr 1 15:17:36 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:10:54 2004
Subject: DTDs are just for validation (Re: Why Doesn't IE5 use the DTD to Validate?)
In-Reply-To: <14083.20322.923520.236861@localhost.localdomain>
References: <3703049D.4AE3A3CD@allette.com.au> <000301be7bf1$54d91440$31aa97cc@server.total.net> <3703049D.4AE3A3CD@allette.com.au>
Message-ID: <199904011317.IAA13781@hesketh.net>

At 05:57 AM 4/1/99 -0500, David Megginson wrote:
>I suspect that industry practice will be always to run XML through a
>normaliser before publishing, so that the attribute default values get
>plugged right into the instance.

I suspect strongly that industry practice for XML will diverge as sharply as it did for HTML and SGML, leading to lots of 'practices' that render XML document sets mutually unintelligible to different processors. Why?
Because too many people have wildly different assumptions about 'industry practice', but as long as they all have assumptions, things get left out of specs or implemented without concern for the spec. I'd list some culprits, but it seems too rude. (Namespaces, validation, and retrieval of external resources are the main areas for such entertainment, however.)

Short version: There is no uniform industry practice with regard to XML processing, and it's not likely that there ever will be. If it needs to be hammered down, write it into the spec, ferociously.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

From clark.evans at manhattanproject.com Thu Apr 1 16:12:38 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:10:54 2004
Subject: DTDs are just for validation (Re: Why Doesn't IE5 use theDTD to Validate?)
References: <3703049D.4AE3A3CD@allette.com.au> <000301be7bf1$54d91440$31aa97cc@server.total.net> <3703049D.4AE3A3CD@allette.com.au> <199904011317.IAA13781@hesketh.net>
Message-ID: <37037D60.D5B43626@manhattanproject.com>

"Simon St.Laurent" wrote:
> I suspect strongly that industry practice for XML will diverge as sharply
> as it did for HTML and SGML, leading to lots of 'practices' that render XML
> document sets mutually unintelligible to different processors.

Since market differentiation is how companies market their products, it is not in their best interest to follow a standard all that closely anyway. With the current licensing practices, this should not be new to anybody.
It doesn't matter how good the standard is or how quickly it is written down. The market mechanism will force industry practice divergence, independent of the value provided.

Clark

From simonstl at simonstl.com Thu Apr 1 16:27:58 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:10:54 2004
Subject: DTDs are just for validation (Re: Why Doesn't IE5 use theDTD to Validate?)
In-Reply-To: <37037D60.D5B43626@manhattanproject.com>
References: <3703049D.4AE3A3CD@allette.com.au> <000301be7bf1$54d91440$31aa97cc@server.total.net> <3703049D.4AE3A3CD@allette.com.au> <199904011317.IAA13781@hesketh.net>
Message-ID: <199904011426.JAA15311@hesketh.net>

At 02:06 PM 4/1/99 +0000, Clark Evans wrote:
>Since market differentiation is how companies
>market their products, it is not in their best
>interest to follow a standard all that closely
>anyway. With the current licensing practices,
>this should not be new to anybody. It doesn't
>matter how good the standard is or how quickly
>it is written down. The market mechanism
>will force industry practice divergence,
>independent of the value provided.

I think we've all heard this line before, and I think it's time to stop this train before it runs over anyone else. If I hear the word 'innovation' one more time in defense of ploys that serve the companies using them but wreak havoc on a computing community as a whole, I think I'm going to puke.

Too bad the W3C has no teeth.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From hassan.hussein at zurich.com Thu Apr 1 16:38:41 1999
From: hassan.hussein at zurich.com (hassan.hussein@zurich.com)
Date: Mon Jun 7 17:10:54 2004
Subject: DOM
Message-ID:

Forgive me for asking these questions, which may seem obvious to you! I have been trying to get involved with the XML/Java world for the past 3 months. I would like to be clear about a few topics.

What exactly does DOM do and how does it work with XML parsers?
What exactly does SAX do and how does it work with XML parsers?
What exactly does SAXON do and how does it work with XML parsers?
What is the difference between the above three tools?
Does a parser have to have DOM in order to work?
Does a parser have to have SAX in order to work?
What do you use in order to create an XML document programmatically?

Please help me get started.

Thanks
Hassan

From jborden at mediaone.net Thu Apr 1 16:54:16 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:10:55 2004
Subject: Between raw and cooked II: Are?
DTDs are just for validation Message-ID: <044601be7c4e$7eb8b4c0$0b2e249b@fileroom.Synapse> David Megginson wrote: > >There *is* a potentially nasty problem lurking here: the DTD may >contain default values for attributes as well as validation >information. If DTDs *were* only for validation there would be no issue here. However DTDs provide additional functionality beyond validation, namely default attributes and entities. The problem exists in that XML parsers can *choose* whether or not to validate and in so doing the information content of the XML document is altered. Validation is optional. Says so. Given this, the question becomes: ought parsers be allowed to expand entities and default attributes with validation turned off? What problem does this create? Perhaps the XML spec should properly specify that: *if* a DOCTYPE declaration is present which specifies a DTD then the document must be validated else the parser must generate an error. (DOCTYPE declarations would remain optional). In this way document authors would be able to properly specify information content. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.Kay at icl.com Thu Apr 1 17:21:43 1999 From: Michael.Kay at icl.com (Kay Michael) Date: Mon Jun 7 17:10:55 2004 Subject: DOM Message-ID: <93CB64052F94D211BC5D0010A80013310EB3E3@WWMESS3.172.19.125.2> > What exactly does DOM do and how does it work with XML parsers? The DOM is an interface that allows an application to discover information about an XML document by walking around it, navigationally. 
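A minimal sketch of the two interface styles raised in the question, using Python's standard-library `xml.dom.minidom` and `xml.sax` rather than the Java parsers discussed on the list. The sample document and the names `names_via_dom`, `names_via_sax`, and `NameCollector` are made up for illustration. The DOM version builds a tree and walks it navigationally; the SAX version receives start, character, and end events as the document is read.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample document, not taken from the thread.
DOC = "<cast><person>Mark</person><person>Tracey</person></cast>"


def names_via_dom(text):
    """DOM style: parse the whole document into a tree, then navigate it."""
    tree = xml.dom.minidom.parseString(text)
    return [el.firstChild.data for el in tree.getElementsByTagName("person")]


class NameCollector(xml.sax.ContentHandler):
    """SAX style: the parser calls these methods as it reads the document."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_person = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "person":
            self._in_person = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_person:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "person":
            self.names.append("".join(self._buffer))
            self._in_person = False


def names_via_sax(text):
    handler = NameCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names
```

Both functions return the same list of names; the SAX handler never holds more than one element's text in memory, while the DOM version keeps the whole tree available for further navigation.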
Many XML parsers implement the DOM interface.

> What exactly does SAX do and how does it work with XML parsers?

SAX is an interface that allows an application to discover information about an XML document by receiving notification of events as the document is serially read (for example, start and end of elements). Many XML parsers implement the SAX interface.

> What exactly does SAXON do and how does it work with XML parsers?

SAXON is a Java class library that provides high-level application functions on top of DOM or SAX; for example, it allows you to select the elements to be processed using XSL-compatible patterns (queries). SAXON also includes an XSL processor.

> What is the difference between the above three tools?

One important difference is that SAXON is a "product", while SAX and DOM are interfaces. There are many parsers that implement SAX and/or DOM interfaces. SAX is a "lower-level" interface than DOM; it is more work for the application and less for the parser, but for some applications it uses fewer resources.

> Does a parser have to have DOM in order to work?

No.

> Does a parser have to have SAX in order to work?

No.

> What do you use in order to create an XML document programmatically?

System.out.println()

> Please help me get started.

You're welcome.

Mike Kay

From paul at prescod.net Thu Apr 1 17:24:12 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:10:55 2004
Subject: Is validity an option?
References: <7847B57C7C96D2119DBE00A0C96F64B6206DAE@cen1.cen.com> Message-ID: <37038A14.1FF1AC0A@prescod.net> In the XSL-List, Ken Sall quoted Tim Bray: > > http://www.xml.com/axml/testaxml.htm > > Tim Bray's annotations: > > "Validity Is Not An Option > > XML evangelists, such as myself, take great glee in pointing out that > XML, unlike SGML, has no optional features; the result, we claim > triumphantly, is that any XML processor in the world should be > able to read any XML document in the world (well, modulo character > encoding issues). > > "Aha!" claim some ungrateful doubting Thomases; "XML > distinguishes well-formedness and validity, and that's an option!" > > Wrong. Anything that's well-formed is an XML document, > and any XML processor has to be able to read any well-formed > document. If a document wants to aspire to the higher karmic plane > of validity, well good on it, but that's an extra, not an optional > feature of XML." First, validity is an optional feature of *parsers*. I believe this to be self-evident. I have heard the "no optional features" statement interpreted in three different ways -- obviously Tim chooses to interpret it in a way that allows XML to have none. Second, the XML specification is quite clear about the fact that different XML processors can legally produce different parse trees for the same data. Heck, they can produce a different parse tree depending on the day of the month. "The behavior of a validating XML processor is highly predictable; it must read every piece of a document and report all well-formedness and validity violations. Less is required of a non-validating processor; it need not read any part of the document other than the document entity." To be perfectly honest I am a lot more comfortable with the SGML-world's model: some documents are not processable by some parsers but if the parser says it can handle it then you always know what you are getting out. 
Perhaps it isn't too late -- maybe the information set group could fix this flaw. After all, they are in the business of ensuring conformance of processors so the next step would be to rigorously specify conformance classes: "validating", "external entity fetching non-validating", "non-external entity fetching non-validating." Handling three classes (three optional features!) is a hassle but the current situation is that the parser can decide what it wants to do about external entities all by itself!

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Other Operating Environments Will Have Trouble Keeping up with Linux's Growth"
- http://www.idc.com/Data/Software/content/SW033199PR.htm
International Data Corporation bulletin

From Mark.Birbeck at iedigital.net Thu Apr 1 17:25:16 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:10:55 2004
Subject: XML query language
Message-ID:

Paul Prescod wrote:
> Mark Birbeck wrote:
> >
> > Paul Prescod wrote:
> > > And that model has a concept of nodelist -- this is the most
> > > appropriate return value for query results.
> >
> > What do you mean by nodelist? Does it take into account that result
> > nodes may be returned from different parts of the tree, or even at
> > different depths?
>
> Sure. A node list is a list of nodes. No more, no less.

I sort of guessed it might be ;-) I was more getting at the idea of context. The following is a 'list of nodes':

Mark Tracey Jan

But we don't know where they came from.
Even if we know what query generated them, we don't know what depth they came from. If we used the query: //[name='Mark'] we might get: Mark Mark But the original source might be: Tracey Mark Mark The reason I was suggesting the fragment approach is because it has within it the notion of context, and it contains a reference to the actual 'query' - or reference - that yielded those results. We might return something like: Tracey Mark Mark I'm not saying it's ideal for all situations. I'm just interested to see how context can be encoded in a 'list of nodes'. Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Apr 1 17:38:13 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:10:55 2004 Subject: Between raw and cooked II: Are? DTDs are just for validation In-Reply-To: <044601be7c4e$7eb8b4c0$0b2e249b@fileroom.Synapse> References: <044601be7c4e$7eb8b4c0$0b2e249b@fileroom.Synapse> Message-ID: <14083.37530.889640.171000@localhost.localdomain> Jonathan Borden writes: > David Megginson wrote: > > > >There *is* a potentially nasty problem lurking here: the DTD may > >contain default values for attributes as well as validation > >information. > > If DTDs *were* only for validation... As was probably clear from the rest of my message, the subject line was meant to read "DTDs are not just for validation". 
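A small sketch of the point that DTDs are not just for validation, assuming Python's expat-based `xml.etree.ElementTree`, which is a non-validating parser. The element names, the `status` attribute, and the `band` entity are hypothetical. The document below violates its own content model (two `title` children where one is declared), yet it parses without complaint, while the attribute default and the entity declared in the internal DTD subset still change what the application sees.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares a content model,
# a default attribute value, and an internal entity.
DOC = """<!DOCTYPE doc [
  <!ELEMENT doc (title)>
  <!ELEMENT title (#PCDATA)>
  <!ATTLIST doc status CDATA "draft">
  <!ENTITY band "Baby Snakes">
]>
<doc><title>&band;</title><title>extra</title></doc>"""

# No validation happens here: the second <title> breaks the declared
# content model, but the parser only checks well-formedness.
root = ET.fromstring(DOC)

# The DTD still contributed information to the parse result.
status = root.attrib.get("status")   # supplied by the ATTLIST default
first_title = root[0].text           # produced by expanding &band;
child_count = len(root)              # both <title> elements were accepted
```

If the same declarations lived in an external DTD subset, a parser that does not fetch external entities would not see them, and to my understanding that is the default for Python's expat-based parsers. That is the situation the "nasty problem" above describes.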
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From tomh at thinlink.com Thu Apr 1 17:50:35 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun 7 17:10:55 2004
Subject: Why Doesn't IE5 use the DTD to Validate?
References: <199904010346.WAA04960@hesketh.net>
Message-ID: <3703957E.A5D15F7@thinlink.com>

> >From: Jonathan Marsh
> ...
> >This is as designed, not a bug.
> ...
> >The only scenario we could come up with where validation is useful when
> >browsing XML documents is when the browser is used as a development tool,

What about the scenario where a document claims to conform to some well-known DTD? This viewpoint seems to show a pretty narrow view of the ways in which XML might be employed.

From martind at netfolder.com Thu Apr 1 17:59:22 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:10:55 2004
Subject: Between raw and cooked II: Are? DTDs are just for validation
In-Reply-To: <044601be7c4e$7eb8b4c0$0b2e249b@fileroom.Synapse>
Message-ID: <001f01be7c57$aad46dc0$31aa97cc@server.total.net>

Hi Jonathan,

If DTDs *were* only for validation there would be no issue here.
However DTDs provide additional functionality beyond validation, namely default attributes and entities. The problem exists in that XML parsers can *choose* whether or not to validate and in so doing the information content of the XML document is altered. Validation is optional. Says so. Given this, the question becomes: ought parsers be allowed to expand entities and default attributes with validation turned off? What problem does this create? Perhaps the XML spec should properly specify that: *if* a DOCTYPE declaration is present which specifies a DTD then the document must be validated else the parser must generate an error. (DOCTYPE declarations would remain optional). In this way document authors would be able to properly specify information content.

Thanks for bringing the issue back to its source: the spec. The spec says nothing about how to interpret a document. It just says how a document is to be formatted, not how it is to be interpreted. Now that real products are shipping, we see that there are holes in the architecture. The main one is: what do we do with this? The answer depends on the type of interpreter, such as:

a) browsers
b) ERP front ends and back ends
c) repositories
d) any other stuff I am not thinking of right now

There are no specs on how to interpret or parse a document in the context of a browser.

Your suggestion is a constructive one. You propose that the next spec version reduce the ambiguity at the parsing stage by including the parsing rule in the spec. The spec should also reduce the ambiguity around external references, that is to say, explicitly state whether a parser should consider the presence of a DTD as a signal to validate the document. Currently this is left at the mercy of the implementer, and no specification is available to dictate the rules of conduct.

Thanks, Jonathan, for a constructive comment. Any other constructive opinions? I mean here, any suggestions concerning the rules or, more specifically, the specs?
Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

From david at megginson.com Thu Apr 1 18:00:26 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:55 2004
Subject: Is validity an option?
In-Reply-To: <37038A14.1FF1AC0A@prescod.net>
References: <7847B57C7C96D2119DBE00A0C96F64B6206DAE@cen1.cen.com> <37038A14.1FF1AC0A@prescod.net>
Message-ID: <14083.38518.955372.918892@localhost.localdomain>

Paul Prescod writes:

> In the XSL-List, Ken Sall quoted Tim Bray:
> > XML evangelists, such as myself, take great glee in pointing out that
> > XML, unlike SGML, has no optional features; the result, we claim
> > triumphantly, is that any XML processor in the world should be
> > able to read any XML document in the world (well, modulo character
> > encoding issues).

> Second, the XML specification is quite clear about the fact that
> different XML processors can legally produce different parse trees
> for the same data. Heck, they can produce a different parse tree
> depending on the day of the month.

Paul's right. There are no options in terms of producing a boolean value (well-formed/not well-formed), but there are very annoying options in terms of what information the parser is allowed to ignore (such as external entities, the external DTD subset, and by extension, any entity and attribute declarations in the external DTD subset).
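To make the point concrete, here is a minimal sketch, assuming Python's standard pyexpat binding, of one document yielding two different sets of attributes depending only on a parser setting. The memo element and its status default are made up for illustration, and the declaration sits in the internal subset purely to keep the example self-contained; the same question arises, more sharply, when the ATTLIST lives in an external subset that a parser is free to skip.

```python
import xml.parsers.expat

# Hypothetical document: the internal DTD subset gives <memo> a default
# "status" attribute that the instance itself never writes out.
DOC = b'<!DOCTYPE memo [<!ATTLIST memo status CDATA "draft">]><memo/>'


def attributes_reported(only_specified):
    """Parse DOC and return the attributes reported for <memo>."""
    reported = {}
    parser = xml.parsers.expat.ParserCreate()
    # pyexpat switch: when true, attributes that exist only because of an
    # ATTLIST default are left out of the start-element event.
    parser.specified_attributes = only_specified

    def start_element(name, attrs):
        reported.update(attrs)

    parser.StartElementHandler = start_element
    parser.Parse(DOC, True)
    return reported


with_defaults = attributes_reported(False)
without_defaults = attributes_reported(True)
print("defaults applied:", with_defaults)
print("defaults ignored:", without_defaults)
```

Both runs accept the document as well-formed; only the information handed to the application differs.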
> To be perfectly honest I am a lot more comfortable with the
> SGML-world's model: some documents are not processable by some
> parsers but if the parser says it can handle it then you always
> know what you are getting out.

No, actually, if the parser says that it can handle the SGML declaration that it happens to have read from some random place on your system, then you know that if your document happens to match that SGML declaration you'll get out what you expect. That model sucks too (even if it looked good on paper).

> Perhaps it isn't too late -- maybe the information set group could
> fix this flaw. After all, they are in the business of ensuring
> conformance of processors so the next step would be to rigorously
> specify conformance classes: "validating", "external entity
> fetching non-validating", "non-external entity fetching
> non-validating."

The Infoset WG does not intend to rewrite XML 1.0 or redefine XML-conformance. SAX2, on the other hand, can take a stab at classifying its parsers (as could the DOM).

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From paul at prescod.net Thu Apr 1 18:16:46 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:10:56 2004
Subject: XML query languages and their encodings
References:
Message-ID: <370395C7.F93481AB@prescod.net>

Mark Birbeck wrote:
>
> I sort of guessed it might be ;-) I was more getting at the idea of
> context. The following is a 'list of nodes':
>
> Mark
> Tracey
> Jan

That's exactly my point. That's not a list of nodes.
That's a list of XML elements. Nodes are abstract. Here's a concrete representation for them (and a containing element) for discussion purposes:

    x = element( gi: "names",
          content:
            element( gi: "name", content: text( "Mark"))
            element( gi: "name", content: text( "Tracey"))
            element( gi: "name", content: text( "Jan"))
        )

Now in this abstract model a "list of nodes" is:

    [x.content[0], x.content[1], x]

Do I know their context? Yes. Do I know their depth? Can I talk about nodes of different depths? Yes. In this brain-dead simple abstract model those issues are not complex at all.

Now if we want to encode these results for transmission between machines then all of the issues you raise are important. But that is a *separate issue*. It has nothing to do with the abstract concept of "node list". "XML People" are encoding-focused so they always come back to the encoding. That's fine but it is also important to recognize that some things should be considered in the abstract domain -- like the result sets of query languages.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Other Operating Environments Will Have Trouble Keeping up with Linux's Growth"
- http://www.idc.com/Data/Software/content/SW033199PR.htm
International Data Corporation bulletin
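The abstract model above can be written down in a few lines of Python. This is only a sketch: the Element and Text classes and the depth helper are invented for illustration and are not taken from any DOM or query-language specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Text:
    value: str


@dataclass
class Element:
    gi: str  # generic identifier, i.e. the element type name
    content: List[Union["Element", Text]] = field(default_factory=list)


x = Element("names", [
    Element("name", [Text("Mark")]),
    Element("name", [Text("Tracey")]),
    Element("name", [Text("Jan")]),
])

# A "list of nodes" is just a list of references into the tree; its members
# may sit at different depths.
node_list = [x.content[0], x.content[1], x]


def depth(root, target, level=0) -> Optional[int]:
    """Return how far below root the target node sits, or None if absent."""
    if root is target:
        return level
    if isinstance(root, Element):
        for child in root.content:
            found = depth(child, target, level + 1)
            if found is not None:
                return found
    return None


depths = [depth(x, node) for node in node_list]
print(depths)
```

Because the list holds the nodes themselves rather than a serialization of them, context and depth can always be recovered from the tree; how to encode the list for transmission is a separate question.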
From keshlam at us.ibm.com Thu Apr 1 18:24:33 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun 7 17:10:56 2004
Subject: Re attachments
Message-ID: <85256746.0059D883.00@D51MTA03.pok.ibm.com>

Just a thought: How Hard Would It Be, and how unreasonable would it be, to modify the listserver so attachments of any sort are automatically discarded? It's unclear that they're ever appropriate for a mailing list...

______________________________________
Joe Kesselman / IBM Research
Unless stated otherwise, all opinions are solely those of the author.

From keshlam at us.ibm.com Thu Apr 1 18:35:40 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun 7 17:10:56 2004
Subject: XML query language
Message-ID: <85256746.005AF125.00@D51MTA03.pok.ibm.com>

>Sure. A node list is a list of nodes. No more, no less.

Actually, a DOM NodeList is something more: it is a dynamic filtered view of the document, a list of nodes WHICH CHANGES AS THE DOCUMENT IS EDITED so that it always represents the results as if the query had just been issued. If you (re)move a node from the tree, the NodeList changes too... and the integer indices no longer refer to the same nodes they did a moment ago.
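The shifting-index effect can be seen in a small sketch using Python's xml.dom.minidom. One caveat, stated as an assumption about that library rather than about the DOM itself: as far as I can tell, minidom's getElementsByTagName builds its result once and does not keep it up to date, so the live behaviour is shown here with childNodes and the tag-name result is included only as a contrast.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<names><name>Mark</name><name>Tracey</name><name>Jan</name></names>"
)
names = doc.documentElement

live = names.childNodes                        # the element's own child list
snapshot = names.getElementsByTagName("name")  # built once, at call time

first_before = live[0].firstChild.data

# Remove the first child; the live list shrinks and every index shifts.
names.removeChild(live[0])

first_after = live[0].firstChild.data

print(first_before, "->", first_after)
print("live length:", len(live), "snapshot length:", len(snapshot))
```

Code that caches an index into a live list across an edit will silently end up pointing at a different node, which is the logic-error risk in question.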
A live list of this kind is, if you'll excuse my French, a bitch-kitty to implement; you need either DOM Level 2 event handling or a lightweight version thereof. It's also, in my experience so far, a serious risk of performance problems and/or logic errors unless your document is static or nearly so.

Caveat usetor; know what you're getting into before you request one. And consider whether the DOM Level 2 iterator frameworks will be a better answer.

______________________________________
Joe Kesselman / IBM Research
Unless stated otherwise, all opinions are solely those of the author.

From paul at prescod.net Thu Apr 1 19:46:06 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:10:56 2004
Subject: Is validity an option?
References: <7847B57C7C96D2119DBE00A0C96F64B6206DAE@cen1.cen.com> <37038A14.1FF1AC0A@prescod.net> <14083.38518.955372.918892@localhost.localdomain>
Message-ID: <3703AE26.FC48043C@prescod.net>

David Megginson wrote:
>
> No, actually, if the parser says that it can handle the SGML
> declaration that it happens to have read from some random place on
> your system, then you know that if your document happens to match that
> SGML declaration you'll get out what you expect. That model sucks
> too (even if it looked good on paper).

I haven't had this happen to me in practice. I agree that the SGML declaration mechanism sucks but I've never had it silently fail on me. Usually it vociferously fails! But that's neither here nor there: both specs handle their optional features badly.
SGML hides its option declarations too far from the data and XML doesn't have option declarations at all!

> SAX2, on the other hand, can take a stab classifying
> its parsers (as could the DOM).

That would be helpful.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Other Operating Environments Will Have Trouble Keeping up with Linux's Growth"
- http://www.idc.com/Data/Software/content/SW033199PR.htm
International Data Corporation bulletin

From jborden at mediaone.net Thu Apr 1 20:53:07 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:10:56 2004
Subject: Between raw and cooked II: Are? DTDs are just for validation
Message-ID: <050f01be7c6f$de7f6ea0$0b2e249b@fileroom.Synapse>

aha!! that changes things :-) I had incorrectly assumed you were making an argument that DTDs *ought* only be used for validation to prevent the problem we have identified.

Beyond requiring that external entities and default attributes be expanded, is there a way to allow non- and validating parsers to process the same XML documents in a functionally similar fashion, that is, the same SAX events be fired or the same DOM tree be constructed whether or not validation is employed?

Jonathan

>Jonathan Borden writes:
>
> > David Megginson wrote:
> > >
> > >There *is* a potentially nasty problem lurking here: the DTD may
> > >contain default values for attributes as well as validation
> > >information.
> >
> > If DTDs *were* only for validation...
>
>As was probably clear from the rest of my message, the subject line
>was meant to read "DTDs are not just for validation".
>
>
>All the best,
>
>
>David

From david at megginson.com Thu Apr 1 21:07:25 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:10:56 2004
Subject: Between raw and cooked II: Are? DTDs are just for validation
In-Reply-To: <050f01be7c6f$de7f6ea0$0b2e249b@fileroom.Synapse>
References: <050f01be7c6f$de7f6ea0$0b2e249b@fileroom.Synapse>
Message-ID: <14083.50118.346144.196246@localhost.localdomain>

Jonathan Borden writes:

> Beyond requiring that external entities and default attributes be
> expanded, is there a way to allow non- and validating parsers to
> process the same XML documents in a functionally similar fashion,
> that is, the same SAX events be fired or the same DOM tree be
> constructed whether or not validation is employed?

I'm hoping to have that worked out in the new core SAX2 features. AElfred is one good example of a non-validating parser that reads external entities and the external DTD subset.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
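For a rough idea of what feature-based classification looks like from the application side, here is a sketch using the SAX2 feature names as they are exposed by Python's xml.sax package. The describe helper and its labels are invented for illustration; the values reported depend on which parser make_parser() hands back, so none are assumed here.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException, handler

# Feature URIs defined by xml.sax.handler; the short labels are ours.
FEATURES = {
    "validation": handler.feature_validation,
    "external-general-entities": handler.feature_external_ges,
    "external-parameter-entities": handler.feature_external_pes,
}


def describe(parser):
    """Ask a SAX parser which of the features above it has switched on."""
    report = {}
    for label, uri in FEATURES.items():
        try:
            report[label] = bool(parser.getFeature(uri))
        except (SAXNotRecognizedException, SAXNotSupportedException):
            report[label] = None  # the parser does not know this feature
    return report


report = describe(xml.sax.make_parser())
for label, value in report.items():
    print(label, "=", value)
```

An application that needs defaulted attributes or external entities can check the report up front and refuse a parser that will not supply them, instead of discovering the gap in its output.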
From Jon.Bosak at eng.sun.com Thu Apr 1 21:10:46 1999
From: Jon.Bosak at eng.sun.com (Jon Bosak)
Date: Mon Jun 7 17:10:56 2004
Subject: Last (first, and only) call: XML/DOM track at WWW8 DevDay
Message-ID: <199904011910.LAA00719@boethius.eng.sun.com>

Developers' Day at the Eighth International World Wide Web Conference will take place in Toronto, Ontario, May 14, 1999. Proposals for presentations in the XML/DOM track at Developers' Day are being accepted for one week only, April 2-9, 1999.

XML and related standards constitute the future syntactic infrastructure of the Web. The XML/DOM track will present up-to-the-minute developments in Web-related technologies based on XML, XML Schemas, XLink/XPointer, XSL, and the DOM. Proposals featuring running code that has not previously been shown are of special interest and will be given priority in selecting the presentations.

Co-chairs of the XML/DOM track this year are Jon Bosak of Sun Microsystems, Chair of the W3C XML Coordination Group, and Lauren Wood of SoftQuad Software, Chair of the W3C DOM Working Group.

Proposals of 1-3 paragraphs clearly describing the presentation should be sent in plain text to both co-chairs:

bosak@eng.sun.com, lauren@sqwest.bc.ca

Submissions should be mailed no later than close of business Friday, April 9.
From simonstl at simonstl.com Thu Apr 1 21:27:13 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:10:56 2004
Subject: Best W3C Rec in a While
Message-ID: <199904011927.OAA25906@hesketh.net>

See http://www.w3.org/1999/04/REC-Reduced-set. Pretty impressive, with goals I think we all can share.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

From simonstl at simonstl.com Thu Apr 1 21:37:49 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:10:56 2004
Subject: XML, Integration, and the Smaller Developer
Message-ID: <199904011937.OAA26183@hesketh.net>

A short paper I've written called "XML, Integration, and the Smaller Developer" is now available as a _rough draft_ at:

http://www.simonstl.com/articles/xmlsmall.htm

All comments, suggestions, etc. are welcome and will be credited. It's somewhat technical, but I hope it's high-enough level for non-techies to get the drift.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com
From digitome at iol.ie Thu Apr 1 21:51:46 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 17:10:56 2004
Subject: RTF to XML Conversion
Message-ID: <3.0.6.32.19990401204117.00b50380@gpo.iol.ie>

All,

I have a program pretty much ready to go that will take arbitrary RTF + a DTD + a narrative description of "the right markup to use" and generate excellent XML. It's written in Perl, and it is extremely easy to understand. It's at version 01.04.1999 in Europe and version 04.01.1999 in the US. If anyone wants a copy, let me know....

From pwilson at gorge.net Thu Apr 1 23:00:50 1999
From: pwilson at gorge.net (Peter Wilson)
Date: Mon Jun 7 17:10:56 2004
Subject: com.sun,xml.parser LexicalEventListener improvements
Message-ID: <3703DE62.81152425@GORGE.NET>

I believe that the class LexicalEventListener is the proposed basis for the new javax.xml extension. In private conversation with Dave Brownell of Sun Microsystems I made the suggestions below. While refusing these ideas Dave suggested that I poll the users on XML-DEV to determine their reactions. I hope you will agree with these suggestions and contact Dave via xml-feedback@java.sun.com.

1.
The startElement() method should indicate if it is an empty element, i.e. whether the element ends with a /> tag or not.

2. The proposed LexicalEventListener interface should have new methods startPCDATA() and endPCDATA(). Calls to these methods would bracket calls to the current characters(...) method. The LexicalEventListener interface already contains bracketing calls for start/endCDATA - why the inconsistency? Better yet, the characters(..) method should be split into two: CDATA(...) and PCDATA(..). The two method sets would then be startPCDATA(), PCDATA()*, endPCDATA(), and likewise for CDATA. Alternatively a single set: startCharacters(), characters(), endCharacters() could be used with a flag on startCharacters to indicate parsed or unparsed text.

Dave argues that these method calls may be implemented by writing an event filter to restructure the events as required. By this logic, current extensions to LexicalEventListener are all equally pointless. Were the start/endCDATA methods added solely for his convenience in implementing XMLDocumentBuilder? Using the same argument they should not be cluttering up the new interface.

My argument is that the new XML parsing facilities should not be skewed by the need of one application (e.g. building DOM models). This is best achieved by structuring lexical events for ALL syntactic structures. The current LexicalEventListener interface is a step in the right direction but is not complete.

Peter Wilson
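For readers weighing the two positions, the event-filter approach might look roughly like the sketch below. It is written against Python's xml.sax rather than the Sun Java interface under discussion, and the start_characters/end_characters event names are hypothetical stand-ins for the proposed startCharacters()/endCharacters() calls.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class CharacterRunFilter(ContentHandler):
    """Records events, bracketing each run of characters() calls.

    start_characters and end_characters are invented event names; they are
    not part of SAX or of LexicalEventListener.
    """

    def __init__(self):
        super().__init__()
        self.events = []
        self._in_run = False

    def _close_run(self):
        # Any non-character event ends the current run of text.
        if self._in_run:
            self.events.append(("end_characters",))
            self._in_run = False

    def startElement(self, name, attrs):
        self._close_run()
        self.events.append(("start_element", name))

    def endElement(self, name):
        self._close_run()
        self.events.append(("end_element", name))

    def characters(self, content):
        if not self._in_run:
            self.events.append(("start_characters",))
            self._in_run = True
        self.events.append(("characters", content))


recorder = CharacterRunFilter()
xml.sax.parseString(b"<a>hi<b/>there</a>", recorder)
for event in recorder.events:
    print(event)
```

The sketch shows that bracketing can be layered on top of the existing events. It does not cover the first suggestion: nothing in these events distinguishes an empty-element tag from a start tag followed at once by an end tag, so that information would have to come from the parser itself.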
From andrewl at microsoft.com Fri Apr 2 00:02:43 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:10:56 2004
Subject: RTF to XML Conversion
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF245@RED-MSG-08>

Sean McGrath wrote: "I have a program pretty much ready to go that will take arbitrary RTF + a DTD + a narrative description of "the right markup to use" and generate excellent XML."

You are probably infringing on intellectual property I developed years ago while at Virginia Polytechnic Institute and State University: the "//RUNRIGHT" program which would take any somewhat-correct program, plus samples of desired output and then yield a fully-corrected program. I assume you are using Bourbaki programming. If so, we need to talk.

From jborden at mediaone.net Fri Apr 2 00:08:34 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:10:56 2004
Subject: Between raw and cooked II: Are?
DTDs are just for validation
Message-ID: <05cf01be7c8b$2c853ba0$0b2e249b@fileroom.Synapse>

David Megginson wrote:

>Jonathan Borden writes:
>
> > Beyond requiring that external entities and default attributes be
> > expanded, is there a way to allow non- and validating parsers to
> > process the same XML documents in a functionally similar fashion,
> > that is, the same SAX events be fired or the same DOM tree be
> > constructed whether or not validation is employed?
>
>I'm hoping to have that worked out in the new core SAX2 features.
>AElfred is one good example of a non-validating parser that reads
>external entities and the external DTD subset.
>

Yes, but this behavior is up to the parser and this is the problem (getting back to IE5's default behavior). This behavior, as implemented by AElfred, IE5 etc. has no 'official' status in the XML spec, merely being described as "non-validating+"...

We need a simple term to describe parsers which either 1) fire a standard series of SAX events and/or 2) construct identical DOM trees given a single XML document. Perhaps:

well-behaved

Jonathan
From cbullard at hiwaay.net Fri Apr 2 01:23:38 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun 7 17:10:56 2004
Subject: DTDs are just for validation
References: <000301be7bf1$54d91440$31aa97cc@server.total.net> <3703049D.4AE3A3CD@allette.com.au> <14083.20322.923520.236861@localhost.localdomain>
Message-ID: <3703FF40.4DC0@hiwaay.net>

David Megginson wrote:
>
> Marcus Carr writes:
>
> > Surely the overhead of looking at the DTD outweighs the benefit
> > (none) obtained in rendering the above example?
>
> There *is* a potentially nasty problem lurking here: the DTD may
> contain default values for attributes as well as validation
> information. In the SGML version of DocBook, there is not a problem,
> but what if the new version of DocBook had something like this?
>
> xmlns CDATA #FIXED "http://www.oasis-open.org/docbook/">
>
> I suspect that industry practice will be always to run XML through a
> normaliser before publishing, so that the attribute default values get
> plugged right into the instance.

There is no common practice to depend on and no means of specifying common support. Great.

>From the X3D Contributors list, part of a design discussion about using XML for 3D, where the issue is whether it is necessary to use XML syntax or whether hooks can be built into VRML97.

"Bullard, Claude L (Len)" wrote:
>
>...
> So I guess the following may be what you are after
>
> defname ID #IMPLIED
> myVRMLThang NAME #FIXED "/whereItIs/itIs.wrl"
>
> .... more Thangs
>

>From Chris Marrin:

Yes, we are almost there.
Let's say I have the following VRML PROTO:

    PROTO myX3DThang [ eventIn SFColor changeColor ] {
      Shape {
        appearance Appearance {
          material Material { diffuseColor IS changeColor }
        }
        geometry Sphere { }
      }
    }

This is in the file itIs.wrl. I would have to add the changeColor field to the DTD for myX3DThang. My DTD syntax is poor so I will leave the specifics as an exercise for the reader. Now I can say the following in a script:

    myThang.changeColor = "1 0 0"

to change the sphere to red, correct? If that is true, then we have a fine syntactic connection. The remaining question is, how do I write an implementation that will actually render this?

Once that is solved, the big design issues arise. How do I communicate data OUT of the PROTO? How do I propagate style into the PROTO? How do I get the VRML model of eventIn/eventOut/field/exposedField to the XML notion of attributes? How does the DOM concept of event listening fit in here? These are all design issues and hard decisions will have to be made...

From DuCharmR at moodys.com Fri Apr 2 02:11:01 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:10:56 2004
Subject: SGML and XML
Message-ID: <49092BAEAC84D2119B0600805FD40F9F120F1E@MDYNYCMSX1>

>who can tell me the major difference between the SGML and XML, or
>where can I find information

See James Clark's "Comparison of SGML and XML" at http://www.w3.org/TR/NOTE-sgml-xml-971215.

Bob DuCharme www.snee.com/bob
see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall.
From Marc.McDonald at Design-Intelligence.com Fri Apr 2 02:35:21 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun 7 17:10:57 2004
Subject: Between raw and cooked II: Are? DTDs are just for validation
Message-ID:

Perhaps DTDs are being used for too many purposes - validation and defaulting attributes/defining entities.

The argument is made that once a document has been validated, there is no need to validate it again in a parser. Hence the concept of a conforming rather than validating parser. This is a good idea, but the details of attribute defaults and entity definitions get in the way.

So, let's divorce the idea of validity from parsing. Instead of using a DTD use a URI that identifies the structure that the document conforms to. A DTD cannot describe all of the restrictions on the structure of elements in a document; the pattern syntax is too limiting. It may take the combination of validating against a DTD and then an application examining the resultant tree to truly define validity. There's no way to specify the set of valid zip codes or Visa card numbers in a DTD, but an application could verify them.

A document may still reference a DTD, but it contains default attribute values and entity definitions, not element structure. The document doesn't declare how it is parsed (valid or conforming). The processing application that receives the document controls parsing. It may just request conformance parsing and its own code may default attributes and expand entities.
Or, it may instruct the parser to parse according to an application specified DTD that the application knows corresponds to the URI in the document.

A URI identifying element structure does not have to have a corresponding DTD. It may describe an application that has been coded to process it, such as \\IRS\1998\ScheduleD.

Under this model:

1. Conforming parsers additionally can parse a DTD but only attribute and entity declarations.
2. A document can certify it conforms to a structure identified by a URI (certificate of authenticity). An application may be able to associate the URI with a DTD, or the URI may select an application that understands the structure.
3. A validating parser can have a DTD specified to it by the application using the parser and will use the element structure definitions in the DTD to validate the document.

A little food for thought,

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com

----------
From: Didier PH Martin [SMTP:martind@netfolder.com]
Sent: Thursday, April 01, 1999 7:53 AM
To: 'XML Dev'
Subject: RE: Between raw and cooked II: Are? DTDs are just for validation

HI Jonathan,

If DTDs *were* only for validation there would be no issue here. However DTDs provide additional functionality beyond validation, namely default attributes and entities. The problem exists in that XML parsers can *choose* whether or not to validate and in so doing the information content of the XML document is altered. Validation is optional. Says so. Given this, the question becomes: ought parsers be allowed to expand entities and default attributes with validation turned off? What problem does this create?

Perhaps the XML spec should properly specify that: *if* a DOCTYPE declaration is present which specifies a DTD then the document must be validated else the parser must generate an error. (DOCTYPE declarations would remain optional).
In this way document authors would be able to properly specify information content.

Thanks for bringing back the issue at its source: the spec. The spec says nothing about how to interpret a document. It just says how a document is to be formatted, not how it is to be interpreted. Now that real stuff is going out we see that there are holes in the architecture. The holes being: what do we do with this? The answer depends on the type of interpreter, such as:

a) browsers
b) ERP front ends and back ends
c) repositories
d) any other stuff I am not thinking of right now

There are no specs on how to interpret or parse a document in the context of a browser. Your suggestion is a constructive one. You propose that the next spec version reduce the ambiguity at the parsing stage by including the parsing rule in the spec. The spec should also reduce the ambiguity with external references, so to speak, by explicitly stating whether a parser should treat the presence of a DTD as a signal to validate the document. Currently it is left at the mercy of the implementer and no specifications are available to dictate the rules of conduct.

Thanks Jonathan for a constructive comment. Any other constructive opinion? I mean here, any suggestions concerning the rules or more specifically the specs?

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Paul.Ananth at barclaysglobal.com Fri Apr 2 03:19:05 1999 From: Paul.Ananth at barclaysglobal.com (Ananth, Paul BGI SF) Date: Mon Jun 7 17:10:57 2004 Subject: How can I extract data from a XML Document Message-ID: Hi all, IF this is a beginner question or it is in the FAQ please let me know. I have a XML document. 1001 Pending 45669 1 Professional 449.95 22257 1 Modem Cable 17.95 47839 1 Port Scanner W/ Software 264.95 732.85 How can I extract the total amount from this file. How can I find there are three lineitems to process. (PurchaseOrder.lineitem.length)??? Thanks -Paul xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Apr 2 03:48:08 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:10:57 2004 Subject: How can I extract data from a XML Document Message-ID: <3.0.32.19990401174734.00c4d9b0@pop.intergate.bc.ca> At 05:13 PM 4/1/99 -0800, Ananth, Paul BGI SF wrote: >I have a XML document. ... >How can I extract the total amount from this file. How can I find there are >three lineitems to process. (PurchaseOrder.lineitem.length)??? Use perl. -T. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ricko at allette.com.au Fri Apr 2 04:47:47 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:10:57 2004
Subject: Between raw and cooked II: Are? DTDs are just for validation
Message-ID: <004201be7cab$1d2701a0$0df96d8c@NT.JELLIFFE.COM.AU>

From: Marc.McDonald@Design-Intelligence.com
>The argument is made that once a document has been validated, there is
>no need to validate it again in a parser.

One aim of XML was that documents should be parsed without DTDs. But it is useful that constants can be removed to a header.

But a document is a living and organic thing: one of the key insights, to me, from SGML, is that a document's type can also include its future allowed values (in a basic domain, namely element structure). So removing content models from whatever header format is used is fine if you have a terminal document and you are only interested in the structure of that particular document, but it is not OK if you assume that the document may be altered at various stages and that you need to constrain the structures of all possible documents (of that type) to some extent (e.g., to prevent duplicated element types, to enforce controlled vocabularies for element type names, to disallow pathological structures).

The assumption of editability or non-editability changes everything. Perhaps this is another manifestation of the great literature-versus-database divide.
(SGML's problem was that computer science theory did not (seem to) have enough to say about how to handle content models: Fuji-Xerox's Murata Makoto's paper at XTech suggested that, in fact, there is relevant CS theory which may be very useful in bringing us forward.)

Rick Jelliffe

From simonstl at simonstl.com Fri Apr 2 19:26:33 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:10:57 2004
Subject: XML for EDGAR (stock database)
Message-ID: <4.0.1.19990402120824.00e89b80@207.211.141.31>

There's a story on Wired News - doesn't look like an April Fool's joke.

http://www.wired.com/news/news/business/story/18911.html

Basically, Invisible Worlds is setting up a mirror of the 40GB Securities and Exchange Commission database of corporate files and using XML to manage it. Also mentioned, something similar for IETF RFCs.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

From gkholman at CraneSoftwrights.com Fri Apr 2 19:33:29 1999 From: gkholman at CraneSoftwrights.com (G.
Ken Holman) Date: Mon Jun 7 17:10:57 2004 Subject: XML <-> non-XML filter project In-Reply-To: <00cd01be7a6c$0d879ac0$0300000a@cygnus.uwa.edu.au> Message-ID: At 99/03/30 13:13 +0800, James Tauber wrote: >Earlier this month, I posted the following to XSL-LIST. With apologies to >those who received it there, I'm posting it (modified) here to see if anyone >is interested in some co-operative effort in this area. > >What I would like to see is people taking existing non-XML formats and >developing: > > a) a URI for the non-XML format (for notations and for the namespace of >the XML format) > b) a DTD representing the existing non-XML format > c) an output filter to convert documents conforming to the DTD into the >non-XML format > d) (possibly) an input filter to convert the non-XML format into XML >... >I would personally find great value in this being done for Makefiles, >procmail files, simple shell scripts and PalmPilot databases. Others of >value I can think of include Windows INI files, Unix mailboxes, your >favourite programming language... I'm sorry I didn't notice it when reading XSL-list, but I found this last night on XML-DEV, so I'll post my response to both lists ... apologies in advance for the duplicates. The subject line implies *both* directions XML<->non-XML ... but your prose leans towards only XML->non-XML. I've just recently added this to my XSL training materials (X-Tech attendees didn't see it, WWW8 attendees will see it) because I have since successfully used XML and XSL to produce text-only files (including batch files, control files, etc.) using an environment created by James Clark (many thanks, James!) for his XT program: At Sun, 17 Jan 1999 10:34:34 +0700 James Clark wrote: ====8<---- Here's what the DTD for such a result namespace might look like: The nxml element is the root element; the encoding attribute is a MIME charset to be using for encoding characters as bytes. The data element contains data. 
Within a data element control characters get escaped. The escape element specifies how a particular control character gets escaped. The control element contains control information. Within a control element, all characters are output directly without escaping. The char element allows the output of a character that is not allowed by XML (such as control-L).
====8<----

The encoding= attribute works with the character set encodings supported by the Java engine running XT ... unfortunately, I haven't found a list of encodings for XT.EXE (Microsoft VM).

The character sets that I think I'll need personally for all my text-only work are ISO-8859-1 (Latin 1), IBM Code Page 850 and UTF-8. From the list of character sets in:

ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets

... I found through trial and error that for the Symantec Java environment these are named "Latin1", "IBM85O" and "UTF8" respectively.

HELP!!!! - Can anyone help me find the reference list of these (and other) character encodings supported by the Microsoft Java VM?

Attached is the sample I wrote to help myself understand the features of the namespace. Once I found the encodings, I richly marked up in XML the source material for a number of simple text files and I now use XT to emit from the XML by using this namespace.

So far it has covered what I personally need to emit non-XML text. I haven't yet needed to emit accented characters, but I'm ready with the encodings for my Symantec environment ... I'm hoping someone can help me find the encodings for the Microsoft Java VM.

I hope this helps.

......... Ken

P:\jclark>type nxml.xsl \\

- \

P:\jclark>type nxml.xml This is a test with a backslash \ and eacute ?? in it - plus the latin-1 for eacute as well P:\jclark>call xsljava nxml.xml nxml.xsl nxml.txt P:\jclark>type nxml.txt This is a test with a backslash \\ and eacute ? in it - plus the latin-1 for eacute \233-?\ as well P:\jclark> -- G. Ken Holman mailto:gkholman@CraneSoftwrights.com Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/ Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (Fax:-0995) Website: XSL/XML/DSSSL/SGML services outline, XSL/DSSSL shareware, stylesheet resource library, conference training schedule, commercial stylesheet training materials, on-line XSL CBT. Next instructor-led XSL Training: WWW8:1999-05-11 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Fri Apr 2 19:40:17 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:10:57 2004 Subject: SAX2: DTDDeclHandler (minimalist position) In-Reply-To: <000b01be7b8f$22edc460$c8a8a8c0@thing1> Message-ID: At 10:57 AM -0500 3/31/99, Bill la Forge wrote: >From: Elliotte Rusty Harold >>>Using objects for constants can also cause problems with persistent >>>data, if you were depending on a singularity and testing with ==. >>> >> >>This isn't a problem with the syntax I've described because there is only a >>fixed set of objects in which identity comparisons are the same as equality >>comparisons. > > >How do you maintain singularities when deserializing a JavaBean which >contains a reference to one of these objects? > >That is to say, you have a constant which references an object. No problem. 
> >Now you have a bean with a variable which has been assigned the constant >value. No problem. > >Now you save the bean. No problem. > >Now you deserialize the bean. No problem. > >Now you test the value of the variable in the bean with ==. Woops. The test >always returns false. > You can use custom readObject() and writeObject() methods to keep singletons single if necessary. I discuss this very example (with a different singleton) in Chapter 11 of Java I/O. +-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Fri Apr 2 19:41:58 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:10:57 2004 Subject: How can I extract data from a XML Document References: Message-ID: <37050055.196FC7B5@eng.sun.com> > How can I extract the total amount from this file. Use DOM; there's a Document.getElementsByTagName() call that may be the fastest way to navigate to it in this instance. > How can I find there are > three lineitems to process. (PurchaseOrder.lineitem.length)??? 
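A minimal Java sketch of the DOM calls suggested in this reply. The element names in the original post were lost in the archive, so PurchaseOrder, LineItem, Price and Total below are assumptions, as is the PurchaseOrderDom class itself; only getElementsByTagName() and getLength() come from the thread.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PurchaseOrderDom {
    // Hypothetical markup: the tags of the posted document did not survive,
    // so these element names are assumed for illustration only.
    static final String SAMPLE =
        "<PurchaseOrder>"
      + "<LineItem><Price>449.95</Price></LineItem>"
      + "<LineItem><Price>17.95</Price></LineItem>"
      + "<LineItem><Price>264.95</Price></LineItem>"
      + "<Total>732.85</Total>"
      + "</PurchaseOrder>";

    // Parse an XML string into a DOM Document with the JDK's default parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Number of LineItem elements anywhere in the document.
    static int countLineItems(Document doc) {
        return doc.getElementsByTagName("LineItem").getLength();
    }

    // Text of the first Total element, or null if there is none.
    static String total(Document doc) {
        NodeList totals = doc.getElementsByTagName("Total");
        return totals.getLength() == 0 ? null : totals.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(countLineItems(doc) + " line items, total " + total(doc));
    }
}
```

getElementsByTagName() searches the whole tree, which is convenient for a small document like this one but would also pick up same-named elements nested elsewhere.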
DOM -- Document.getElementsByTagName("LineItem").getLength().

Not that DOM is perfect, but it's the best place to start at this time.

- Dave

From db at eng.sun.com Fri Apr 2 19:51:02 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:10:57 2004
Subject: com.sun,xml.parser LexicalEventListener improvements
References: <3703DE62.81152425@GORGE.NET>
Message-ID: <3705026E.A80C02DE@eng.sun.com>

Actually, that was for the context of the SAX2 discussions ... there's a thread I need to read (I was out on vacation while it was happening!) at

http://www.lists.ic.ac.uk/hypermail/xml-dev/9903/0580.html

The API mentioned below was essentially an earlier version of the proposal noted above. I expect that the standard extension will follow SAX2 ... although I could get surprised on either front.

So let me re-cast the question: should the SAX2 LexicalHandler stuff work as suggested below?

- Dave

Peter Wilson wrote:
>
> I believe that the class LexicalEventListener is the proposed basis for
> the new javax.xml extension. In private conversation with Dave Brownell
> of Sun Microsystems I made the suggestions below. While refusing these
> ideas Dave suggested that I poll the users on XML-DEV to determine their
> reactions. I hope you will agree with these suggestions and contact Dave
>
> via xml-feedback@java.sun.com.
>
> 1. The startElement() method should indicate if it is an empty element.
> i.e. whether the element ends with a /> tag or not.
>
> 2. The proposed LexicalEventListener interface should have new methods
> startPCDATA() and endPCDATA().
Calls to these methods would bracket > calls to the current characters(...) method. The LexicalEventListener > interface already contains bracketing calls for start/endCDATA - why the > inconsistency? > > > Better yet, the characters(..) method should be split into two: > CDATA(...) and PCDATA(..). The two method sets would then be > startPCDATA(), PCDATA()*, endPCDATA. ditto for CDATA. > > > > Alternatively a single set: startCharacters(), characters(), > endCharacters() > could be used with a flag on startCharacters to indicate parsed or > unparsed text. > > > Dave argues that these method calls may be implemented by writing an > event filter to restructure the events as required. By this logic, > current extensions to LexicalEventListener are all equally pointless. > Was the addition of the start/endCDATA methods added solely for his > convenience in implementing XMLDocumentBuilder? Using the same argument > they should not be cluttering up the new interface. > > My argument is that the new XML parsing facilities should not be skewed > by the need of one application (e.g. building Dom models). This is best > achieved by structuring lexical events for ALL syntactic structures. > The current LexicalEventListener interface is a step in the right > direction but is not complete. > > Peter Wilson > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Fri Apr 2 20:20:12 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:57 2004 Subject: XML is broken (was Re: Why Doesn't IE5 use the DTD to Validate?) 
In-Reply-To: <37041B6A.5DD06D69@jclark.com> References: <001301be7c4b$a9896da0$31aa97cc@server.total.net> <3703F4D4.2226EBDB@w3.org> Message-ID: <199904021819.NAA05002@hesketh.net> At 08:20 AM 4/2/99 +0700, James Clark wrote on XSL-list: >So what is this switch? The DOCTYPE declaration? The DOCTYPE >declaration unless it's just an internal subset containing entity >declarations? What if I have default attributes declared as well? What >if I have so many entities that I use an external subset instead? Where >does the XML spec mention such a switch? > >I know Microsoft-bashing is good, clean fun, but actually they've done >the right thing here. Well, if IE 5 isn't broken, maybe it's time to consider (and discuss) whether the XML spec isn't broken, and badly. Validation is something that happens or it doesn't, depending on the whim of the application. Reading external resources is something that happens or it doesn't, again depending on the whim of the application. (That whim is slightly constrained by requiring validating parsers to read external resources.) Namespace support is something that happens or it doesn't at the whim of the application, and interactions with validation depend on another set of whims. On top of that, documents are free to identify themselves with any DTD they like and then create their own world in the internal subset. Is this really worth bothering with? After writing four books discussing the subject, I have to wonder more and more if validation and all the tools surrounding it aren't simply too broken to be useful. Validation as concept is great - applications can hand off certain types of processing to components, and everyone uses the same set of tools (schemas/DTDs) to describe what's supposed to be in those documents. 
Unfortunately, validation as implemented in XML is a painful joke: underpowered (no data typing), overpowered (attribute defaulting is a great idea, but doesn't always work in a nonvalidating environment), complicated (internal/external subset issues, not to mention IGNORE/INCLUDE), not reliable (since applications may or may not bother, and documents can change the rules anytime anyway), not constrained by 'industry practice' (since there isn't any consensus), and subject to a lot of intricate rules that take a long time to master.

A better validation approach would:

* Not interfere with well-formed documents (attribute defaulting done differently)
* Provide a simple mechanism for documents to identify their type, not all the details about their structure.
* Be reliable. Applications could control how documents are validated, instead of relying on the document to provide them with a roadmap.
* Describe more than just text and elements.
* Allow supporting tools (like XSL and XLink, which benefit greatly from a validating environment) to demand validation of documents against schemas before attempting processing.

The current solution is an enormous mess, one that threatens to make validation a useless discard. I've complained about this to some extent previously (in the Layered Model document, and on XML-dev), but it's becoming a sorer point every time I encounter it, which happens pretty regularly. Maybe the schemas group can fix this, or maybe we should just chuck the declaration end of XML entirely.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Fri Apr 2 20:47:20 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:10:57 2004 Subject: Elements-Attributes-Data (was RE: SAX2 RFD: LexicalHandler draft v.1.1) References: <001901be7616$8e4caba0$c8a8a8c0@thing1> <001401be76c3$e0fd27a0$0100007f@eps.inso.com> <14074.38225.755412.932105@localhost.localdomain> Message-ID: <37050FAC.A00BD57E@eng.sun.com> David Megginson wrote: > > Actually, apps need to know about error messages too, but that wrecks > the litany. Everything else should be taken care of invisibly by the > parser. public abstract void repeatTillDone (Elements, Attributes, Data) throws SAXException; :-) Speaking of which, and surely a rats nest but one that's worth at least bringing up: does anyone think there is more to be standardized in the area of exceptions/diagnostics than just the warning/error/fatal distinction we have now? To elaborate a bit: normally, one wants to catch exceptions and recover from them to some degree. Different exception, different recovery -- if the peer closed the socket cleanly, there's probably been no error, but other sorts of I/O exceptions are trouble. There might be such issues with XML too; probably will be, over time. I can't think of any such issues related to parsing XML, at least right now I can't, but I'd like to know if anyone else has any. - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Fri Apr 2 20:49:55 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:10:57 2004 Subject: IE5.0 does not conform to RFC2376 References: <199903310453.AA00111@archlute.apsdc.ksp.fujixerox.co.jp> Message-ID: <3705103D.6CA71D19@eng.sun.com> MURATA Makoto wrote: > > David Brownell wrote: > > Again, no it doesn't. The idea is to get the web server to > > attach the correct MIME content type, which is NOT "text/xml" > > in many/most cases. Authors must rely on the administrator > > not breaking their content, and this is part of it. > > "application/xml" is appropriate for some XML data. On the other > hand, if you do not want to miss fallback to text/plain, "text/xml" > is the right choice. True -- but if there's one basic rule that seems safer than another, it's "default to application/xml" rather than "assume ASCII and stick to text/xml"! :-) - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From roddey at us.ibm.com Fri Apr 2 20:50:38 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:10:57 2004
Subject: com.sun,xml.parser LexicalEventListener improvements
Message-ID: <87256747.00675DD1.00@d53mta03h.boulder.ibm.com>

>I believe that the class LexicalEventListener is the proposed basis for
>the new javax.xml extension. In private conversation with Dave Brownell
>of Sun Microsystems I made the suggestions below. While refusing these
>ideas Dave suggested that I poll the users on XML-DEV to determine their
>reactions. I hope you will agree with these suggestions and contact Dave
>
>via xml-feedback@java.sun.com.
>
>1. The startElement() method should indicate if it is an empty element.
> i.e. whether the element ends with a /> tag or not.
>

I do this currently in my internal callbacks and believe it's the best thing to do, since it saves an extra callback and allows for better recreation of the original document (I just can't pass that info on via SAX right now). I would definitely vote strongly for this.

From roddey at us.ibm.com Fri Apr 2 20:55:25 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:10:57 2004
Subject: Between raw and cooked II: Are?
DTDs are just for validation
Message-ID: <87256747.0067CCE3.00@d53mta03h.boulder.ibm.com>

>aha!! that changes things :-)
>
>I had incorrectly assumed you were making an argument that DTDs
> *ought* only be used for validation to prevent the problem we
> have identified.
>

On that subject, I think it would have cleared up a lot of confusion if there had been separate mechanisms for defining replacement texts, notations, etc., and for the structural description of the target documents. Perhaps the schema world could fix this, but if the fix is that the DTD gets kept just for the non-structural stuff and the schema provides the structural stuff, I'm not so sure that that would be all that great (it certainly wouldn't make XML any easier for the user or developer, IMHO).

From roddey at us.ibm.com Fri Apr 2 20:56:25 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:10:57 2004
Subject: Between raw and cooked II: Are? DTDs are just for validation
Message-ID: <87256747.0067E644.00@d53mta03h.boulder.ibm.com>

>>David Megginson wrote:
>>
>>There *is* a potentially nasty problem lurking here: the DTD may
>>contain default values for attributes as well as validation
>>information.
>
> If DTDs *were* only for validation there would be no issue here. However
>DTDs provide additional functionality beyond validation, namely default
>attributes and entities. The problem exists in that XML parsers can *choose*
>whether or not to validate and in so doing the information content
>of the XML document is altered.
>
> Validation is optional. Says so.
Given this, the question becomes: ought
>parsers be allowed to expand entities and default attributes with validation
>turned off? What problem does this create?
>

Personally I think that the only thing that makes sense for the vast majority of situations is that the DTD is parsed if present and its (non-structural content) information is used, regardless of whether actual validation is done. Validation should be requested separately from the presence of the DTD, because of the DTD's overloaded use. This is the way the new IBM parsers work, and I think it's the correct thing to do. Anything less than that should also be something that is specifically requested because otherwise it would probably just confuse the user.

From db at eng.sun.com Fri Apr 2 21:10:13 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:10:58 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <14068.24150.843634.988657@localhost.localdomain>
Message-ID: <370514B8.81193219@eng.sun.com>

I'd have responded sooner, but this discussion started on the day I left for some vacation ... :-)

Note that some of this feedback comes from having implemented versions of this functionality and from user feedback on it. (Based on earlier discussions, some on xml-dev.) That's in the latest parser from Sun (TR1); some folk might care to play with that code a bit. (There's also a version of DTDHandler extensions too -- essential! :-)

Short summary: the basic idea is still right, though I think the DTD related stuff should be done a bit differently.
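For readers who want to try these callbacks, here is a minimal sketch assuming the org.xml.sax.ext.LexicalHandler interface that ships with current JDKs. It is not the v1.1 draft quoted below: there is no xmlDecl(), and comment() takes a character array. The LexicalEvents class and its record() helper are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

// Records the lexical events a SAX parser reports, in document order.
class LexicalEvents extends DefaultHandler implements LexicalHandler {
    final List<String> events = new ArrayList<>();

    public void startDTD(String name, String publicId, String systemId) {
        events.add("startDTD " + name);
    }
    public void endDTD() { events.add("endDTD"); }
    public void startEntity(String name) { events.add("startEntity " + name); }
    public void endEntity(String name) { events.add("endEntity " + name); }
    public void startCDATA() { events.add("startCDATA"); }
    public void endCDATA() { events.add("endCDATA"); }
    public void comment(char[] ch, int start, int length) {
        events.add("comment " + new String(ch, start, length));
    }

    // Hypothetical helper: parse a string and return the recorded events.
    static List<String> record(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        LexicalEvents handler = new LexicalEvents();
        reader.setContentHandler(handler);
        // Lexical handlers are registered as a SAX2 property, not via a setter.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc [<!ENTITY e \"x\">]>"
                   + "<doc><!--note--><![CDATA[a<b]]>&e;</doc>";
        for (String event : record(xml)) {
            System.out.println(event);
        }
    }
}
```

Which entity boundaries get reported can vary between parsers, so code that needs to round-trip a document should not assume more than the parser in use actually delivers.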
- Dave

David Megginson wrote:
>
> // LexicalHandler.java
> // $Id: LexicalHandler.java,v 1.1 1999/03/21 02:49:41 david Exp $
> // SAX2 handlerID: http://xml.org/sax/handlers/lexical
>
> package org.xml.sax;
>
> public interface LexicalHandler
> {
>   public abstract void xmlDecl (String version,
>                                 String encoding,
>                                 String standalone)
>     throws SAXException;

I'd far prefer to drop XML declarations; if they're to be provided, I'd rather see a general text declaration facility (version and encoding) applying to all parsed entities.

Then, standalone would look like the special case it is; perhaps with a callback just for that boolean value, when it's even provided. (Standalone is trivalue: yes, no, and unspecified.)

>   public abstract void startDTD (String doctype,
>                                  String publicID,
>                                  String systemID)
>     throws SAXException;
>
>   public abstract void endDTD ()
>     throws SAXException;

These IMHO belong in the DTDHandler2 interface !

Also, we've found it essential to see the internal subset; it's most practical to report it as a single string. If one can't see that subset, one can't plan to round-trip the data in a document, and the ability to do that sort of round-trip is critically important. (Even though some folk want more data to pass through than others -- e.g. many don't care about CDATA boundaries, comments, etc.)

In fact, what Sun did for this functionality was to partition it into three things (in DTD callbacks):

    startDtd (String rootName)
    endDtd ()
        ... "start" has the declared root name

    externalDtdDecl (String publicID, String systemID)
        ... just for the unnamed [dtd] PE

    internalDtdDecl (String internalSubset)
        ... the literal internal subset

This permits "safe" and complete recreation of the doctype declaration.

>   public abstract void startEntity (String name)
>     throws SAXException;
>
>   public abstract void endEntity (String name)
>     throws SAXException;

Right ... except that we pass a boolean "included" flag with the startEntity() call to meet the XML 1.0 specification requirement to report entities that aren't included (e.g. a nonvalidating parser of some types). To "pass through" one needs to be able to reproduce all entity refs, and the flag is needed to distinguish entities with no content from ones which just weren't read.

As I noted earlier, and James did more recently, this can't apply to entities in attribute values. It needs to be specified/documented accordingly -- these callbacks must only apply to content. (I'll look at the proposal for attribute handling later.)

There was also the issue of whether this is a general or a parameter entity ... we took the position that for sanity, we'd only present _general_ entities this way. For example, PEs inside markup declarations would be pretty useless. PE/DTD parsing can be a separate ("SAX3"? :-) set of features, and with any luck the popular tools will develop using XML-syntax schemas rather than PEs and that "SAX3" module won't ever need to happen; it'd need to be messy.

>   public abstract void comment (String text)
>     throws SAXException;
>
>   public abstract void startCDATA ()
>     throws SAXException;
>
>   public abstract void endCDATA ()
>     throws SAXException;

Right, all this is basically needed in that form.

> }
>
> // end of LexicalHandler.java

From db at eng.sun.com Fri Apr 2 22:42:17 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:10:58 2004
Subject: Between raw and cooked II: Are?
DTDs are just for validation References: <050f01be7c6f$de7f6ea0$0b2e249b@fileroom.Synapse> <14083.50118.346144.196246@localhost.localdomain> Message-ID: <370529E1.24C976D8@eng.sun.com> > Jonathan Borden writes: > > > Beyond requiring that external entities and default attributes be > > expanded, is there a way to allow non- and validating parsers to > > process the same XML documents in a functionally similar fashion, > > that is, the same SAX events be fired or the same DOM tree be > > constructed whether or not validation is employed? Absolutely: when a nonvalidating parser reads all external entities, it behaves almost exactly like a validating parser that's configured to ignore validity errors. Differences are minor ... ignorable whitespace must be reported as such by the validating parser, as must unparsed entities. But nonvalidating parsers are free to report this info through SAX, and quite a lot of them do. Aelfred was the first, and Sun's does too (both validating and nonvalidating options). - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Fri Apr 2 22:49:17 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:10:58 2004 Subject: Is validity an option? References: <7847B57C7C96D2119DBE00A0C96F64B6206DAE@cen1.cen.com> <37038A14.1FF1AC0A@prescod.net> <14083.38518.955372.918892@localhost.localdomain> <3703AE26.FC48043C@prescod.net> Message-ID: <37052C3D.9C677A5@eng.sun.com> > > SAX2, on the other hand, can take a stab classifying > > its parsers (as could the DOM). > > That would be helpful. The OASIS XML conformance working group has talked about this a bit ... 
it turns out that while there's only one type of validating processor, there are at least four types of nonvalidating ones based on what external entities they read:

    parameter entities ... then they normalize attributes and expand entities

    general entities ... then they don't drop content

    both ... gee, they're almost the same as a validating parser that doesn't report validity errors, and might not report ignorable whitespace or unparsed entities (but they could do the latter if they want to)

    neither ... not all that interesting :-)

Arguably one can make inclusion be conditional, but that rapidly gets nonsensical!

I think that the "both" case is the most useful one in terms of "write once, run anywhere" portable code.

- dave

From ckaiman at i3solutions.com Fri Apr 2 23:11:49 1999
From: ckaiman at i3solutions.com (Charlie Kaiman)
Date: Mon Jun 7 17:10:58 2004
Subject: DB Industry answer to Oracle 8i is ...
Message-ID: <01BE7D23.87A61B10.ckaiman@i3solutions.com>

I just finished reading an excellent whitepaper on Oracle 8i, and it seems to me that they are way ahead of other industry leaders like Sybase and Microsoft (in terms of working with XML as an actual datasource). It appears that their initiative allows XML to be broken down using descriptors (equivalent to MS's XML Schema ????), and stored as relational data, as a text 'blob' (a long string of XML tags, I assume ????), or both. The whitepaper can be found at:

http://www.oracle.com/xml/documents/xml_twp/

What I'm wondering is ... how are other companies, like Sybase, MS, etc.
planning on implementing XML on the back-end? We've seen a lot in terms of rendering on the client (no doubt an important aspect), but very little on the back-end strategies. Can anyone point me in the right direction, in terms of learning more about this? Thanks much. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Apr 2 23:34:40 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:10:58 2004 Subject: XCatalog -> XML Catalog Message-ID: <370537D4.EE6FD4E0@locke.ccil.org> Due to a name conflict with an established product, I am now referring to XCatalog using the generic name "XML Catalogs". The URL is unchanged: http://www.ccil.org/~cowan/XML/XCatalog.html . -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Fri Apr 2 23:43:25 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:10:58 2004 Subject: XML is broken (was Re: Why Doesn't IE5 use the DTD toValidate?) 
In-Reply-To: <370529A7.9911C86D@us.dhl.com>
References: <001301be7c4b$a9896da0$31aa97cc@server.total.net> <3703F4D4.2226EBDB@w3.org> <199904021819.NAA05002@hesketh.net>
Message-ID: <199904022143.QAA08974@hesketh.net>

At 12:33 PM 4/2/99 -0800, Sara Mitchell wrote:
>James Clark's responses on the issue have cleared up the issue from
>my perspective. I agree that the XML spec is not as explicit as
>it should be on what forces a validating parser to validate and that
>has allowed Microsoft to slide. But please don't suggest a whole
>new set of rules!

When the old rules don't work, it's time for a new set. Chuck the old, build the new, and don't feel bad about the transition. Accommodating the old is fine, but keeping the old at the expense of the new is going to cost XML a lot. Politically, I can see why the W3C wants XML to be stable, but at some point the cost of stability is higher than the cost of change.

>We don't need another set of rules. I understand that much of this
>is awkward to people who are new to XML and it may not be clear
>why some things have to be as complicated as they are. But adding
>on a new set of rules just makes it more complicated, not less
>so.

I'm hardly new to XML; it's just taken me about two years to come to the conclusion that some things are irreparably broken. The validating/non-validating, external resources/no external resources, and namespaces/no namespaces issues are poison at the very heart of XML, not just little symptoms that can be brushed away.

>Again, this is part of the strength of SGML and XML. Information needs
>to identify how it should be handled, don't stuff it in a separate
>application! There are certainly cases where it's advisable for a
>receiving application to demand validation -- and other cases where the
>author needs to demand validation. But having that information in
>the document itself is important.

Part of the strength of document-oriented SGML and XML, but a disaster for data-oriented XML.
The past is holding us back, as an old document-centric model denies us the ability to create schemas that have control over the document rather than the other way around.

>> * Describe more than just text and elements.
>
>People are working on this area and it's appropriate for some things.
>But don't demand that documents with information for human consumption
>fit into a more rigid requirement needed for processing data. There
>are two audiences here, and the requirements for the information should
>fit the audience.

Describing data more precisely seems like a win to me, whatever the application area. I have no problem with letting document structures remain just structures of text; I don't demand that every document identify its floating points and currencies, by any means. It does seem, however, that a tighter set of rules would better accommodate both document and data (and mixed) applications.

>> * Allow supporting tools (like XSL and XLink, which benefit greatly from a
>> validating environment) to demand validation of documents against schemas
>> before attempting processing.
>> [SNIP]
>
>This could be done quite simply by clarifying the XML spec to make it
>explicit that any presence of an ELEMENT declaration means that a
>validating parser must validate. Then Microsoft can either step back
>up to the bar or make it clear that IE5 is not a validating
>parser.

That would be a start, but it still leaves many ugly problems unanswered.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Sat Apr 3 00:11:44 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:10:58 2004 Subject: XML query languages and their encodings In-Reply-To: <370395C7.F93481AB@prescod.net>; from Paul Prescod on Thu, Apr 01, 1999 at 09:50:31AM -0600 References: <370395C7.F93481AB@prescod.net> Message-ID: <19990403081128.A5416@io.mds.rmit.edu.au> On Thu, Apr 01, 1999 at 09:50:31AM -0600, Paul Prescod wrote: > Mark Birbeck wrote: > > > > I sort of guessed it might be ;-) I was more getting at the idea of > > context. The following is a 'list of nodes': > > > > Mark > > Tracey > > Jan > > That's exactly my point. That's not a list of nodes. That's a list of XML > elements. Nodes are abstract. Here's a concrete representation for them > (and a containing element) for discussion purposes: > > x= element( gi: "names", > content: > element( gi: "name", content: text( "Mark")) > element( gi: "name", content: text( "Tracey")) > element( gi: "name", content: text( "Jan")) ) > > Now in this abstract model a "list of nodes" is: > > [x.content[0], x.content[1], x] > > Do I know their context? Yes. Do I know their depth? Can I talk about > nodes of different depths? Yes. In this brain-dead simple abstract model > those issues are not complex at all. > > Now if we want to encode these results for transmission between machines > then all of the issues you raise are important. But that is a *separate > issue*. It has nothing to do with the abstract concept of "node list". > > "XML People" are encoding-focused so they always come back to the > encoding. 
That's fine but it is also important to recognize that some > things should be considered in the abstract domain -- like the result sets > of query languages. This is usually, but not always, the case. At times, the abstract and concrete domains interact in nontrivial ways. In our own internal discussions on query models, the issue of determining context has always had an impact on the conceptual model. There are essentially two camps: 1. return just the node (or subtree). 2. return a "pointer" to the node. Returning a pointer allows one to go back to the original document and traverse up, down, back and forth at will. It is the most powerful mechanism. On the other hand, it also involves a considerable amount of network traffic as the user traverses a DOM-style tree across the client-server connection. Returning just the node (or, more generally, returning only what is requested) has the disadvantage of throwing away context. But this is only a problem if your query language can't express your requirements directly. For instance, say you want to know the name of the parent of each node returned. In an SQL-like language, one might express it thus: select **, parent(**).name from docs where //firstname?="Mark" where '**' represents anything returned from the XQL subquery. The parent function is defined to return the parent list of the given nodelist. If the environment provides sufficient support, one could even do this: define function toc(nodes) begin # define a function to return a table-of-contents. # ... end; select **, toc(**) ... This would then return the nodes of a query and the related TOC's. The essence of this approach is that you get back only as much context as you want. You also have the power to express whatever level or complexity of context you want (it also means the server can do the work without hops). Then again, if you are not interested in context, you don't wear the cost of having it. 
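To make the two camps concrete, here is a minimal Java sketch. Everything in it is hypothetical and for illustration only: TreeNode is a toy stand-in (not the DOM API), and QueryHit is an assumed shape for one result row of a query like the "select **, parent(**).name" example above. A live node reference plays the role of the "pointer"; a detached copy loses its parent, so any context the client wants has to be asked for and shipped alongside it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree node for illustration; this is not the DOM API.
final class TreeNode {
    final String name;
    final TreeNode parent;                       // null for a root or a detached copy
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name, TreeNode parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) parent.children.add(this);
    }

    // Camp 1: copy the subtree; the copy has no parent, so context is gone.
    TreeNode detachedCopy() {
        return copyUnder(this, null);
    }

    private static TreeNode copyUnder(TreeNode src, TreeNode newParent) {
        TreeNode n = new TreeNode(src.name, newParent);
        for (TreeNode child : src.children) copyUnder(child, n);
        return n;
    }
}

// One result row for something like "select **, parent(**).name":
// a detached copy plus only the context that was asked for.
final class QueryHit {
    final TreeNode copy;
    final String parentName;

    QueryHit(TreeNode match) {
        this.copy = match.detachedCopy();
        this.parentName = (match.parent == null) ? null : match.parent.name;
    }
}

final class QueryHitDemo {
    public static void main(String[] args) {
        TreeNode names = new TreeNode("names", null);
        TreeNode name = new TreeNode("name", names);
        new TreeNode("firstname", name);

        // Camp 2: a pointer keeps full context and can be traversed upward.
        System.out.println(name.parent.name);

        // Camp 1 plus requested context: the copy is detached, the name travels with it.
        QueryHit hit = new QueryHit(name);
        System.out.println(hit.copy.parent == null);
        System.out.println(hit.parentName);
    }
}
```

The trade-off shows up in QueryHit: the server does the parent lookup once, and the client never needs to walk back across the connection.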
The real (and nontrivial) problem with this approach is that it requires non-portable environment support to express desired results with unlimited expressivity. In practice I think this is a problem people are happy to live with. In the relational world, people using Oracle code up functions in PL/SQL that can be invoked in queries on a regular basis. The point to be made from all this is that your query model will be greatly affected by delivery considerations. It is not necessarily a case of getting the model right and then worrying about how to return results in a practical setting. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From amd0978 at acf3.nyu.edu Sat Apr 3 00:19:32 1999 From: amd0978 at acf3.nyu.edu (Adam M Donahue) Date: Mon Jun 7 17:10:58 2004 Subject: XML is broken (was Re: Why Doesn't IE5 use the DTD toValidate?) In-Reply-To: <199904022143.QAA08974@hesketh.net> Message-ID: > When the old rules don't work, it's time for a new set. Chuck the old, > build the new, and don't feel bad about the transition. Accomodating the > old is fine, but keeping the old at the expense of the new is going to cost xml-dev already developed XML-Schema. Why don't we work on a new XML, this time using a completely open process. XML as it stands is already a terrific base, save the problems you mention. At the very least this would be an exercise to see if these problems can in fact be cleaned up in the ways you suggested. Now I'll duck to avoid the flames this post is sure to generate :-) Adam xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Sat Apr 3 00:25:13 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:10:58 2004 Subject: SAX2: DTDDeclHandler (minimalist position) References: <14074.17776.784121.47587@localhost.localdomain> Message-ID: <370542B9.D5412869@eng.sun.com> David Megginson wrote: > > I'm still shying away from reporting element-type > declarations, at least until someone shows me an easy and concise way > of doing it (in AElfred, I simply provided the content model as a > fully-normalised string). That's what I'd prefer to see, and I really think that this interface should provide element type declarations! public void elementDecl (String name, String model) throws SAXException; Where "fully normalized" means spaces stripped, and PEs expanded. If there are interned constants for the strings "EMPTY" and "ANY", then it's fast to check for those; if the second character of the string is '#' it's mixed content, else it's "children". Also, for the record, I think that the DTD related methods you've currently put into the lexical handler should be here instead. Stuff like startDtd/endDtd, exposing the (optional) unnamed PE, exposing the internal subset (unparsed). (Separate comments on the "modified" handler you posted later.) - dave > ====================8<====================8<==================== > // DTDDeclHandler.java -- receive extended DTD declarations > > package org.xml.sax; > > public interface DTDDeclHandler > { > public final static int ATTRIBUTE_DEFAULTED = 1; This can be implied by a non-null value for "defaultValue". 
> public final static int ATTRIBUTE_IMPLIED = 2; > public final static int ATTRIBUTE_REQUIRED = 3; > public final static int ATTRIBUTE_FIXED = 4; I'd prefer to see these be strings ... constant strings in interfaces are all interned and can be compared as efficiently as integers (use "==" or "!="), but are simpler to use when debugging and programming. "#IMPLIED" etc. and if you write a constant "#IMPLIED" in your code, it's interned to the very same string value. There's a consistency point too: why have integers for these, and strings for the "type" (NMTOKEN, "ENUMERATION", and so on)?? There was the comment about using these values in switch statements ... except for large switches, they often get compiled into if/else/else/fi blocks anyway, so for this small a set of constants, I'm not sure there's a good reason to avoid strings. > public abstract void attributeDecl (String element, > String name, > String type, > String defaultValue, > int defaultType, > EntityRefList entityRefs) > throws SAXException; EntityRefList for holding enumerated values? I've been thinking of this in terms of two callbacks, one for non-enumerated values and the other for enumerated ones, and just taking an array of strings (for entity names or the enumeration options). There really are two distinct sorts of code path, and it's easier to branch early than to merge and re-branch later. > public abstract void externalEntityDecl (String name, > boolean isParameterEntity, > String publicId, > String systemId) > throws SAXException; > > public abstract void internalEntityDecl (String name, > boolean isParameterEntity, > String value) > throws SAXException; Exactly what Sun has (didn't we talk about this one once?) but we don't expose parameter entities at all. The good thing from exposing PEs: possible to enforce the namespace spec constraint that PE names have no colons. 
The bad thing: full PE handling gets really messy with stuff like processing declarations composed of multiple PEs, inside default values, nested, etc.

I'd go for this API as it stands, but not expose PEs otherwise since that's the "slippery slope". Though I do like the idea Lars posted, reporting PEs through a separate API.

> }
>
> // end of DTDDeclHandler.java

From db at eng.sun.com Sat Apr 3 00:29:03 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:10:58 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
References: <199903270239.TAA08007@malatesta.local>
Message-ID: <3705439E.899FEAEA@eng.sun.com>

Lars Marius Garshol wrote:
>
> * uche ogbuji
> |
> | Furthermore, I've been thinking of proposing that the SAX2
> | interfaces be specified in IDL rather than Java (or at least
> | publishing an IDL translation when the interfaces are stabilized),
> | and your proposal wouldn't wash in IDL.
>
> Many things in SAX won't wash in IDL, such as the use of the
> Java-specific InputStream, Reader and Locale objects.
>
> Also, IDL has a problem in that it's sort of a least common
> denominator, and thus leaves out many useful language-specific things.
> So you'd probably want to do a manual translation anyway.
>
> If there ever is a published SAX spec I think it should use IDL to be
> politically correct and point out potential language-mapping problems.
> However, the actual utility of IDL I think is low in this particular
> case.
When we defined IDL at the OMG, we acknowledged that there was scope for language-specific binding attributes for a bunch of specific things. That didn't argue against using IDL for the majority of methods, where such language-specific issues don't crop up. There's no crime in writing, as the OMG did, specs that are IDL but where particular data types are defined as being language-specific, and not subject to the general mapping rules. In CORBA 1.0 and 2.0 we called that "Pseudo-IDL". - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Sat Apr 3 00:41:27 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:10:58 2004 Subject: SAX: Modified DTDDeclHandler References: <14075.41132.504650.207777@localhost.localdomain> Message-ID: <37054684.1D63D4C3@eng.sun.com> Looking only at this "elementDecl" proposal (deleting the rest): David Megginson wrote: > > // DTDDeclHandler.java -- receive extended DTD declarations > // $Id: DTDDeclHandler.java,v 1.1 1999/03/26 14:58:47 david Exp david $ > > package org.xml.sax; > > public interface DTDDeclHandler extends SAX2Handler > { > public final static int MODEL_ELEMENTS = 1; > public final static int MODEL_MIXED = 2; > public final static int MODEL_ANY = 3; > public final static int MODEL_EMPTY = 4; > > public abstract void elementDecl (String name, > int modelType, > String model) > throws SAXException; I guess I don't see why "modelType" is necessary. With this stuff it's sufficiently easy to look at the first one or two characters; the type can be resurrected cheaply whenever it's desired. OK, so my minimalist inclinations are showing. 
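As a rough illustration of how cheaply the type can be recovered, here is a minimal Java sketch. The ContentModels class and its Kind enum are hypothetical names, not part of any SAX draft, and the code assumes the model string is fully normalized (spaces stripped, PEs expanded), so that mixed content always begins with "(#PCDATA". It uses equals() rather than identity comparison, so it does not depend on the strings being interned.

```java
// Hypothetical helper, not part of any SAX draft: recovers the content-model
// category from a fully normalized model string (spaces stripped, PEs expanded).
final class ContentModels {
    enum Kind { EMPTY, ANY, MIXED, CHILDREN }

    static Kind classify(String model) {
        if (model.equals("EMPTY")) return Kind.EMPTY;
        if (model.equals("ANY")) return Kind.ANY;
        // Mixed content starts with "(#PCDATA", so '#' is the second character.
        if (model.length() > 1 && model.charAt(1) == '#') return Kind.MIXED;
        // Anything else is element ("children") content.
        return Kind.CHILDREN;
    }

    public static void main(String[] args) {
        String[] samples = {"EMPTY", "ANY", "(#PCDATA|em)*", "(head,body)"};
        for (String m : samples) {
            System.out.println(m + " -> " + classify(m));
        }
    }
}
```

A handler that only cares about, say, mixed content can call classify() on demand, which is the sense in which a separate modelType parameter looks redundant.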
Remove four constants and a parameter, and the interface gets smaller. - Dave > } > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Sat Apr 3 01:01:29 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:10:58 2004 Subject: SAX2: Proposed alternative DTD interface References: <14076.1733.365295.427943@localhost.localdomain> Message-ID: <37054B3C.ECABA1F@eng.sun.com> Lars Marius Garshol wrote: > > * David Megginson > | > | Here's another alternative for SAX2: forget about trying to report > | DTD declarations as events, and simply make the whole DTD available > | through an interface with a Parser2.get() call. > > I'm against this. Having an event-based/object-based dichotomy makes > sense for DTDs just as it does for document instances. Also, this > breaks with the rest of SAX, is relatively complex and will at some > point probably be in direct competition with the DOM Level X. Those are the first things that come to my mind, and they remain important. Purity Of Essence actually does matter in an API. The dilemma is that a parser really does need some objects inside, e.g. for attribute normalization and general entity inclusion, even after it completes the DTD. So it seems like it could be "cheap" to expose it as objects ... but what about the stuff that it gets rid of ASAP to reduce memory consumption? Notations aren't necessary after they've been reported, neither are some entities (unparsed and parameter). Don't force those to stick around. However, > Furthermore, this can be built on top of a 100% event-based SAX2. and that'd be maximally flexible in any case. 
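A minimal Java sketch of that layering, under stated assumptions: ElementDeclHandler is a hypothetical callback interface modelled on the elementDecl signature proposed in this thread (it is not a published SAX interface), and DtdCollector is an invented name for a layer above the parser that chooses to retain declarations as objects. The parser itself stays purely event-based and is free to forget each declaration once it has reported it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical callback interface, modelled on the elementDecl signature
// proposed in this thread; it is not a published SAX interface.
interface ElementDeclHandler {
    void elementDecl(String name, String model);
}

// A layer above the parser that chooses to keep declarations around.
final class DtdCollector implements ElementDeclHandler {
    // Declaration order is preserved so the DTD could be written back out.
    private final Map<String, String> models = new LinkedHashMap<>();

    @Override
    public void elementDecl(String name, String model) {
        models.put(name, model);
    }

    // Returns the recorded content model, or null if the element was never declared.
    String modelOf(String name) {
        return models.get(name);
    }

    int size() {
        return models.size();
    }
}

final class DtdCollectorDemo {
    public static void main(String[] args) {
        DtdCollector dtd = new DtdCollector();
        // Stand-in for a parser firing events while it reads the DTD.
        dtd.elementDecl("doc", "(title,para+)");
        dtd.elementDecl("title", "(#PCDATA)");
        System.out.println(dtd.modelOf("doc"));
    }
}
```

An application that does not need the DTD simply registers no collector and pays nothing for it.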
Maybe an editor knows about some sorts of elements/attributes and has a specialized user interface for them. And the parser might prefer to discard all such info after it's done with a document -- something needs to save them appropriately, in any case.

I'd go for the event style DTD reporting, letting layers above SAX2 choose how they prefer to manage their DTD knowledge ... perhaps using DTDs as a subset of some richer schema data representation.

- Dave

From jborden at mediaone.net Sat Apr 3 01:39:15 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:10:58 2004
Subject: Between raw and cooked II: Are? DTDs are just for validation
Message-ID: <065d01be7d61$005d5950$0b2e249b@fileroom.Synapse>

David Brownell wrote:

>> Jonathan Borden writes:
>>
>> > Beyond requiring that external entities and default attributes be
>> > expanded, is there a way to allow non- and validating parsers to
>> > process the same XML documents in a functionally similar fashion,
>> > that is, the same SAX events be fired or the same DOM tree be
>> > constructed whether or not validation is employed?
>
>Absolutely: when a nonvalidating parser reads all external entities,
>it behaves almost exactly like a validating parser that's configured
>to ignore validity errors.
>

You misunderstand me. I understand that it is possible for validating and non-validating parsers to generate the same parse tree; what I am looking for is a specification that parsers need to meet to *ensure* that this occurs.
It is this specification that needs to be given an official standing for use by XML applications.

Unless I am otherwise missing something (which was the question), the only way to ensure that identical parse events and trees are generated by validating and non-validating parsers is to specify that external entities and default attributes etc. be expanded.

Aelfred and the Sun, IBM, and Microsoft parsers provide identical parse trees *because* entities are expanded and attributes are defaulted. Everyone agrees that non-validating parsers *may* expand external entities and default attributes. The problem with XML is that this behavior is *optional* for non-validating parsers. This behavior is also the default behavior of IE5's parser.

The problem is that this behavior, common to the Aelfred, IBM, Sun and Microsoft parsers, has no official standing, nor name, which is why I suggested that this be termed "well-behaved" (ok pick something less contentious but just pick something!!) and be given a specific standing in the XML spec. That way, as an XML application writer I can specify the needed features of the parser in terms of a spec.

Jonathan Borden
http://jabr.ne.mediaone.net

From jborden at mediaone.net Sat Apr 3 01:53:58 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:10:58 2004
Subject: Is validity an option?
Message-ID: <06a201be7d63$0f5a8f20$0b2e249b@fileroom.Synapse>

David:

>
>I think that the "both" case is the most useful one in terms of
>"write once, run anywhere" portable code.
>

Exactly.
This classification based on parse tree content (especially "both") is as interesting to me as validation vs. non-validation. Generation of validation error messages is something the *user* of the document decides the importance of, while the content of a document is something the *author* is concerned with. Default attributes and entity expansions look a lot like content to me.

Jonathan Borden
http://jabr.ne.mediaone.net

From mrc at allette.com.au Sat Apr 3 04:28:56 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 17:10:58 2004
Subject: Why Doesn't IE5 use the DTD to Validate?
References: <000001be7c02$ac0cc750$1b19da18@ne.mediaone.net>
Message-ID: <37057CC6.8BECE751@allette.com.au>

Jonathan Borden wrote:

> The reason to parse the DTD is that external entities and default attributes
> are something which are very well needed client side... if entities were
> left unexpanded by default this would change the 'meaning' of the document
> itself, something which end users might be interested in :-))

My (and perhaps others') confusion is apparent from the subject - we were assuming that handing a DTD and an instance to what was believed to be a validating application should result in validation taking place. Although it appears that Microsoft's handling is correct, I find the reputed requirement that the provider must insert JavaScript code (or a reference to a *.js file) in every document to be validated detrimental to the openness of XML.
-- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Sat Apr 3 06:04:08 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:10:59 2004 Subject: Between raw and cooked II: Are? DTDs are just for validation References: <065d01be7d61$005d5950$0b2e249b@fileroom.Synapse> Message-ID: <37059219.17F263DF@eng.sun.com> Jonathan Borden wrote: > > You misunderstand me. I understand that it is possible for validating > and non-validating parsers to generate the same parse tree, what I am > looking for is a specification that parsers need meet to *ensure* that this > occurs. It is this specification that needs to be given an official standing > for use by XML applications. Would it be appropriate perhaps to have the default XML processing in the Java platform work that way? That'd cover at least one class of applications ... :-) I don't know too many folk who'd be upset at seeing this get changed in the XML specification, however, except that it'd be an incompatible change. - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Sat Apr 3 06:23:05 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:10:59 2004 Subject: XML query languages and their encodings References: <370395C7.F93481AB@prescod.net> <19990403081128.A5416@io.mds.rmit.edu.au> Message-ID: <37059338.4387E06F@prescod.net> Marcelo Cantos wrote: > > In our own internal discussions on query models, the issue of > determining context has always had an impact on the conceptual model. > There are essentially two camps: > 1. return just the node (or subtree). By this I think that you mean a *copy of the node*. > 2. return a "pointer" to the node. > > Returning a pointer allows one to go back to the original document and > traverse up, down, back and forth at will. It is the most powerful > mechanism. On the other hand, it also involves a considerable amount > of network traffic as the user traverses a DOM-style tree across the > client-server connection. I didn't claim that it was appropriate for the user to traverse a DOM-style tree across the connection. IMO, what happens after the node is found is not the concern of the query language. If the server returns a copy of the node: great! If the server returns a pointer to the node: great! If we want to invent formal languages for expressing which the client needs and for encoding either pointers or copies, that's great too. But all the query language should care about is pointers to nodes. > This would then return the nodes of a query and the related TOC's. > The essence of this approach is that you get back only as much context > as you want. 
You also have the power to express whatever level or > complexity of context you want (it also means the server can do the > work without hops). Then again, if you are not interested in context, > you don't wear the cost of having it. That's all fine for defining the context of queries. But you haven't addressed deletions, moves and arbitrary function application. Your query language is also not enough to support XSL. In XSL the result of one query is often used as the context of another. In my mind this is evidence that enforcing (as opposed to allowing) context-losing node copying is a dead end. > The point to be made from all this is that your query model will be > greatly affected by delivery considerations. It is not necessarily a > case of getting the model right and then worrying about how to return > results in a practical setting. You haven't said anything yet to make me think that a pointer-based approach is impractical. If a user wants a partial-context copy of data then XTL is the perfect, standardized language for making a copy of the data. If they need a pointer with full context (i.e. if they are *implementing XTL*) then you should be able to provide it. Paul Prescod xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Sat Apr 3 07:43:58 1999 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:10:59 2004 Subject: Anti-Microsoft Flames Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF24C@RED-MSG-08> This mailing list occasionally has messages that note some feature or behavior of a Microsoft product, observe that it conflicts with the author's opinion on how the world should be organized, and impute some fiendish, nefarious and usually obscure motive to Microsoft. I don't generally comment on these. That does not mean that I believe they have merit; rather that I am very busy working with Satan and the International Trilateral Commission on an enigmatic master plan for global domination. Best wishes, Andrew Layman April 1, 1999 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murata at apsdc.ksp.fujixerox.co.jp Sat Apr 3 13:09:48 1999 From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto) Date: Mon Jun 7 17:10:59 2004 Subject: IE5.0 does not conform to RFC2376 In-Reply-To: <3705103D.6CA71D19@eng.sun.com> Message-ID: <199904031108.AA00165@archlute.apsdc.ksp.fujixerox.co.jp> David Brownell wrote: > True -- but if there's one basic rule that seems safer > than another, it's "default to application/xml" rather > than "assume ASCII and stick to text/xml"! :-) Or, "Use Apache (probably with the AddCharset patch), specify utf-16, and always use UTF-16." This is my favorite. (In the case that the charset is broken, autodetection of UTF-16 is very easy. Moreover, UTF-16 can parse only as UTF-16.) In my environment, I added a few lines to the "httpd.conf" file of Apache. They are as below:
    AddType "text/html; charset=shift_jis" htm
    AddType "text/html; charset=shift_jis" html
    AddType "text/html; charset=utf-8" htm8
    AddType "text/html; charset=utf-16" htm16
    AddType "text/xml; charset=utf-16" xml
    AddType "text/xml; charset=utf-8" xml8
    AddType "text/xml; charset=utf-16" xml16
Cheers, Makoto Fuji Xerox Information Systems Tel: +81-44-812-7230 Fax: +81-44-812-7231 E-mail: murata@apsdc.ksp.fujixerox.co.jp xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sat Apr 3 19:06:08 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:10:59 2004 Subject: Anti-Microsoft Flames Message-ID: <3.0.32.19990403090407.00b86910@pop.intergate.bc.ca> At 02:59 PM 4/1/99 -0800, Andrew Layman wrote: >rather >that I am very busy working with Satan and the International Trilateral >Commission on an enigmatic master plan for global domination. Since MS seems to be having severe trouble implementing the Document Object Model (cf IE5), I think they'll have to settle for gobal ination, which feels much less satisfying. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sat Apr 3 19:06:36 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:10:59 2004 Subject: XML for EDGAR (stock database) Message-ID: <3.0.32.19990403085712.00c6c230@pop.intergate.bc.ca> At 12:10 PM 4/2/99 -0500, Simon St.Laurent wrote: >Basically, Invisible Worlds is setting up a mirror of the 40GB Securities >and Exchange Commission database of corporate files and using XML to manage >it. I've been doing some work with these guys, and they're serious; keep an eye on them. -T. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From martind at netfolder.com Sat Apr 3 21:51:02 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:10:59 2004 Subject: About DOM and CORBA Message-ID: <000001be7e0b$69ef8f00$b7017bce@server.total.net> Hi, We are currently working on a C++ DOM mapping. We took the time to reread the DOM specs, and the appendices contain unambiguous definitions for: - Java - ECMAScript. I say unambiguous because these languages' binary interfaces are already specified, and a Java implementation from one vendor should inter-operate with another Java implementation because Java classes expose standard binary interfaces. Thus, we can say that Java and JavaScript are really inter-operable because of this common binary interface signature. We now have problems with CORBA, and I need a check from the dev community to be sure we didn't make any mistake in our interpretation. Here is the problem. OMG specify that IDL defined interfaces could be mapped to different languages and specify certain language mapping like for instance C++. So far so good. However under this rosy universe a shadow of doubt submerged our mind. Do OMG specify a standard binary interface for all object? This is necessary if we want objects to be truly inter-operable (not on paper but in the real world). Moved by the doubt, we checked the OMG CORBA 2.2 specs again and found a remote execution interface named IIOP (using the Internet for inter-object communication or binding), but we didn't find any binary interface specification for local execution.
I mean here objects running in the same address space as a host executable, such as dynamic libraries or shared libraries. Are we right, or did we miss a chapter defining a specific binary interface specification? The lack of a standard binary interface definition for CORBA objects leads to serious problems, more specifically for implementers following the W3C CORBA interface definition as defined in the DOM Level 1 appendix. If, as we think (but want to check with you), CORBA has no standard binary interface or standard binary signature for CORBA objects, then a manufacturer is free to implement its own signature. Thus, one manufacturer may choose a C++ type interface created with a VTBL, and another may choose a different implementation with another kind of binary interface. This would mean that an CORBA object would in practice be dependent on a specific manufacturer implementation and inter-operability would be more a wish than a reality. There is a chapter defining bridging with DCOM and some chapters on bridging CORBA objects among manufacturers. These last chapters just increased our doubts. The concrete implication of this is that if a parser with a DOM CORBA interface created with manufacturer X would not necessarily interface with manufacturer Y with a different binary signature without the usage of a translation bridge. The concrete implication is that we just added conversion overhead to the process. There is also a chapter on IIOP as an inter-object communication device. In this case, all communication between objects needs to serialize the call in IIOP format, use a local port to communicate with the other object, and then have the other object de-serialize the format into local function calls. Again, this adds unnecessary overhead to the process.
If our interpretation is right, there are actually only two languages for which W3C made a clear recommendation and where objects created with such interfaces could communicate without any ambiguity and with reasonable performance (no marshalling overhead). In reality, in the concrete world we should reduce this to a single language, Java. Therefore the current recommendation could be implemented and could run in the real world only for the Java language. Now, someone who wants to implement the DOM in C++ may have serious problems with the current specs: a) OMG CORBA specifies only interfaces, but concrete manufacturer implementations could have their own idiosyncratic object binary signatures. b) If a C++ implementation uses CORBA, the only way to have real standardized communication between objects is through IIOP. This introduces supplementary overhead and cancels all C++ speed improvement over interpreted languages. c) If a C++ implementation uses CORBA and bridging conventions, again supplementary overhead is introduced, especially if the objects are in the same address space. Thus, C++ DOM implementations are not as formalized as Java ones. A C++ implementation has then fallen through the cracks. Are we right to interpret the OMG specs like that? Did we miss something? If not, C++ implementations are still in no man's land and there should be an interface proposal either from W3C or from another party. If we are right, we propose an XPCOM (Mozilla) or COM (Microsoft) type interface. This type of interface is based on a standard C++ VTBL as defined by ANSI. All interfaces inherit from an IUnknown interface, this last requirement being optional and not mandatory. At least, C++ objects would be able to inter-operate because of their common binary signature. I hope we won't get stupid answers to this request for help and that people will take the time to think and do their homework before replying. We would all benefit from a sane debate on this.
The goal is not to bash on anyone but to find a solution to a problem. We will publish a paper to help other implementations to at least get the same binary interface and result in concrete re-usage of C++ objects implemented by different parties. Thanks in advance for your input. Regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com From jmg at trivida.com Sun Apr 4 00:38:28 1999 From: jmg at trivida.com (Jeff Greif) Date: Mon Jun 7 17:10:59 2004 Subject: About DOM and CORBA (long) References: <000001be7e0b$69ef8f00$b7017bce@server.total.net> Message-ID: <3706983C.7FA649BB@trivida.com> Didier PH Martin asked about issues involving a C++ DOM mapping and binary compatibility between different implementations of such a thing if they existed. Didier, The DOM IDL spec can be used just as it is if clients and servers communicate using CORBA. Neither the implementation language on the server nor the implementation language of the client stubs should matter (and they can be different). Almost all ORB implementations are heavily optimized for single-address-space or single machine client-server communication. The reason the Java and ECMAscript DOM bindings exist is for programs that do not use an ORB for communication -- those in the Java language can agree to use the Java bindings, and different DOM implementations in the same language can interoperate, as you mention.
This is unnecessary if the DOM is a server object in a CORBA ORB (in fact, the idltojava program and the Jacorb idl compiler may produce different classes, but IIOP will handle any differences when a Jacorb client talks to an idltojava-based server). The binary compatibility you think you are looking for is a feature of the implementation language, not of CORBA or IDL, which make no promises to provide any and by design should not be expected to. (A single ORB may support several different implementation languages without binary compatibility between the server objects they implement. The compatibility is only at the IDL level.) It is probably a bad idea to attempt a special C++ DOM mapping. The C++ objects produced by different compilers are, by explicit design, incompatible (see, for instance, the chapter on linkage in the Stroustrup and Ellis ARM from 1990). The layout of the objects is almost entirely up to the implementation, as are the placement (or use) of vtables and the runtime type information. Each compiler uses a different name mangling scheme (by explicit design) so your link will fail if you try to mix code produced by different compilers! (If this didn't happen, the incompatibilities would cause various mysterious failures at runtime.) Binary compatibility even on a single platform is not part of the picture. Just in case I haven't made myself clear enough, suppose you buy the HighSpeedDOM dll which was written in C++ and compiled using compiler X. Now you get the DOM.h file which defines the classes and methods according to the new C++ DOM mapping. You include this in your C++ client code, compile your client code, which instantiates a DOM object and calls some of these methods, using compiler Y. When you try to link against HighSpeed's DLL, it will fail simply because the names of the methods in your compiled code and in HighSpeed's code don't agree.
If by some dreadful accident, compiler X and compiler Y have chosen the same name mangling scheme, then the program will fail at runtime, calling the wrong method with the wrong arguments or looking in the wrong slot of the object for a data value, and probably memory will be corrupted. A company trying to produce a terrific DOM implementation in binary form must compile it with each supported compiler. There might be some advantage in performance to implement the DOM in C++ (that is, generate C++ server stubs for the IDL and fill them in with code). The idea would probably be that parsing a document might go faster than in Java, and writing out a DOM tree might be faster (these are examples of the larger or more complex operations that take place within the DOM code, just triggered by a single client interaction). Client inquiries and modifications of that DOM would have an extra overhead for passing through the ORB, but you'd hope this is small compared to the work done in the DOM implementation, assuming the client and server are on the same machine and preferably in the same address space. There are many factors which determine whether this hope is reasonable, but the moderate degree of success of CORBA indicates that at least sometimes, it is reasonable. The overhead would be significant or dominant on tiny operations like advancing an iterator. Given that there is no binary compatibility, no interoperability of code from different compilers, no direct C++ to C++ serialization protocol, and you might want to write a DOM client in some other language, the best thing appears to be to just implement the DOM in C++ if you think there are performance advantages or you have some other constraints, and call its methods through CORBA. When you implement a CORBA object in C++, there are two levels of incompatibility: 1. The different translation of IDL to C++ classes and methods by different ORBs. 
This is handled by IIOP between the client ORB which is calling and the server ORB which is responding. 2. The different binary form of the objects and methods produced by the C++ compiler. This is completely irrelevant to the client which might be compiled using a different C++ compiler, as the ORBs communicate essentially at the IDL level. Finally, it should be noted that the C++ vtable is in no way an element of any ANSI or ISO C++ standard. COM is a specified binary interface that was deliberately designed to look like (and hence be optimized for) C++ vtables produced by Microsoft compilers -- any implementation of COM objects produced by other compilers (on Intel or other platforms) must either have matching vtable implementations or use some other trick (most likely another layer) to realize the binary compatibility. This is why it has taken so long to have COM implementations (not widely used) on non-Windows platforms and why there are probably still interoperability problems between different platforms using COM. COM is a way of imposing a binary compatibility standard on top of C++ implementations, but adds non-trivial overhead also. I would see little reason to attempt to produce COM-based DOM interoperability rather than CORBA-based unless you know only the Windows platform would be used. Jeff xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Sun Apr 4 01:26:33 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:10:59 2004 Subject: About DOM and CORBA In-Reply-To: <000001be7e0b$69ef8f00$b7017bce@server.total.net> Message-ID: <000501be7e28$7da46530$0100007f@eps.inso.com> > OMG specify that IDL defined interfaces could be mapped to different > languages and specify certain language mapping like for > instance C++. So far so good. However under this rosy universe a shadow of doubt > submerged our mind. Do OMG specify a standard binary interface for all > object? I don't understand exactly what you mean by this question. The OMG specifies a standard mapping for *clients* such that code written to use C++ client objects will generally work without changes no matter which ORB libraries you compile against (that is the point of the standard mapping). IIOP provides an interoperable way for clients to talk to server implementations on any ORB. That said, the same is not true of the *server* objects (object implementations). Different vendors give various classes different names, so it is usually not possible to compile implementations created by one ORB vendor without changes if you choose to use a different ORB vendor. > This would mean that an CORBA object would in practice be > dependent on a specific manufacturer implementation and > inter-operability would be more a wish than a reality. Again, this is not true of clients, only servers. Clients can also interoperate across languages.
The only real issue here is that some ORBs provide "fast-call" semantics such that object implementations bound into the binary of the application do not incur IIOP overhead. For a fine-grained interface like the DOM, this is crucial to good performance, but in general in an interoperable environment, you cannot take advantage of this. > The concrete implication of this is that if a parser > with a DOM CORBA interface created with manufacturer X would not > necessarily interface with manufacturer Y with a different binary signature without > the usage of a translation bridge. The concrete implication is that we just added > conversion overhead to the process. Again, for client code, interoperability is not a problem ( and even COM gatewaying is seamless). Anyway, one of the problems that I did point out to the DOM WG on a number of occasions is that by defining the JAVA and ECMA bindings for the DOM, we have given up interoperability there to a degree. For example, if you run the DOM IDL through a JAVA IDL compiler, the generated client and server interfaces will differ from those in the DOM spec. I think that realistically, DOM used CORBA IDL only to provide a *default* binding for languages defined by the OMG. From a practical perspective, the DOM interfaces are too fine-grained for serious use in a distributed environment, and "fast-call" semantics are not interoperable, so there is room for a standard definition of DOM bindings for C++. C++ bindings would include all kinds of things, like binary compatibility, memory management, etc, etc. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Sun Apr 4 09:38:21 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:59 2004 Subject: Anti-Microsoft Flames Message-ID: <005c01be7e66$0e116ea0$23f96d8c@NT.JELLIFFE.COM.AU> From: Andrew Layman >This mailing list occasionally has messages that note some feature or >behavior of a Microsoft product, observe that it conflicts with the author's >opinion on how the world should be organized, and impute some fiendish, >nefarious and usually obscure motive to Microsoft. I don't generally >comment on these. That does not mean that I believe they have merit; rather >that I am very busy working with Satan and the International Trilateral >Commission on an enigmatic master plan for global domination. Are you saying that if IE5 treats content like: <?xml encoding="UTF-8"?> as though the contents are a processing instruction (when CSS is used) that this is really a matter of "opinion on how the world should be organized"? I would have thought Microsoft would have wanted to distance itself from claims that it was providing gratuitous incompatibilities, given the Java fiasco. Getting delimiters right is a fairly basic thing: will there be a bug fix for this soon? If not, then it is a company decision to support non-spec product. 
I am a big fan of IE5; but what is the point of beta testing if the results are ignored for these kinds of basic things...it suggests that Microsoft does not have systematic testing in place, or at least that it does not have sufficient programmer resources to handle all the bugs that they find, or that it produces software which is too complex for the team sizes it uses. Rick Jelliffe From ricko at allette.com.au Sun Apr 4 10:11:37 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:10:59 2004 Subject: XML is broken (was Re: Why Doesn't IE5 use the DTD toValidate?) Message-ID: <007901be7e6a$b5d30050$23f96d8c@NT.JELLIFFE.COM.AU> From: Simon St.Laurent >I'm hardly new to XML; it's just taken me about two years to come to the >conclusion that some things are irreparably broken. Rubbish. You have been saying this on and off for the entire time you have been on XML-DEV Simon :-) And every time you do, everyone replies with variants of "X was really helpful" or "I found X quite useful" or "I didn't find X particularly useful" or even "X didn't help me" and everyone agrees "it would be nice to have more". It is becoming a ritual. The problems you mention come down to the same two, unless I am missing something: * There is currently no convention to name the effective document type definition and prevent extensibility using the internal subset. * XML has no data-typing (esp. on attributes and data content). The first is hardly surprising: it allows extensibility with strong typing (but its manageability is weak). Solutions (e.g.
from HyTime) to this have been suggested at intervals, but to no avail. The second is the oldest complaint in the book; and the oldest answer in the book is "that is the trouble with layered, distributed development: sometimes layers lag or cannot be agreed on" (actually, all the consortia who are making families of DTDs in their particular domains are defining data types that they consider appropriate). The W3C Schema WG, of course, also works in this area. Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Sun Apr 4 10:57:54 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:11:00 2004 Subject: Fw: Fw: XML query language Message-ID: <007701be7e78$a3cf26f0$5402a8c0@oren.capella.co.il> Paul Prescod wrote: >I wrote: >> Both XML-QL and XQL have ways to construct results (CONSTRUCT and >> ). > >There is no such element type described in >http://www.w3.org/TandS/QL/QL98/pp/xql.html Sorry, you are right, of course. I've picked it up from one of the papers of the QL conference - >> I'd also add: >> >> - We should have separate specs for XQL, XTL, and FOs. The XTL spec should >> simply reference the XQL spec. The FO spec should be independent. > >Techically a good idea but I think that it is politically impossible to >separate XSL and its matching language at this point. Maybe XSL 2.0 will >depend on whatever XML QL is eventually standardized. > >> - XQL should be used wherever a set of XML elements needs to be selected >> from an XML tree. >> - So therefore CSS should allow using XQL in its selectors. 
For that matter, >> CSS should allow an XML syntax :-) >> - And also XPointers? > >I agree with all of this but changes to CSS are unlikely in the >short->medium term. Not to mention the chances of unifying XPointers and XQL - technically a sound notion (even if the "XPointer" version is limited to selecting a continuous portion of the tree), but highly unlikely. I find it disturbing that something such as XML which is about as "new technology" as you can be is already suffering from so much backward-compatibility and "historical reason" warts. XSL which has never even seen version 1.0 is already showing these symptoms! Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Sun Apr 4 11:17:46 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:11:00 2004 Subject: Fw: Fw: XML query language Message-ID: <011c01be7e7b$6982ce90$5402a8c0@oren.capella.co.il> >Paul Prescod wrote: >>I wrote: >>> Both XML-QL and XQL have ways to construct results (CONSTRUCT and >>> ). >> >>There is no such element type described in >>http://www.w3.org/TandS/QL/QL98/pp/xql.html > > >Sorry, you are right, of course. I've picked it up from one of the papers of >the QL conference - Oops, the URL slipped. See: http://208.240.92.75/whitepapers/xql-design.html For example, and there's also a mention of in: http://www.w3.org/TandS/QL/QL98/pp/seligman.html Which refers internal W3C documents I can't reach. Share & Enjoy, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Sun Apr 4 15:07:51 1999 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 17:11:00 2004 Subject: Announcement : Tutorial : XML Scripting with Python at www8 Message-ID: <3.0.6.32.19990404135711.0092eec0@gpo.iol.ie> I am presenting an XML Scripting with Python tutorial at WWW8 in Toronto in May. See http://www.www8.org/tutorials.html#python for more details. regards, Sean Mc Grath From chris at w3.org Sun Apr 4 15:28:47 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:00 2004 Subject: IE5.0 does not conform to RFC2376 References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org> <3701411C.784A1D4A@eng.sun.com> Message-ID: <37076818.54025D7B@w3.org> David Brownell wrote: > Chris Lilley wrote: > > > > What this RFC appears to do is remove author control over correctly > > labelling the encoding, and ensure that most if not all XML documents > > get incorrectly labelled as US-ASCII. > > Not at all. The best default MIME content type for all web > servers is "application/xml". Why? Do you consider anything not written in US-ASCII as a text document? I think the Unicode Consortium would disagree with you there.
You don't actually show that application/xml is better, because you say:

> Without a "charset=Big5" or
> similar declaration, then the XML processor's autodetection
> kicks in ... minimally handling UTF-8 and UTF-16, and quite
> commonly handling a variety of additional encodings.

If it has poor code to autodetect, it has poor code for both text/xml and application/xml. But it need not autodetect, in fact, autodetection is a bad thing. I was not suggesting autodetection, quite the converse.

Rather, in the absence of an explicit MIME charset parameter, it should use the encoding declaration. If there is none, then the document is in UTF-8 or UTF-16 and the XML spec tells you how to determine which. [1].

If the processor is unable to deal with a particular encoding (8859-15, for example) then that is still the case whether the information was conveyed in a charset parameter on the MIME type (text/xml) or in the encoding declaration in the entity (application/xml). So, in what way is application/xml any better?

So, the only difference between text/xml and application/xml in this regard is that the former *requires* the client to ignore the encoding declaration in the entity and forces an interpretation of US-ASCII in all cases.

Now, the default for text/* over HTTP is ISO-8859-1 and the default for XML in the absence of an encoding declaration is UTF-8 or UTF-16. My position is that the most preferable option when registering text/xml would have been to use the rules in the XML spec (UTF-8 or UTF-16, unless there is an encoding declaration).

> For example, Sun's XML processor handles about 140 encodings
> at last count ... and _does_ conform to RFC 2376.

You mean, when receiving a message body labelled as text/xml (via email or via HTTP) it ignores the encoding declaration, assumes US-ASCII, signals a fatal error because of invalid byte sequences in the file and then halts?
Great ;-(

> > So, this RFC removes at a stroke the possibility of authors correctly
> > labelling the encoding of their XML documents and takes us back to that
> > dark time (the present) when the majority of, say, Japanese Web content
> > was mis-labelled. And it seems to have done this simply to save a very
> > small part of coding effort for people writing transcoders.
>
> Again, no it doesn't. The idea is to get the web server to
> attach the correct MIME content type, which is NOT "text/xml"
> in many/most cases.

So, your position is that since text/xml is unusable, best use application/xml instead? Surely it would have been better not to make text/xml unusable? Or if that was thought unreasonable, then why register text/xml at all?

> Authors must rely on the administrator
> not breaking their content, and this is part of it.

Authors would love to rely on this, but have learned not to. The vast majority of content authors have *no control whatsoever* on server configuration. This isn't 1993; assuming that the person who wrote the content is also the person who administers the server is totally unwarranted.

For 99.9995% of the folks, they sign up with an ISP; they get around 5Megs of web space and they are allowed to upload documents there. They share that server with thousands of other users. The server is not chosen by them, and is configured with all the default settings and the ISP will not change them no matter how many reasoned emails are sent by users.

So, users cannot choose the MIME type that is used and certainly do not have the control to allow different documents to be served up with different MIME parameters depending on the encoding of their various documents.
Which is my concern; control is removed from the users (who get to author the documents, and are in a position to do the right thing) and put in the hands of ISP administrators (who are installing new web servers at a rate of several a day, and do not want any special cases or anything that is not right out of the box).

Merely saying "so, ignore text/xml and use application/xml" does not help matters; it's a workaround, not a solution.

[1] http://www.w3.org/TR/REC-xml#charencoding

--
Chris

From chris at w3.org Sun Apr 4 15:31:34 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:00 2004
Subject: IE5.0 does not conform to RFC2376
References: <199903310453.AA00111@archlute.apsdc.ksp.fujixerox.co.jp> <3705103D.6CA71D19@eng.sun.com>
Message-ID: <370768B9.BC58ED3D@w3.org>

David Brownell wrote:
>
> MURATA Makoto wrote:
> > "application/xml" is appropriate for some XML data. On the other
> > hand, if you do not want to miss fallback to text/plain, "text/xml"
> > is the right choice.
>
> True -- but if there's one basic rule that seems safer
> than another, it's "default to application/xml" rather
> than "assume ASCII and stick to text/xml"! :-)

But that is only unsafe because this RFC made it so. Your tactic seems to be to work around the problem with text/xml registration by not using it; my approach is to get the text/xml registration fixed so it is not so unsafe.

--
Chris
From chris at w3.org Sun Apr 4 15:43:01 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:00 2004
Subject: IE5.0 does not conform to RFC2376
References: <199904031108.AA00165@archlute.apsdc.ksp.fujixerox.co.jp>
Message-ID: <37076B72.24475DCA@w3.org>

MURATA Makoto wrote:
>
> David Brownell wrote:
> > True -- but if there's one basic rule that seems safer
> > than another, it's "default to application/xml" rather
> > than "assume ASCII and stick to text/xml"! :-)
>
> Or, "Use Apache (probably with the AddCharset patch),
> specify utf-16, and always use UTF-16."

That is a reasonable choice for a single author to make; it is not a reasonable choice to impose on all authors everywhere.

> This is my favorite.

But not necessarily everyone's favourite. It is a good choice for Japanese, because Kanji use less bytes per character in UTF-16 than in UTF-8.

> (In the case that the charset is broken, autodetection of
> UTF-16 is very easy.

But autodetection should not be required; users can label their documents correctly.

> In my environment, I added a few lines to the "httpd.conf" file
> of Apache. They are as below:
>
> AddType "text/html; charset=shift_jis" htm
> AddType "text/html; charset=shift_jis" html
> AddType "text/html; charset=utf-8" htm8
> AddType "text/html; charset=utf-16" htm16
>
> AddType "text/xml; charset=utf-16" xml
> AddType "text/xml; charset=utf-8" xml8
> AddType "text/xml; charset=utf-16" xml16

Which illustrates my point exactly.
You made some private conventions about filename extensions and you chose to reconfigure your server to understand those private conventions - and then, it works. These AddType lines would be quite unsuitable for a web server used by multiple users with multiple mother tongues.

On the other hand, if the RFC had been written as I suggested, saying that a charset parameter overrode *if present* but that *if absent*, the rules in the XML recommendation were followed, then you would need no server reconfiguration and the rules to follow to have the encoding information correctly conveyed to the client would have been a matter of public record in the XML recommendation rather than private convention. A big win for interoperability, if that had happened.

Murata-san, you asked why a W3C team person was criticising this RFC in public. It is because the mission of W3C is to improve interoperability, so it is my duty to do so. Regardless of the esteem in which I may hold the author, I will argue against technical matters if I believe them to be wrong and to reduce interoperability.

--
Chris

From chris at w3.org Sun Apr 4 17:23:42 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:00 2004
Subject: DOM
References:
Message-ID: <3707835A.30CCD013@w3.org>

hassan.hussein@zurich.com wrote:

> What exactly does DOM do and how does it work with XML parsers?

It's a spec for an event-oriented interface to the parsing process

> What exactly does SAX do and how does it work with XML parsers?
It's a spec for an object oriented interface to the parse tree, after parsing has been done

> What exactly does SAXON do and how does it work with XML parsers?

It's an implementation, not a spec.

> Does a parser have to have DOM in order to work?
> Does a parser have to have SAX in order to work?

No. But they are both useful and desirable things to have, and more can be done if they are both supported.

> What do you use in order to create an XML document programmatically?

Whatever you want.

--
Chris

From claudio.vernacotola at crpht.lu Sun Apr 4 17:48:12 1999
From: claudio.vernacotola at crpht.lu (claudio.vernacotola@crpht.lu)
Date: Mon Jun 7 17:11:00 2004
Subject: DOM
Message-ID: <41256749.005BC73E.00@mmfileserver.crpht.lu>

>> What exactly does DOM do and how does it work with XML parsers?
>It's a spec for an event-oriented interface to the parsing process
>
>> What exactly does SAX do and how does it work with XML parsers?
>It's a spec for an object oriented interface to the parse tree, after
>parsing has been done

Shouldn't it be exactly the opposite? I.e., DOM is a spec for an object oriented interface and SAX an event oriented interface?

Regards,
Claudio
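Claudio's reading is the usual one: SAX delivers events while the parser runs, and DOM hands back a tree once parsing is done. The following is a minimal sketch of that difference, assuming Python's standard-library `xml.sax` and `xml.dom.minidom` modules (tools that are not part of this thread); the sample document and the `ItemCounter`, `count_items_sax`, and `count_items_dom` names are hypothetical.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
DOC = b"<list><item>one</item><item>two</item></list>"


class ItemCounter(xml.sax.ContentHandler):
    """SAX style: the parser calls back into this object as it reads."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        # Called once per start tag, while parsing is still in progress.
        if name == "item":
            self.count += 1


def count_items_sax(data):
    """Count <item> elements from the stream of parse events."""
    handler = ItemCounter()
    xml.sax.parseString(data, handler)
    return handler.count


def count_items_dom(data):
    """DOM style: parse everything first, then query the finished tree."""
    tree = minidom.parseString(data)
    return len(tree.getElementsByTagName("item"))
```

Both functions give the same count for the sample; the SAX version never holds the whole document in memory, while the DOM version can revisit any node after parsing.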
From martind at netfolder.com Sun Apr 4 18:16:30 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:00 2004
Subject: How to understand the CORBA IDL vs C++ problem...
Message-ID: <000301be7eb6$7f692f20$275d8bcf@server.total.net>

Hi,

To better understand the problem with IDL mapping and concrete implementation in C++, here is a link to the DOM C++ implementation in Mozilla. Vidur had to translate the IDL interface definition into XPCOM for a _concrete_ DOM implementation in a browser. The binary interface could be created with most C++ compilers. In fact, the XPCOM spec is under stress test on several platforms. XPCOM could be mapped easily to DCOM on Windows and still be XPCOM on other platforms (currently running on OS2 and Linux).

For more info see the link: http://www.mozilla.org/newlayout/dom-roadmap.html

The Mozilla DOM module is an independent module that can be used by any application. We are working now on a specification paper for other DOM implementations using an XPCOM binary signature implementation. Packaging in either dll or shared library will be discussed too.

regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com
From simonstl at simonstl.com Sun Apr 4 19:59:34 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:11:00 2004
Subject: Announce: XPDL (was XML is broken)
Message-ID: <199904041759.NAA25944@hesketh.net>

Rather than whining about XML's inadequacies, I thought that maybe it was time to remedy them, or to make proposals for doing so.

A first draft of XML Processing Description Language is the current result. For the most part, it's a cleaner replacement for the prolog (apart from the declaration, which should remain there.) XPDL uses XML syntax (and eventually RDF) to describe the constraints of a class of documents, to specify where attribute defaulting and entity values should come from, what styles are appropriate to the class of a document, and to provide the usual documentation and extensibility.

This is only a first draft, and as such, is incomplete in some areas and may be questioned in all. (RDF being a biggie, but this time I've explicitly reserved space for it.) The rules for connecting XPDs (XML processing descriptions) to particular documents are somewhat vague, but I have reasonable hopes that this proposal is workable, if a departure from the current approach.

All comments and suggestions are welcome. If there is enough interest, I plan to continue developing this with an open model much like that used for XSchema and SAX. This is only a first draft intended to incite discussion. It is not a part of any formal standards process.

Details are at http://purl.oclc.org/NET/xpdl .
Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

From tbray at textuality.com Sun Apr 4 22:03:47 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:11:00 2004
Subject: IE5.0 does not conform to RFC2376
Message-ID: <3.0.32.19990404125541.00bb1ce0@pop.intergate.bc.ca>

At 03:24 PM 4/4/99 +0200, Chris Lilley wrote:
> But it need not autodetect, in fact, autodetection
>is a bad thing. I was not suggesting autodetection, quite the converse.
>
>Rather, in the absence of an explicit MIME charset parameter, it should
>use the encoding declaration. If there is none, then the document is in
>UTF-8 or UTF-16 and the XML spec tells you how to determine which. [1].

Just a terminology thing; I think when we say autodetection, we are talking about using the combination of the first few bytes and the encoding declaration, as described in app. F of the XML spec. I think (and I thought Chris thought) that this is a *good* and necessary thing, if only because lots of XML documents are read in other ways than via http, and because lots of times the web server simply doesn't/can't know about the internal arrangements of some XML resource.

-Tim
From db at eng.sun.com Sun Apr 4 22:16:12 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:00 2004
Subject: IE5.0 does not conform to RFC2376
References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org> <3701411C.784A1D4A@eng.sun.com> <37076818.54025D7B@w3.org>
Message-ID: <3707C75D.C303FC59@eng.sun.com>

Chris Lilley wrote:
>
> David Brownell wrote:
> > Chris Lilley wrote:
> > >
> > > What this RFC appears to do is remove author control over correctly
> > > labelling the encoding, and ensure that most if not all XML documents
> > > get incorrectly labelled as US-ASCII.
> >
> > Not at all. The best default MIME content type for all web
> > servers is "application/xml".
>
> Why? Do you consider anything not written in US-ASCII as a text
> document? I think the Unicode Consortium would disagree with you there.

No, and that's not what I said:

For a single world-wide default; that's easily understood by overworked, underpaid, often untrained sysadmins; and hence is NOT error prone (!!), there's a simple answer that's guaranteed to work right everywhere that pays more than lip service to industry standards), and hence is "best". Namely, that servers report XML documents as "application/xml".

You seem to want to argue about the MIME definition of text as being ASCII, if otherwise unqualified. True, it's dated -- but it really does nobody any good to try imposing incompatible changes on such a foundational standard. The Web doesn't need that sort of confusion.

> ... in fact, autodetection
> is a bad thing.
> I was not suggesting autodetection, quite the converse.

This seems like a new tangent: "autodetection is a bad thing".

Are you proposing that the XML specification be revised to eliminate the several kinds of autodetection it's got?

Note that that'd mean there's no way in general to read encoding declarations; folk need to autodetect to distinguish e.g. ASCII supersets (UTF-8, ISO-Latin-*, Shift_JIS, etc), EBCDIC encodings, UTF-16, and so on. So your preferred algorithm can't be implemented without autodetection.

> Rather, in the absence of an explicit MIME charset parameter, it should
> use the encoding declaration. [else default to UTF-8/UTF-16 per spec]

That is _exactly_ the behavior specified for "application/xml"; now, what exactly is your reason for thinking it's not the best default for most everyone to use??

- Dave

From paul at prescod.net Mon Apr 5 00:54:12 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:00 2004
Subject: Fw: Fw: XML query language
References: <007701be7e78$a3cf26f0$5402a8c0@oren.capella.co.il>
Message-ID: <3707EBD9.A6F9BBAF@prescod.net>

Oren Ben-Kiki wrote:
>
> I find it disturbing that something such as XML which is about as "new
> technology" as you can be is already suffering from so much
> backward-compatibility and "historical reason" warts. XSL which has never
> even seen version 1.0 is already showing these symptoms!

This happens because XML is the target of intense politics and expensive development cycles.
For instance the DOM should have been one of the last XML-family specs because it should be an API to all of the others. Instead it was the first because the market couldn't afford the ongoing bifurcation of "DHTML."

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Reduced * Set (R*S) is a design paradigm promoting simplicity over all other design constraints. R*S may be applied to all seven OSI networking layers. In fact, layers one through six may be simplified to the point of extinction, promoting the ultimate goal of reduced complexity and utility." - http://www.w3.org/1999/04/REC-Reduced-set

From chris at w3.org Mon Apr 5 02:07:15 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:00 2004
Subject: DOM
References: <3707835A.30CCD013@w3.org>
Message-ID: <3707C157.B1B99447@w3.org>

Chris Lilley wrote:
>
> hassan.hussein@zurich.com wrote:
>
> > What exactly does DOM do and how does it work with XML parsers?
> It's a spec for an event-oriented interface to the parsing process

That was the answer to "What is SAX" and vice versa. Sorry.

> > What exactly does SAX do and how does it work with XML parsers?
> It's a spec for an object oriented interface to the parse tree, after
> parsing has been done

--
Chris
From begeddov at jfinity.com Mon Apr 5 02:20:12 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun 7 17:11:00 2004
Subject: XML TreeDiff algorithm ??
Message-ID: <3708008C.251C116F@jfinity.com>

I know that IBM has provided a set of beans that "efficiently differentiate and update DOM trees" (www.alphaworks.ibm.com/formula/XMLTreeDiff). As far as I can tell, there is no description of the algorithms used.

I did a quick look around and couldn't find any publications discussing tree diff/merge algorithms. Can anyone point me to articles or source on this topic?

Thanks,
Gabe Beged-Dov
www.jfinity.com

From chris at w3.org Mon Apr 5 02:35:14 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:00 2004
Subject: SUMMARY: XML Validation Issues (was: several threads)
References: <7847B57C7C96D2119DBE00A0C96F64B6206DD5@cen1.cen.com>
Message-ID: <370804A1.ACF19AA6@w3.org>

"Sall, Ken" wrote:
>
> It seems useful to summarize some of the many issues generated directly or
> indirectly by my original post

Thanks, that was a useful condensing down of a number of threads.

> - Should the XML Schemas Working Group address some of the holes in the
> XML spec, especially in terms of conformance?
I would hope so, but will need to check.

> Should it be the job of the Infoset WG?

Perhaps you could list the pros and cons of having it be made a work item of either of these WGs?

> SAX2?

That would be useful implementation experience; ideally, combined with proposals originating from one of the two WGs above.

> Anyone want to add to this list? More importantly, anyone want to take a
> crack summarizing what they believe to be the majority and/or minority
> views?

There seems to be general agreement that following external entities is generally the desired behavior (majority case), but is not actually required by the spec. It seems the reason that nothing in the XML spec makes this required is so that it is possible to have a tiny XML parser (a minority case). This seems to lead to the conclusion that these two classes of application should have separate names, and separate conformance criteria.

I don't sense consensus yet on whether client-side validation is always desirable; it clearly is in some cases and clearly adds little in other cases. The assertion has been made that client-side validation is a performance load, compared to just parsing the DTD looking for fixed attributes etc; but no performance figures were made available. If someone has a parser they could instrument and provide some actual measurements on real-world data, that would help.

--
Chris
From chris at w3.org Mon Apr 5 02:41:59 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:01 2004
Subject: IE5.0 does not conform to RFC2376
References: <3.0.32.19990404125541.00bb1ce0@pop.intergate.bc.ca>
Message-ID: <37080629.223AA340@w3.org>

Tim Bray wrote:
>
> At 03:24 PM 4/4/99 +0200, Chris Lilley wrote:
> > But it need not autodetect, in fact, autodetection
> >is a bad thing. I was not suggesting autodetection, quite the converse.
> >
> >Rather, in the absence of an explicit MIME charset parameter, it should
> >use the encoding declaration. If there is none, then the document is in
> >UTF-8 or UTF-16 and the XML spec tells you how to determine which. [1].
>
> Just a terminology thing; I think when we say autodetection, we are
> talking about using the combination of the first few bytes and the
> encoding declaration, as described in app. F of the XML spec.

Thanks for pointing out this source of terminological confusion. No, I was not meaning that. I was meaning autodetection in the sense of reading a whole bunch of the text and making assorted guesses based on frequency analysis and the like. In other words, automatic detection based on unlabelled content. I believe that this is a bad thing, because there is always the possibility (quite high) of getting it wrong.

The encoding declaration, on the other hand, is not autodetection in that sense, it is a label. A very small amount of autodetection has to be done in order to be sure that the label has been read, that is all (ie, is this UTF-16 or is this an encoding where ASCII is represented as ASCII).
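The "very small amount of autodetection" described above can be sketched in a few lines. This is a simplified illustration in Python (not a tool used in this thread), loosely following the idea of app. F of the XML spec: look at the first few bytes, then read the label. The function name `sniff_xml_encoding` is hypothetical, and the sketch assumes only UTF-8, UTF-16, and ASCII-compatible encodings; EBCDIC and UCS-4 are deliberately left out.

```python
import re

# Matches the encoding label in an XML declaration written in an
# ASCII-compatible encoding, e.g. <?xml version="1.0" encoding="Shift_JIS"?>
_DECL = re.compile(
    rb"<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)


def sniff_xml_encoding(data: bytes) -> str:
    """Return the encoding indicated by the entity's own labels.

    Simplified sketch: EBCDIC and UCS-4 detection are omitted.
    """
    # Step 1: a byte order mark is itself a label.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    # Step 2: UTF-16 without a BOM shows up as "<?" padded with zero bytes.
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    # Step 3: otherwise ASCII is represented as ASCII, so the encoding
    # declaration can be read directly.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # Step 4: no label at all means UTF-8 under the XML rules.
    return "utf-8"
```

Nothing here inspects the document text beyond the declaration, which is the distinction being drawn: the bytes are examined only far enough to read the label, never to guess from content.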
> I think
> (and I thought Chris thought) that this is a *good* and necessary thing,
> if only because lots of XML documents are read in other ways than via
> http, and because lots of times the web server simply doesn't/can't
> know about the internal arrangements of some XML resource.

Yes, this (reading the encoding declaration) is a *good* thing, with the proviso that I am talking about the encoding declaration. I don't consider this autodetection, in the same sense that reading is not autodetection of the version.

--
Chris

From tbray at textuality.com Mon Apr 5 02:48:42 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:11:01 2004
Subject: IE5.0 does not conform to RFC2376
Message-ID: <3.0.32.19990404174756.00c72100@pop.intergate.bc.ca>

At 02:39 AM 4/5/99 +0200, Chris Lilley wrote:
>Yes, this (reading the encoding declaration) is a *good* thing, with
>the proviso that I am talking about the encoding declaration.

Right; and also including (forgot to mention it last time) the BOM in the case of UTF-16 dialects.

To me, the single most troubling thing in the discourse over the last few months is that the IETF is seriously considering a registration for UTF-16[LB]E which *forbids* the use of the BOM. Poor Martin Duerst has invested immense amounts of time and effort trying to persuade people that this is actually not rank insanity, but he's finding the going tough.

-Tim
From chris at w3.org Mon Apr 5 02:49:18 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:01 2004
Subject: IE5.0 does not conform to RFC2376
References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org> <3701411C.784A1D4A@eng.sun.com> <37076818.54025D7B@w3.org> <3707C75D.C303FC59@eng.sun.com>
Message-ID: <370807CC.CC9F0E1D@w3.org>

David Brownell wrote:
>
> Chris Lilley wrote:
> >
> > David Brownell wrote:
> > > Chris Lilley wrote:
> > > >
> > > > What this RFC appears to do is remove author control over correctly
> > > > labelling the encoding, and ensure that most if not all XML documents
> > > > get incorrectly labelled as US-ASCII.
> > >
> > > Not at all. The best default MIME content type for all web
> > > servers is "application/xml".
> >
> > Why? Do you consider anything not written in US-ASCII as a text
> > document? I think the Unicode Consortium would disagree with you there.
>
> No, and that's not what I said:

But it is the implication of your argument.

> For a single world-wide default; that's easily understood by overworked,
> underpaid, often untrained sysadmins; and hence is NOT error prone (!!),
> there's a simple answer that's guaranteed to work right everywhere that
> pays more than lip service to industry standards), and hence is "best".
> Namely, that servers report XML documents as "application/xml".

I discussed this in my earlier mail and showed, in particular, that this is no more or less robust than text/xml; the client still gets given a label and still either knows what that label is or does not.
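The disagreement is easier to follow with the two RFC 2376 rules side by side. Below is a minimal sketch in Python (not something used in the thread); `effective_encoding` and its parameters are hypothetical names, and `declared` stands for whatever the entity's own labels (BOM or encoding declaration) indicate, or `None` when there are none.

```python
from typing import Optional


def effective_encoding(media_type: str,
                       charset: Optional[str],
                       declared: Optional[str]) -> str:
    """Pick the encoding a conforming RFC 2376 receiver would use.

    Sketch only: 'declared' is assumed to be the result of the XML
    spec's own rules, and the UTF-8 vs UTF-16 choice is collapsed
    to "utf-8" when nothing is declared.
    """
    if charset:
        # An explicit MIME charset parameter wins for both media types.
        return charset.lower()
    if media_type == "text/xml":
        # No charset parameter: the text/* default applies and the
        # encoding declaration inside the entity is ignored.
        return "us-ascii"
    if media_type == "application/xml":
        # No charset parameter: fall back to the XML spec's rules.
        return (declared or "utf-8").lower()
    raise ValueError("not an XML media type: " + media_type)
```

For an unlabelled response carrying a Shift_JIS document, `effective_encoding("text/xml", None, "Shift_JIS")` gives `"us-ascii"` while `effective_encoding("application/xml", None, "Shift_JIS")` gives `"shift_jis"`. The change argued for in this thread would amount to making the `text/xml` branch behave like the `application/xml` one.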
> You seem to want to argue about the MIME definition of text as being
> ASCII, if otherwise unqualified.

No, I am arguing specifically about the default for text/xml - the registration can choose what that default is.

> True, it's dated -- but it really
> does nobody any good to try imposing incompatible changes on such a
> foundational standard. The Web doesn't need that sort of confusion.

Agreed, but that was not what I said.

> > ... in fact, autodetection
> > is a bad thing. I was not suggesting autodetection, quite the converse.
>
> This seems like a new tangent: "autodetection is a bad thing".
>
> Are you proposing that the XML specification be revised to eliminate
> the several kinds of autodetection it's got?

No, I was using autodetection in a different sense here, and it was valuable of Tim Bray to point this out. What the XML spec refers to as autodetection is not really autodetection. It's just reading a textual label, the same as reading a MIME charset label.

> > Rather, in the absence of an explicit MIME charset parameter, it should
> > use the encoding declaration. [else default to UTF-8/UTF-16 per spec]
>
> That is _exactly_ the behavior specified for "application/xml"; now,

Yes

> what exactly is your reason for thinking it's not the best default for
> most everyone to use??

Because text files should be transmissible as text; XML is a format for marked up text.

--
Chris

From mda at discerning.com Mon Apr 5 02:52:24 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun 7 17:11:01 2004
Subject: XML TreeDiff algorithm ??
Message-ID: <067a01be7efe$b2f1da20$0200a8c0@mdaxke.mediacity.com>

This question should probably be added to an xml faq somewhere;
it comes up every 3.5 months on xml-dev:

Steve Yost, 5/18/98, "Diff/Merge Tools?"
mika.kikkonen, 9/11/98, "xml diff program"
Mark D. Anderson, 12/20/98, "xml diff?"

That last thread (which i started), resulted in references to several
academic articles, a few of which were very interesting and just what
i was looking for at the time. (it honestly escapes me however why i
was looking into it then, but that is my problem...)

It would be nice if ibm would just public domain their source, but
someone probably thinks there is IP there. (maybe after ibm gets their
patent application finished....)

-mda

From begeddov at jfinity.com Mon Apr 5 02:59:11 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun 7 17:11:01 2004
Subject: XML TreeDiff algorithm ??
References: <067a01be7efe$b2f1da20$0200a8c0@mdaxke.mediacity.com>
Message-ID: <370809B8.C7F63A2C@jfinity.com>

Mark D. Anderson wrote:

> This question should probably be added to an xml faq somewhere;
> it comes up every 3.5 months on xml-dev:

I'm guilty as charged (by implication). I didn't search the archives :-(

Thanks,
Gabe Beged-Dov
www.jfinity.com
From david at megginson.com Mon Apr 5 04:12:14 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:01 2004
Subject: Anti-Microsoft Flames
In-Reply-To: <5BF896CAFE8DD111812400805F1991F708AAF24C@RED-MSG-08>
References: <5BF896CAFE8DD111812400805F1991F708AAF24C@RED-MSG-08>
Message-ID: <14088.6807.311651.755654@localhost.localdomain>

Andrew Layman writes:

 > This mailing list occasionally has messages that note some feature
 > or behavior of a Microsoft product, observe that it conflicts with
 > the author's opinion on how the world should be organized, and
 > impute some fiendish, nefarious and usually obscure motive to
 > Microsoft. I don't generally comment on these. That does not mean
 > that I believe they have merit; rather that I am very busy working
 > with Satan and the International Trilateral Commission on an
 > enigmatic master plan for global domination.

Perhaps Andrew is unaware of the XML-Dev policy that silence
automatically indicates agreement in any discussion (hence the number
of substantially-identical postings on nearly every topic).

By the way, I checked the GUID embedded by Outlook in the whitespace
of Andrew's message, and it seems actually to have come from a hacker
in New Jersey.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
From murata at apsdc.ksp.fujixerox.co.jp Mon Apr 5 04:21:37 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:11:01 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <37076818.54025D7B@w3.org>
Message-ID: <199904050220.AA00169@archlute.apsdc.ksp.fujixerox.co.jp>

Chris Lilley wrote:
> The vast majority of content authors have *no control whatsoever* on
> server configuration. This isn't 1993; assuming that the person who
> wrote the content is also the person who administers the server is
> totally unwarranted.

To overcome this problem, Uchida-san is proposing a convention for WWW
server configurations. His proposal is already used by some ISPs in
Japan. It is available at:

http://www.asahi-net.or.jp/~sd5a-ucd/docs/suffix_guideline_981106.txt

It is hoped that this note will finally become a W3C technical note and
that the I18N WG will encourage people to use it.

Chris Lilley wrote:
>
> But not necessarily everyone's favourite. It is a good choice for
> Japanese, because Kanji use less bytes per character in UTF-16 than in
> UTF-8.
>
> > (In the case that the charset is broken, autodetection of
> > UTF-16 is very easy.
>
> But autodetection should not be required; users can label their
> documents correctly.

To me, the biggest advantage of UTF-16 is that UTF-16 XML documents can
parse only as UTF-16. Even if the charset parameter is incorrect, UTF-16
XML documents do not parse incorrectly (and error recovery is very
reliable).
Chris Lilley wrote:
> On the other hand, if the RFC had been written as I suggested, saying
> that a charset parameter overrode *if present* but that *if absent*, the
> rules in the XML recommendation were followed, then you would need no
> server reconfiguration and the rules to follow to have the encoding
> information correctly conveyed to the client would have been a matter of
> public record in the XML recommendation rather than private convention.
> A big win for interoperability, if that had happened.

At *IETF*, the default of the charset parameter for text/HTML *is* 8859-1.
You might want to change this first. It is going to be very difficult or
impossible, since HTTP and MIME people will disagree.

Chris Lilley wrote:
>
> On the other hand, if the RFC had been written as I suggested,

There has been a lot of discussion about this issue. None of your
arguments are new to me. In fact, my original opinion was not so
different from yours but I have changed my mind during the discussion.
For more about this, see the archive of the XML SIG (around April and
May of 1998).

> Murata-san, you asked why a W3C team person was criticising this RFC in
> public. It is because the mission of W3C is to improve interoperability,
> so it is my duty to do so.

You might want to check what the W3C I18N WG has said to the XML CG.

If W3C strongly recommends the use of the charset parameter, the world
will change. XML is the last chance. I am strongly advocating the use of
the charset parameter in Japan whenever possible. On the other hand, if
even a W3C team member does not respect the consensus, there is not much
hope.

Cheers,

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp
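The precedence being argued over in this thread can be written down as a small decision function. What follows is only a sketch under stated assumptions: the function names are invented for the example, the sniffing step is a simplified reading of the XML 1.0 rules (byte order mark, then encoding declaration, then UTF-8) and does not cover every case in the Recommendation's appendix, and the text/xml branch reflects the US-ASCII default that RFC 2376 gives when the charset parameter is omitted.

```python
import codecs
import re
from typing import Optional

# Matches an encoding declaration at the very start of an ASCII-compatible entity.
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(body: bytes) -> str:
    """Simplified XML 1.0 rules: byte order mark, then declaration, then UTF-8."""
    if body.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    if body.startswith(codecs.BOM_UTF8):
        body = body[len(codecs.BOM_UTF8):]
    match = _ENCODING_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"


def effective_encoding(media_type: str, charset: Optional[str], body: bytes) -> str:
    """Pick the encoding a receiver would use under the rules discussed above."""
    if charset:                       # an explicit charset parameter always wins
        return charset.lower()
    if media_type.lower() == "text/xml":
        return "us-ascii"             # RFC 2376 default when charset is omitted
    return sniff_xml_encoding(body)   # application/xml: defer to the XML rules


doc = b'<?xml version="1.0" encoding="Shift_JIS"?><doc/>'
print(effective_encoding("text/xml", None, doc))
print(effective_encoding("application/xml", None, doc))
print(effective_encoding("text/xml", "Shift_JIS", doc))
```

The same bytes come out with three different answers depending only on the label, which is the interoperability concern both sides of the thread are describing.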
From db at eng.sun.com Mon Apr 5 05:48:12 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:01 2004
Subject: IE5.0 does not conform to RFC2376
References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org> <3701411C.784A1D4A@eng.sun.com> <37076818.54025D7B@w3.org> <3707C75D.C303FC59@eng.sun.com> <370807CC.CC9F0E1D@w3.org>
Message-ID: <3708314B.B98240BE@eng.sun.com>

> > > > > What this RFC appears to do is remove author control over correctly
> > > > > labelling the encoding, and ensure that most if not all XML documents
> > > > > get incorrectly labelled as US-ASCII.
> > > >
> > > > Not at all. The best default MIME content type for all web
> > > > servers is "application/xml".
> > >
> > > Why? Do you consider anything not written in US-ASCII as a text
> > > document? I think the Unicode Consortium would disagree with you there.
> >
> > No, and that's not what I said:
>
> But it is the implication of your argument.

How could it imply that? I didn't even talk about what "text" is, only
about what MIME guarantees. And MIME only talks about what some specific
content/media type categories mean, not about what "text" is. (I
certainly hope you see how those are different!)

See RFC 2046 and the discussion in section 4.1.2 for further
information. It says eight bit or multibyte encoded "text/*" "MUST" use
a "charset=..." property, which you seem to dislike; perhaps you were
unaware that MIME has fundamental constraints in this area. RFC 2376 is
being compatible with this fundamental Internet standard, which IMHO is
the right idea.
> > For a single world-wide default; that's easily understood by overworked,
> > underpaid, often untrained sysadmins; and hence is NOT error prone (!!),
> > there's a simple answer that's guaranteed to work right everywhere that
> > pays more than lip service to industry standards, and hence is "best".
> > Namely, that servers report XML documents as "application/xml".

That requires _no conclusions_ about what is or is not "text". It only
says that encoded text is most likely to be dealt with in the correct
way if people label XML text as "application/xml".

- Dave

From db at eng.sun.com Mon Apr 5 07:37:02 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:01 2004
Subject: SUMMARY: XML Validation Issues (was: several threads)
References: <7847B57C7C96D2119DBE00A0C96F64B6206DD5@cen1.cen.com> <370804A1.ACF19AA6@w3.org>
Message-ID: <37084ABE.1E182B80@eng.sun.com>

Chris Lilley wrote:
>
> The assertion has been made that client-side validation is a performance
> load, compared to just parsing the dtd looking for fixed attributes etc;
> but no performance figures were made available. If someone has a parser
> they could instrument and provide some actual measurements on real-world
> data, that would help.

Have a look at Sun's XML parser package ... there's a cost to
validating, but the validating parser is as fast as many other
nonvalidating parsers. That was one of the interesting conclusions we
drew last summer: validation isn't all that expensive.
The first several validating parsers just weren't written with
efficiency in mind; we were 10-20 times (!!) faster than the others we
compared against.

The extra costs are primarily in checking whether the content model has
been followed. Checking other validity constraints is pretty cheap. It
should be very easy to see just what code kicks in differently in the
validating and nonvalidating parsers; look at the code, you'll see what
I mean.

I've noticed something like a 10%-30% time price for validating; it's
also true that the validation code hasn't really been tuned so it's
likely the price can be reduced.

I've also seen XML parsers that reduce such costs by not conforming to
the XML spec, which approach I won't endorse! :-)

- Dave

From anderst at toolsmiths.se Mon Apr 5 10:33:34 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:11:01 2004
Subject: About DOM and CORBA
References: <000001be7e0b$69ef8f00$b7017bce@server.total.net>
Message-ID: <37087566.19AB854F@toolsmiths.se>

Hi Didier,

To make a long story short, Corba does NOT support what I call
"Component Binary Interface", CBI for short. This is what Microsoft have
implemented within their COM.

Some Corba's have however implemented "colocation", which is a shortcut
route so marshalling is not performed when calling objects within the
same address space. See for an example.
Here is a list of implementation options (faster to slower)

1. Private Component calls (not calling across Interfaces)
   No time penalty, same as normal C or C++ calls

2. Public Interface calls (calling other components using interface pointers)
   Same as C++ virtual function calls, slightly slower than above
   The MS COM and Mozilla way.

3. Optimized In-process Collocation call, to a Corba Object Adapter (OA)
   managed object.
   Call routed directly to object. This is outside the Corba spec.

4. In-process Collocation call, to a Corba OA managed object
   No marshaling, but call put in queue or rerouted by OA according to
   Corba guidelines.
   Corba Components implemented in same language.

5. Process to Process call, using Corba OA's
   Parameter marshaling and IIOP calls.
   Corba Components may be implemented in different languages.

6. Inter Machine Process to Process call, using Corba OA's
   Parameter marshaling, IIOP calls and network traffic.

/anders

--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/ Financial Toolsmiths AB /
/ Anders W. Tell /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From larsga at ifi.uio.no Mon Apr 5 15:16:25 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:11:01 2004
Subject: About DOM and CORBA
In-Reply-To: <37087566.19AB854F@toolsmiths.se>
References: <000001be7e0b$69ef8f00$b7017bce@server.total.net> <37087566.19AB854F@toolsmiths.se>
Message-ID:

* Anders W. Tell
|
| 5.Process to Process call, using Corba OA's
| Parameter marshaling and IIOP calls.
| Corba Components may be implemented in different languages.
You don't say how you expect this to be implemented, and there are
various options here. BEA M3 uses shared memory IPC to implement this,
which (in my experience) is pretty fast.

--Lars M.

From keshlam at us.ibm.com Mon Apr 5 15:23:12 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun 7 17:11:01 2004
Subject: IE5.0 does not conform to RFC2376
Message-ID: <8525674A.00496100.00@D51MTA03.pok.ibm.com>

>Because text files should be transmissible as text; XML is a format for
>marked up text.

XML can also be viewed/used as a text-based representation of data. In
fact, I think that's its primary definition, with text markup being a
single application thereof.

(I have no opinion on the MIME-type issue as yet.)

______________________________________
Joe Kesselman / IBM Research
Unless stated otherwise, all opinions are solely those of the author.
From LippmannJ at mmanet.com Mon Apr 5 15:28:00 1999
From: LippmannJ at mmanet.com (Lippmann, Jens)
Date: Mon Jun 7 17:11:01 2004
Subject: XML Performance question
Message-ID: <1CEC4A85AB34D21181C900A0C9CFE1279A77C5@NY_EXCH_01>

Following the XML for the last couple of months, I am surprised how
little attention is paid to performance. My optimistic personality leads
me to the conclusion that performance is not an issue. :)

However, I would be very interested in an expert's guess on the
following problem:

Assume the following XML document:

              0815
              4289.23
              4289.23

Each document will contain about 10^4 elements each will contain between
10 - 10^2 child tags, and I have to handle about 10^2 documents a day,
i.e. we're dealing with 10^7 to 10^8 tags. So far, the benchmarks I've
got are pretty devastating. I have to visit every sub-element of at
least once during the number crunching and I cannot keep everything in
memory. I am considering one of the XML repositories to help me with the
job.

Any comment would be much appreciated.

Jens
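For a workload like the one described above, where every sub-element has to be visited once and the whole document cannot be held in memory, one option is an event-driven parse that keeps only running totals. The following is a minimal sketch, not a benchmark: the element names (`trades`, `trade`, `id`, `amount`) are hypothetical, since the markup of the sample document is not preserved here, and Python's standard `xml.sax` module stands in for whichever event API is actually available.

```python
import xml.sax
from io import BytesIO


class AmountTotaller(xml.sax.ContentHandler):
    """Sums the text of every element called `target` without building a tree."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.total = 0.0
        self.count = 0
        self._buffer = None          # collects text only while inside `target`

    def startElement(self, name, attrs):
        if name == self.target:
            self._buffer = []

    def characters(self, content):
        # The parser may deliver one text node in several chunks.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == self.target and self._buffer is not None:
            self.total += float("".join(self._buffer))
            self.count += 1
            self._buffer = None


def total_amounts(source, target="amount"):
    """Stream `source` (a file name or file object) and return (count, total)."""
    handler = AmountTotaller(target)
    xml.sax.parse(source, handler)
    return handler.count, handler.total


# Hypothetical sample; the real element names are unknown.
sample = (b"<trades>"
          b"<trade><id>0815</id><amount>4289.23</amount></trade>"
          b"<trade><id>0816</id><amount>10.5</amount></trade>"
          b"</trades>")
print(total_amounts(BytesIO(sample)))
```

Memory use in this approach depends on the handler's own state rather than on the size of the document, which is the property the question is asking for; whether throughput is acceptable still has to be measured on real data.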
From martind at netfolder.com Mon Apr 5 15:47:19 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:01 2004
Subject: About DOM and CORBA
In-Reply-To: <37087566.19AB854F@toolsmiths.se>
Message-ID: <000301be7f6a$b50f6ae0$4a02cdd1@server.total.net>

Hi Anders,

Many thanks, this is the first useful post I get on the subject.

Thanks again

Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

-----Original Message-----
From: Anders W. Tell [mailto:anderst@toolsmiths.se]
Sent: Monday, April 05, 1999 4:34 AM
To: Didier PH Martin
Cc: 'XML Dev'
Subject: Re: About DOM and CORBA

Hi Didier,

To make a long story short, Corba does NOT support what I call
"Component Binary Interface", CBI for short. This is what Microsoft have
implemented within their COM.

Some Corba's have however implemented "colocation", which is a shortcut
route so marshalling is not performed when calling objects within the
same address space. See for an example.

Here is a list of implementation options (faster to slower)

1. Private Component calls (not calling across Interfaces)
   No time penalty, same as normal C or C++ calls

2. Public Interface calls (calling other components using interface pointers)
   Same as C++ virtual function calls, slightly slower than above
   The MS COM and Mozilla way.

3. Optimized In-process Collocation call, to a Corba Object Adapter (OA)
   managed object.
   Call routed directly to object. This is outside the Corba spec.

4. In-process Collocation call, to a Corba OA managed object
   No marshaling, but call put in queue or rerouted by OA according to
   Corba guidelines.
   Corba Components implemented in same language.

5. Process to Process call, using Corba OA's
   Parameter marshaling and IIOP calls.
   Corba Components may be implemented in different languages.

6. Inter Machine Process to Process call, using Corba OA's
   Parameter marshaling, IIOP calls and network traffic.

/anders

--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/ Financial Toolsmiths AB /
/ Anders W. Tell /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From anderst at toolsmiths.se Mon Apr 5 16:32:59 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:11:02 2004
Subject: About DOM and CORBA
References: <000001be7e0b$69ef8f00$b7017bce@server.total.net> <37087566.19AB854F@toolsmiths.se>
Message-ID: <3708C9A6.97FBBEAC@toolsmiths.se>

Lars Marius Garshol wrote:

> * Anders W. Tell
> |
> | 5.Process to Process call, using Corba OA's
> | Parameter marshaling and IIOP calls.
> | Corba Components may be implemented in different languages.
>
> You don't say how you expect this to be implemented, and there are
> various options here. BEA M3 uses shared memory IPC to implement this,
> which (in my experience) is pretty fast.

You are right, I just put a few alternatives in a scale from fast to
slower. Shared memory IPC is faster than most other IPC's on the same
machine but it's still significantly slower than the two step InProcess
collocation calls that TAO uses.

/anders

--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/ Financial Toolsmiths AB /
/ Anders W. Tell /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
From anderst at toolsmiths.se Mon Apr 5 16:45:56 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:11:02 2004
Subject: XML Performance question
References: <1CEC4A85AB34D21181C900A0C9CFE1279A77C5@NY_EXCH_01>
Message-ID: <3708CCA2.8BD6C292@toolsmiths.se>

"Lippmann, Jens" wrote:

> Following the XML for the last couple of months, I am surprised how little
> attention is paid to performance. My optimistic personality leads me to the
> conclusion that performance is not an issue. :)

I think you have to look at different UseCases; for some XML is perfect
or sufficient but for others you really should use something else.

I don't agree with your optimistic view that performance is not an
issue :). My personal view is that there are better solutions than XML
for a number of important use cases and yours is probably one of them.
XML is very useful for text information but often slow for data such as
numbers. XML also handles many elements with long tag names poorly.

>
> However, I would be very interested in an expert's guess on the following
> problem:
>
> Assume the following XML document:
>
> 0815
> 4289.23
> 4289.23
>
> Each document will contain about 10^4 elements each will contain
> between 10 - 10^2 child tags, and I have to handle about 10^2 documents a
> day, i.e. we're dealing with 10^7 to 10^8 tags. So far, the benchmarks I've
> got are pretty devastating. I have to visit every sub-element
> of at least once during the number crunching and I cannot keep
> everything in memory. I am considering one of the XML repositories to help
> me with the job.
What do you want to do with your document?

From a brief look at your UseCase it looks like a relational DB should
be better.

Another comment is, why not use the event API (SAX if you are using
Java) instead of putting the whole file in a DOM tree (in main memory)?

/anders

--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/ Financial Toolsmiths AB /
/ Anders W. Tell /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From elharo at metalab.unc.edu Mon Apr 5 17:13:36 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:11:02 2004
Subject: XML Torture Test: Parsers Fail
In-Reply-To: <37087566.19AB854F@toolsmiths.se>
References: <000001be7e0b$69ef8f00$b7017bce@server.total.net>
Message-ID:

Without intending to do so, I have devised an XML document that exposes
many problems in almost all XML validating parsers and non-validating
parsers that resolve external entity references. You will find this
torture test at

http://metalab.unc.edu/examples/players/index.xml

It has broken every parser I've thrown at it in one way or another
including the one in IE5 with the single exception of RXP. However RXP
reports some warnings that do not appear to be errors, and missed some
problems involving the lack of encoding declarations in the text
declarations in an earlier version that xml4j 2.0.4 (but not 1.1.14)
picked up. These have now been fixed.

As best I can tell this document is both well-formed and valid. It's
hard to say for sure when many different parsers all fail to process
it, mostly after either giving up completely or generating incorrect
error messages.
Until I'm more confident the document is correct, I'm simply defining a
broken parser as one that

1. describes a valid document as invalid (Microsoft?, xml4j?)
2. describes an invalid document as valid (RXP)
3. describes an invalid document as invalid but gives the wrong reason.
   (Microsoft?, xml4j?)

Once I've conclusively determined whether my document is valid, I
should be able to determine whether Microsoft, xml4j and xml4j fit into
category 1 or 3 or both.

What's torturous about this example is that it defines over 1000
separate external general entity references in several dozen different
DTDs. Currently only one of those entities is actually used in the main
document, but I plan to expand it to use all 1000+ entities. Thus it's
likely to become even more difficult to parse properly.

Leaving aside the question of whether this is the proper design for
this document, it's nonetheless the case that parsers should be able to
handle it. Parser authors may wish to investigate further. The
assistance of anyone who can spot by eye mistakes I made that the
parsers may be incorrectly reporting is appreciated.

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML: Extensible Markup Language (IDG Books 1998)                   |
| http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/     |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/   |
| Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/      |
+----------------------------------+---------------------------------+
From elharo at metalab.unc.edu Mon Apr 5 17:55:17 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:11:02 2004
Subject: XML Torture Test: Parsers Fail
In-Reply-To: <3.0.32.19990405082309.00c4fcc0@pop.intergate.bc.ca>
Message-ID:

At 8:23 AM -0700 4/5/99, Tim Bray wrote:
>At 10:06 AM 4/5/99 -0500, you wrote:
>>Without intending to do so, I have devised an XML document that exposes
>>many problems in almost all XML validating parsers and non-validating
>>parsers that resolve external entity references. You will find this
>>torture test at
>>
>>http://metalab.unc.edu/examples/players/index.xml
>
>Doesn't seem to be there. -Tim

Oops. That should be

http://metalab.unc.edu/xml/examples/players/index.xml

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML: Extensible Markup Language (IDG Books 1998)                   |
| http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/     |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/   |
| Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/      |
+----------------------------------+---------------------------------+
From simpson at polaris.net Mon Apr 5 18:49:36 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun 7 17:11:02 2004
Subject: XML Torture Test: Parsers Fail
Message-ID: <3.0.32.19990405124357.007d2530@polaris.net>

At 10:57 AM 4/5/1999 -0500, Elliotte Rusty Harold wrote:
>At 8:23 AM -0700 4/5/99, Tim Bray wrote:
>>At 10:06 AM 4/5/99 -0500, you wrote:
>>>Without intending to do so, I have devised an XML document that exposes
>>>many problems in almost all XML validating parsers and non-validating
>>>parsers that resolve external entity references. You will find this
>>>torture test at
>>>
>>>http://metalab.unc.edu/examples/players/index.xml
>>
>>Doesn't seem to be there. -Tim
>
>Oops. That should be
>
>http://metalab.unc.edu/xml/examples/players/index.xml

Elliotte, I just tried it at the U. of Edinburgh's Language Technology
on-line validator, at:

http://www.cogsci.ed.ac.uk/~richard/xml-check.html

(Based on Richard Tobin's RXP parser.)

This parser reports the torture test as well-formed when operating in
non-validating mode; when validating, it issues only warnings, to the
effect that the content model of the PLAYER element is
non-deterministic, e.g. "After element GS there are multiple choices
when the next element is S." It doesn't seem to be bothered by the
entities.

=============================================================
John E. Simpson          | It's no disgrace t'be poor,
simpson@polaris.net      | but it might as well be.
                         |            -- "Kin" Hubbard
From claudio.vernacotola at crpht.lu Mon Apr 5 19:10:25 1999
From: claudio.vernacotola at crpht.lu (claudio.vernacotola@crpht.lu)
Date: Mon Jun 7 17:11:02 2004
Subject: DOM questions
Message-ID: <4125674A.0062BD24.00@mmfileserver.crpht.lu>

Hi everybody.

I'm beginning to get some insight into the DOM specs; nevertheless some
points still seem dark to me. I would really appreciate it if someone
could enlighten me (and others interested maybe) on the following
points.

Can an attribute have more than one entity reference as a child?

How should a DOM implementation behave if a DOM user appends to an Attr
node an unexpanded entity reference that contains Element nodes as
children? In other words, is it true that if an entity reference is
appended to an Attr node, it can contain only Text nodes as children?

Can an Attr node have a non-null value and, at the same time, no Text
children?

When cloning an Attr node with deep set to false, does the cloned Attr
node retain the attribute value? In this case should it also have a
Text child? So, if the original attribute had its value split among
several Text child nodes, will the cloned attribute contain only one
Text child holding the attribute value? Or is all this implementation
specific (irrelevant to the DOM specs)?

What should the getOwnerDocument method return for an Entity node (or
for a Notation or for a DocumentType), as no createX method is
specified in the Document interface?

Last question: From the DOM specs, the getOwnerDocument method should
return null for a Document node.
Why wouldn't it be better to return the Document node itself, as this (it seems to me) would simplify implementation efforts a little (for example, when checking the owner Document of a Node when appending it as a child of a Document node)?

Thanks a lot.

Regards,
Claudio.

From david at megginson.com Mon Apr 5 19:47:58 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:02 2004
Subject: XML Torture Test: Parsers Fail
In-Reply-To: <3.0.32.19990405124357.007d2530@polaris.net>
References: <3.0.32.19990405124357.007d2530@polaris.net>
Message-ID: <14088.63020.909562.62441@localhost.localdomain>

John E. Simpson writes:

 > At 10:57 AM 4/5/1999 -0500, Elliotte Rusty Harold wrote:
 > >>>Without intending to do so, I have devised an XML document that exposes
 > >>>many problems in almost all XML validating parsers and non-validating
 > >>>parsers that resolve external entity references. You will find this
 > >>>torture test at
 > >>>
 > >>>http://metalab.unc.edu/examples/players/index.xml
 > >>
 > >>Doesn't seem to be there. -Tim
 > >
 > >Oops. That should be
 > >
 > >http://metalab.unc.edu/xml/examples/players/index.xml
 >
 > Elliotte, I just tried it at the U. of Edinburgh's Language Technology
 > on-line validator, at:
 > http://www.cogsci.ed.ac.uk/~richard/xml-check.html
 > (Based on Richard Tobin's RXP parser.)
 >
 > This parser reports the torture test as well-formed when operating in
 > non-validating mode; when validating, it issues only warnings, to the
 > effect that the content model of the PLAYER element is non-deterministic,
 > e.g. "After element GS there are multiple choices when the next element is
 > S." It doesn't seem to be bothered by the entities.

Neither AElfred nor XP reports any errors for this document either. Perhaps the problems that Elliotte exposes are the exception rather than the rule (I do not have Lark on my system right now, or I'd check it as well).

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From db at eng.sun.com Mon Apr 5 20:32:56 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:02 2004
Subject: XML Torture Test: Parsers Fail
References:
Message-ID: <37090068.99DD48A6@eng.sun.com>

For some reason I've seen three followups to this note, but not the original note ...

Elliotte Rusty Harold wrote:
>
> >>Without intending to do so, I have devised an XML document that exposes
> >>many problems in almost all XML validating parsers and non-validating
> >>parsers that resolve external entity references. You will find this
> >>torture test at
>
> http://metalab.unc.edu/xml/examples/players/index.xml

I ran Sun's parsers and they warned of a variety of redefined entities ... for example, "&AaronSmall;" defined in two files, both "athletics.dtd" and "diamondbacks.dtd". (See the list below.)

I did notice that this relies on correct interpretation of relative URIs, which I know have been handled incorrectly by at least two other "validating" parsers (perhaps not in their current releases though).
- Dave

** Warning, line 1, uri http://metalab.unc.edu/xml/examples/players/diamondbacks.dtd
Using original entity definition for "&AaronSmall;".
** Warning, line 21, uri http://metalab.unc.edu/xml/examples/players/indians.dtd
Using original entity definition for "&JimPoole;".
** Warning, line 16, uri http://metalab.unc.edu/xml/examples/players/mariners.dtd
Using original entity definition for "&GlenallenHill;".
** Warning, line 1, uri http://metalab.unc.edu/xml/examples/players/marlins.dtd
Using original entity definition for "&AlexGonzalez;".
** Warning, line 20, uri http://metalab.unc.edu/xml/examples/players/padres.dtd
Using original entity definition for "&KevinBrown;".
** Warning, line 11, uri http://metalab.unc.edu/xml/examples/players/pirates.dtd
Using original entity definition for "&FreddyGarcia;".
** Warning, line 9, uri http://metalab.unc.edu/xml/examples/players/reds.dtd
Using original entity definition for "&DennisReyes;".
** Warning, line 2, uri http://metalab.unc.edu/xml/examples/players/rockies.dtd
Using original entity definition for "&BobbyJones;".
** Warning, line 33, uri http://metalab.unc.edu/xml/examples/players/tigers.dtd
Using original entity definition for "&ScottSanders;".
** Warning, line 22, uri http://metalab.unc.edu/xml/examples/players/whitesox.dtd
Using original entity definition for "&MarkJohnson;".

From ricko at allette.com.au Mon Apr 5 20:43:51 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:11:02 2004
Subject: IE5.0 does not conform to RFC2376
Message-ID: <001c01be7f8c$3c5bf100$4ef96d8c@NT.JELLIFFE.COM.AU>

From: Chris Lilley

>I was meaning autodetection in the sense of reading a whole bunch of the
>text and making assorted guesses based on frequency analysis and the
>like. In other words, automatic detection based on unlabelled content. I
>believe that this is a bad thing, because there is always the
>possibility (quite high) of getting it wrong.
>
>The encoding declaration, on the other hand, is not autodetection in
>that sense, it is a label. A very small amount of autodetection has to
>be done in order to be sure that the label has been read, that is all
>(ie, is this UTF-16 or is this an encoding where ASCII is represented as
>ASCII).

In academic material, this is called "codeset announcement" (e.g. in the HANZIX OS from the mid 90s). The term "autodetection" does give people the idea that guessing is involved.

This is important, because if developers think that autodetection means guessing rather than codeset announcement, they may be tempted to guess encodings without alerting users that something seems strange: this would not be satisfactory for important documents.

Rick Jelliffe

From DuCharmR at moodys.com Mon Apr 5 21:06:30 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:11:02 2004
Subject: regexp lib for java xml development?
Message-ID: <84285D7CF8E9D2119B1100805FD40F9F2550E5@MDYNYCMSX1>

Can anyone recommend a Java regular expression library for developing Java apps to process XML? OROINC's seems popular on the web, but I'm having trouble running the examples and was wondering if I'm missing anything good out there.

thanks,

Bob DuCharme www.snee.com/bob
"The elements be kind to thee, and make thy spirits all of comfort!" Anthony and Cleopatra, III ii

From marcelo at mds.rmit.edu.au Tue Apr 6 02:27:06 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun 7 17:11:02 2004
Subject: XML Performance question
In-Reply-To: <1CEC4A85AB34D21181C900A0C9CFE1279A77C5@NY_EXCH_01>; from Lippmann, Jens on Mon, Apr 05, 1999 at 09:25:48AM -0400
References: <1CEC4A85AB34D21181C900A0C9CFE1279A77C5@NY_EXCH_01>
Message-ID: <19990406102643.A12514@io.mds.rmit.edu.au>

On Mon, Apr 05, 1999 at 09:25:48AM -0400, Lippmann, Jens wrote:
> Following the XML for the last couple month, I am surprised how little
> attention is paid to performance. My optimistic personality leads me to the
> conclusion that performance is not an issue. :)
>
> However, I would be very interested on an expert's guess on the following
> problem:
>
> Assume the following XML document:
>
>               0815
>               4289.23
>               4289.23
>
> Each document will contain about 10^4 elements each will contain
> between 10 - 10^2 child tags, and I have to handle about 10^2 documents a
> day, i.e. we're dealing with 10^7 to 10^8 tags. So far, the benchmarks I've
> got are pretty devastating. I have to visit every sub-element
> of at least once during the number crunching and I cannot keep
> everything in memory. I am considering one of the XML repositories to help
> me with the job.

I just ran one million elements through SP with a scripting language on top of it. The run took 7m 15s. This extrapolates to 12 hours for 10^8 tags. This could easily be sped up by:

1. Using expat instead of SP (this makes a _big_ difference).
2. Accessing the data from C++ rather than a script language.
3. Shortening your element names (currently they overload the data; they seem to incur roughly a 12% performance hit, and this would get much worse if you were looking for specific elements during parsing).

I ran some brief tests, handling 10^6 elements with no processing (beyond parsing, that is), using expat in C. It completed in just under 2 minutes. This would suggest that 100 of the largest possible documents would take approximately 3h 20m.

This extremely rough analysis suffices to establish some idea of the lower bound for your problem. It doesn't address the full complexity of your situation, since we don't know the specifics of what you are trying to achieve.

Also note that these figures were acquired using an event model, rather than a parse tree. This can have a significant impact on the performance.
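As a rough illustration of the event-model approach and the linear extrapolation described above (not the actual test harness, which the thread does not show), a minimal sketch might look like this. It assumes Python's standard xml.parsers.expat binding rather than expat called from C; the names count_elements and extrapolate_hours and the sample document are hypothetical.

```python
import xml.parsers.expat


def count_elements(xml_bytes):
    """Count start tags using expat's callback (event) model; no tree is built."""
    counts = {"elements": 0}

    def on_start(name, attrs):
        # Invoked once per start tag; nothing is kept in memory between events.
        counts["elements"] += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk of input
    return counts["elements"]


def extrapolate_hours(seconds_per_million, total_elements):
    """Scale a timing for 10^6 elements linearly to a larger workload."""
    return seconds_per_million * (total_elements / 1_000_000) / 3600


# Hypothetical sample; the element names are placeholders, not from the thread.
sample = b"<doc><row><id>0815</id><amt>4289.23</amt></row></doc>"
print(count_elements(sample))                   # 4
print(round(extrapolate_hours(435, 10**8), 1))  # 7m 15s per 10^6 -> 12.1
print(round(extrapolate_hours(120, 10**8), 1))  # 2m per 10^6 -> 3.3
```

The two extrapolations reproduce the "12 hours" and "3h 20m" figures quoted in the message, under the assumption that cost grows linearly with element count.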
It may well be that your processing requirements don't permit an event-based approach, in which case the above figures are meaningless (this situation is less likely than is commonly perceived, however).

Finally, note that this was all done in one thread (a 333 UltraSPARC). Multiple threads could potentially improve this figure substantially. Spreading the second test across 2 cpus brought the time down to 70 seconds (2 hours for 100 documents). Of course, this depends on your hardware.

Cheers,
Marcelo

--
http://www.simdb.com/~marcelo/

From Ed at dega.com Tue Apr 6 02:37:26 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun 7 17:11:02 2004
Subject: regexp lib for java xml development?
Message-ID: <30649320C177D111ADEC00A024E9F2971D22AE@exchange-server.dega.com>

ORO has both AwkTools and PerlTools and OROMatcher. I think you need to combine OROMatcher with one of these to make it work. Please let me know if you get this to work and how.

On a meta-note: I've seen some of the C/C++ GNU utils ported to Java. (Not many though.) It's a real shame that there isn't a more concerted effort to do this. Having GNU in my C++ days was a real godsend. I didn't have to re-invent many wheels or pay for good quality code to get the work done.

Ed

Ed Howland
ed@dega.com
http://www.dega.com
"As your attorney, I advise you to take some adrenalchrome"

> -----Original Message-----
> From: DuCharme, Robert [mailto:DuCharmR@moodys.com]
> Sent: Monday, April 05, 1999 12:10 PM
> To: xml-dev@ic.ac.uk
> Subject: regexp lib for java xml development?
>
>
> Can anyone recommend a Java regular expression library for developing
> Java apps to process XML? OROINC's seems popular on the web, but I'm
> having trouble running the examples and was wondering if I'm missing
> anything good out there.
>
> thanks,
>
> Bob DuCharme www.snee.com/bob
> "The elements be kind to thee, and make thy
> spirits all of comfort!" Anthony and Cleopatra, III ii

From sdw at lig.net Tue Apr 6 02:45:37 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun 7 17:11:02 2004
Subject: XML Performance question
References: <1CEC4A85AB34D21181C900A0C9CFE1279A77C5@NY_EXCH_01>
Message-ID: <37095A6F.424917DF@lig.net>

This is among the problems that I'm trying to address with what I was calling 'bXML' or 'binary XML' and which I'm now considering calling 'sXML' (collision?) or 'bsXML' as in 'structured XML' or more appropriately: 'binary structured XML'. Comments please!

As you can see in previous posts, the idea was to maintain XML functionality but use a format that had structured encoding in flat character blocks that would allow fast searching and traversal.

This will be a good test when ready to benchmark, pending.
sdw

"Lippmann, Jens" wrote:

> Following the XML for the last couple month, I am surprised how little
> attention is paid to performance. My optimistic personality leads me to the
> conclusion that performance is not an issue. :)
>
> However, I would be very interested on an expert's guess on the following
> problem:
>
> Assume the following XML document:
>
>               0815
>               4289.23
>               4289.23
>
> Each document will contain about 10^4 elements each will contain
> between 10 - 10^2 child tags, and I have to handle about 10^2 documents a
> day, i.e. we're dealing with 10^7 to 10^8 tags. So far, the benchmarks I've
> got are pretty devastating. I have to visit every sub-element
> of at least once during the number crunching and I cannot keep
> everything in memory. I am considering one of the XML repositories to help
> me with the job.
>
> Any comment would be much appreciated.
>
> Jens

From zmin at atpage.com Tue Apr 6 03:24:36 1999
From: zmin at atpage.com (min zheng)
Date: Mon Jun 7 17:11:02 2004
Subject: regexp lib for java xml development?
References: <30649320C177D111ADEC00A024E9F2971D22AE@exchange-server.dega.com>
Message-ID: <031401be7fcc$ea158f50$f66f6f0a@atpage>

Check out this page:

http://meurrens.ml.org/ip-Links/java/regex/

From boblyons at unidex.com Tue Apr 6 04:09:12 1999
From: boblyons at unidex.com (Robert C. Lyons)
Date: Mon Jun 7 17:11:03 2004
Subject: regexp lib for java xml development?
Message-ID: <01BE7FB0.2F4D0160@cc398234-a.etntwn1.nj.home.com>

Bob,

The GNU site (http://www.gnu.org/software/java/java-software.html) lists a free Java regex library called "gnu.regexp", which is available at http://www.cacas.org/~wes/java/.

If you need a commercial product, then check out the "pat" package at http://javaregex.com/. It enables a Java application to compile and use perl5 regular expressions. It's only $10, and there is no per copy royalty fee if you are just distributing with your application or applet. You can download an eval copy for free. The documentation is good. I didn't find any bugs when I was evaluating it.

A few months ago, I too was searching for a Java regex library for my XML app. The "pat" package was not fast enough for my needs. I ended up changing my design, so that I would not need a regex library.

Bob

------
Bob Lyons
EC Consultant
Unidex Inc.
1-732-975-9877
boblyons@unidex.com
http://www.unidex.com/

-----Original Message-----
From: DuCharme, Robert [SMTP:DuCharmR@moodys.com]
Sent: Monday, April 05, 1999 3:10 PM
To: xml-dev@ic.ac.uk
Subject: regexp lib for java xml development?
Can anyone recommend a Java regular expression library for developing Java apps to process XML? OROINC's seems popular on the web, but I'm having trouble running the examples and was wondering if I'm missing anything good out there.

thanks,

Bob DuCharme www.snee.com/bob
"The elements be kind to thee, and make thy spirits all of comfort!" Anthony and Cleopatra, III ii

From jelks at jelks.nu Tue Apr 6 05:00:43 1999
From: jelks at jelks.nu (Jelks Cabaniss)
Date: Mon Jun 7 17:11:03 2004
Subject: SUMMARY: XML Validation Issues (was: several threads)
In-Reply-To: <370804A1.ACF19AA6@w3.org>
Message-ID:

Chris Lilley wrote:

> I don't sense consensus yet on whether client-side validation is always
> desirable; it clearly is in some cases and clearly adds little in other
> cases.

Wouldn't it depend on what the client is? The creation of XML 1.0 (as opposed to just well-formed SGML) made validation optional; didn't the designers have browsers in mind when they made this decision; in fact wasn't it the MAIN reason they made such a decision?

> The assertion has been made that client-side validation is a performance
> load, compared to just parsing the dtd looking for fixed attributes etc;
> but no performance figures were made available. If someone has a parser
> they could instrument and provide some actual measurements on real-world
> data, that would help.

Assuming that validation were equally as fast, I still don't think that makes a case for *forcing web browsers* to do what XML 1.0 says is optional.

In another message:

> My feeling is that there are three classes of implementation, that
> should all have names:
>
> minimal well-formed - never tries to follow external entities
> full well-formed - always tries to follow external entities
> full validating - always tries to follow external entities and validates

Agreed.

> and it should be possible to always derive what class of implementation
> a particular instance requires. My current take on this is that
>
> "standalone="yes" is how you declare that a minimal well-formed parser
> is sufficient; that

Sounds good.

> you indicate that a validating parser is required

I don't like this (though evidently a number of people are assuming or advocating it). If validation is optional, it's optional -- even if there's a stray in the DTD. Maybe the author is building his DTD and doesn't want to validate it until he's good and ready. Maybe it's an older DTD, he doesn't care about validity any more, and all he wants are default attributes for styling purposes. Must he remove all

and that all other cases are saying that the full-well-formed parser is
> required.

That sounds good. But IMO "all other cases" should currently include documents having DTDs with

Has anyone read the latest article on XML implementation in Office 2000?

http://webreview.com/wr/pub/1999/04/02/edge/index.html?wwwrrr_19990402.txt

What does the STANDARDIZATION of XML have to do with enforcing a software standard? If Lotus and Corel and all these other office suite vendors adopt XML as the standard reads, and Microsoft maims XML implementation so that an XML doc saved in Office 2000 can't be read by other vendors' applications, does it not become clear who is in the wrong here? No one (that I can think of) WANTS Microsoft to implement their own 'flavor' of XML. The whole point of XML (I gathered) was so that anyone's browser (or application) could open and read ranting.xml without fear of not having the "right" software to read the document.

I have Office (I have no choice, since it's the "standard") but when I have to share something, if I can't fit it in Outlook (another "standard") then I attach a .txt file. I don't have to worry about anyone's application not supporting that. If it's too big, or needs graphics, then I post it on the web or intranet, in HTML 3.2 (no DHTML or stinkin' BLINK tags) and forget about it, cuz I know no one should have trouble reading it. Just like the guy in this article. Why not apply the same logic to his documents that he does to his webpages? Then he doesn't have to chastise anyone (like anyone using Linux) about not using Word.

Ok, I feel a bit better now....

Jason A. Buss
Single Engine Technical Publications
Cessna Aircraft Co.
jabuss@cessna.textron.com
"I don't dislike Microsoft products... I still use Flight Simulator religiously..."

From chris at w3.org Tue Apr 6 14:46:46 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:03 2004
Subject: IE5.0 does not conform to RFC2376
References: <199904050220.AA00169@archlute.apsdc.ksp.fujixerox.co.jp>
Message-ID: <370A0194.C175B421@w3.org>

MURATA Makoto wrote:
>
> Chris Lilley wrote:
> > The vast majority of content authors have *no control whatsoever* on
> > server configuration. This isn't 1993; assuming that the person who
> > wrote the content is also the person who administers the server is
> > totally unwarranted.
>
> To overcome this problem, Uchida-san is proposing a convention for WWW server
> configurations. His proposal is already used by some ISPs in Japan. It is
> available at:
>
> http://www.asahi-net.or.jp/~sd5a-ucd/docs/suffix_guideline_981106.txt

It's good to see a concrete proposal. On the other hand, relying on a complex convention of filename suffixes is problematic:

- it either requires content negotiation to be enabled (something not all servers can do) or it results in a multiplicity of URIs for the same resource.

- it requires all content authoring applications to know about it and to offer to save using this naming convention

- it duplicates (and may contradict) the XML encoding declaration

- the information stored in this way may be lost when saving local copies to systems which do not allow double dots in filenames or which have other restrictions.

The XML encoding declaration, on the other hand, is much more robust in the face of the multiplicity of file systems in use.
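Because the encoding declaration travels inside the file, it can be read back with a few bytes of lookahead and turned into a MIME charset label. The following is a minimal sketch of that idea, not code from any server mentioned in this thread: the function names sniff_xml_encoding and content_type_for are hypothetical, only byte order marks and ASCII-compatible declarations are handled (EBCDIC and BOM-less 16-bit input are out of scope), and the UTF-8 fallback follows the XML 1.0 default for entities with neither a BOM nor a declaration.

```python
import re

# Encoding pseudo-attribute of an XML (or text) declaration, as it appears
# in an ASCII-compatible byte stream.
_DECL = re.compile(
    rb'^<\?xml(?:\s[^>]*?)?\sencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data):
    """Return the encoding label announced at the start of an XML entity."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "UTF-8"   # UTF-8 byte order mark
    if data.startswith((b"\xfe\xff", b"\xff\xfe")):
        return "UTF-16"  # UTF-16 byte order mark, either byte order
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    return "UTF-8"       # XML 1.0 default when nothing is declared


def content_type_for(data):
    """Build a Content-Type value whose charset mirrors the declaration."""
    return "text/xml; charset=" + sniff_xml_encoding(data)


doc = b'<?xml version="1.0" encoding="Shift_JIS"?><a/>'
print(content_type_for(doc))  # text/xml; charset=Shift_JIS
```

A real deployment would read only the first few hundred bytes of each file and cache the result per file, so the label and the header cannot drift apart.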
The second condition is made harder because two alternative syntaxes are proposed in this note - so a content author has to know which convention is used on a particular server. Also, the note says that these are only two of many possibilities.

An alternative method for achieving the same result is to use a filter (this can be done in Apache and in Jigsaw) which automatically emits the correct charset parameter based on reading the encoding declaration in the XML instance. This can easily cache its results, and need not result in processing overhead on each request.

Of course, this still requires work - for example, to ensure that it is included in the standard Apache distribution; but it is easier than trying to get the hundreds of authoring tools to support a couple of naming conventions which may in any case be hard to deal with on some platforms (platforms are still in use which have trouble with .html, for example ;-)

> Chris Lilley wrote:
> >
> > But not necessarily everyones favourite. It is a good choice for
> > Japanese, because Kanji use less bytes per character in UTF-16 than in
> > UTF-8.
> >
> > > (In the case that the charset is broken, autodetection of
> > > UTF-16 is very easy.
> >
> > But autodetection should not be required; users can label their
> > documents correctly.
>
> To me, the biggest advantage of UTF-16 is that UTF-16 XML documents can parse
> only as UTF-16. Even if the charset parameter is incorrect, UTF-16 XML documents
> do not parse incorrectly (and error recovery is very reliable).

I am wary of relying on error recovery. If it doesn't work well, then there is reduced interoperability because of variation; if it does work well, or seems to work well in some cases, then people just use it all the time.

> Chris Lilley wrote:
> > On the other hand, if the RFC had been written as I suggested, saying
> > that a charset parameter overrode *if present* but that *if absent*, the
> > rules in the XML recommendation were followed, then you would need no
> > server reconfiguration and the rules to follow to have the encoding
> > information correctly conveyed to the client would have been a matter of
> > public record in the XML recommendation rather than private convention.
> > A big win for interoperability, if that had happened.
>
> At *IETF*, the default of the charset parameter for text/HTML *is* 8859-1.

Yes, which is different to the default for text/* - this demonstrates that it is possible to give a more specific rule for a particular registration. I gave an example of a particular rule for text/xml which would have saved all this bother.

> You might want to change this first.

Why? It is XML we are speaking of here.

> It is going to be very difficult or
> impossible, since HTTP and MIME people will disagree.

I think you mean, HTTP and Mail(SMTP/IMAP/POP). MIME is used by both email and HTTP.

> There have been a lot of discussion about this issue. None of your arguments
> are new to me. In fact, my original opinion was not so different from yours but
> I have changed my mind during the discussion. More about this, see the archive
> of the XML SIG (around April and May of 1998).

OK, I will check this out. I cannot of course discuss such material in this forum, however. Perhaps you could post your technical reasons for the change of direction here?

> > Murata-san, you asked why a W3C team person was criticising this RFC in
> > public. It is because the mission of W3C is to improve interoperability,
> > so it is my duty to do so.
>
> You might want to check what the W3C I18N WG has said to the XML CG. If
> W3C strongly recommends the use of the charset parameter, the world will
> change.

Sure, in the absence of any other indication, server-applied labelling is certainly better than no labelling or guesswork. I have nothing against the use of the charset parameter. But, if it is not present, then the XML Rec says exactly what should happen; careful wording which this RFC nullifies.

Problems arise if an XML file is saved from the Web to a local filesystem, perhaps for further editing; the MIME charset information is lost. It could perhaps be stored in some way - but, there is already a standard way - the XML encoding declaration.

And if the charset parameter is present, then it should say the same thing as the encoding declaration. The best way to ensure this is to treat the XML encoding declaration as the primary metadata resource and to programmatically derive the charset parameter from this; greater robustness is at once achieved and also harmonisation of the MIME and XML labelling.

> XML is the last chance.

I agree, it is important to get it right.

> I am strongly advocating the use of the
> charset parameter in Japan whenever possible.

Great. On the other hand, you seem to be trying to do so by enforcing a different default charset than that in the XML Recommendation, which means that local files and remote files work differently; this is clearly not desirable.

> On the other hand, if even a
> W3C team member does not respect the consensus, there is not much hope.

I think that last comment was beneath you, and would thank you to restrict yourself to technical argument.

However, I will point out that it is the consensus of the XML 1.0 Recommendation that I am respecting - and that the RFC does not, by altering the meaning of the default encoding. It could have been harmonised with the XML REC; it was not.

Redundancy can be good; a charset parameter and an XML encoding declaration that say the same thing and work the same way, which is what I was suggesting, is good.
What you are suggesting, which is a charset parameter and an XML encoding declaration that work in different ways, is clearly suboptimal. -- Chris xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Apr 6 15:55:57 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:11:03 2004 Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376) Message-ID: <002901be802d$24f65680$27f96d8c@NT.JELLIFFE.COM.AU> From: Chris Lilley >An alternative method for achieving the same result is to use a filter >(this can be done in Apache and in Jigsaw) which automatically emits the >correct charset parameter based on reading the encoding declaration in >the XML instance. I think this is the approach that, ultimately, we all are hoping will be deployed. We are having a variant of this at our site: for when serving XHTML one must * make sure the HTML meta tag is correct (for HTML conformance) * make sure the XML encoding is correct (for XML conformance) * make sure the MIME charset is correct (for MIME conformance) Three chances to get something wrong! (And not forgetting that some HTML editors push the metatags or PI around, so you may end up with duplicated tags relating to character set inside the document.) Given that an XML processor may transcode the document without knowing the meanings of the elements (i.e., that the meta tag means something), the XML encoding has to have priority over the HTML meta tag value. 
And given that proxies can transcode text/* files without knowing what kind of text it is (i.e., that it is XML, and so has a label), the MIME header has to have priority over the XML header PI. I think that is the logical order: generic operations must be allowed.

However, it is all spoiled if there are systems which corrupt the labels: for example by rewriting the charset parameter incorrectly. It is far better to send the XML file without a charset parameter than to send it with a wrong one.

>And if the charset parameter is present, then it should say the same
>thing as the encoding declaration. The best way to ensure this is to
>treat the XML encoding declaration as the primary metadata resource and
>to programmatically derive the charset parameter from this; greater
>robustness is at once achieved and also harmonisation of the MIME and
>XML labelling.

Yes, with the exception that the XML encoding PI could itself be derived from internal data (e.g. in XHTML). All of them need to be harmonized.

The problem isn't really "which should have precedence?" (because all systems will break somewhere, given the state of webservers and current awareness) as much as it is "how can we move the Web towards safety and interoperability, where the markup now available at each stage is made available to the next?" I think the current XML media types for MIME give the appropriate policy for charset preference (transcodability is one property of text/* which application/* must not have), but, as Chris is pointing out, mechanisms to set the MIME charset parameter from XML (and to overwrite it on delivery too) have yet to be put into place.

>However, I will point out that it is the consensus of the XML 1.0
>Recommendation that I am respecting - and that the RFC does not, by
>altering the meaning of the default encoding. It could have been
>harmonised with the XML REC; it was not.
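The precedence order argued above (MIME charset first, then the XML encoding declaration, then the HTML meta tag) can be sketched roughly as follows. This is only an illustration of that ordering, not anything taken from the RFC or from a real server: the `Labels` type and both function names are hypothetical, charset aliases are not resolved, and what to do when no label is present at all is deliberately left to the caller, since that default is exactly what the thread disputes.

```python
from typing import List, NamedTuple, Optional


class Labels(NamedTuple):
    """The three places an XHTML document's encoding can be labelled.

    Field order is the precedence order argued for in the message above.
    """
    mime_charset: Optional[str]   # charset parameter of the Content-Type header
    xml_encoding: Optional[str]   # encoding="..." in the XML declaration
    html_meta: Optional[str]      # charset taken from an HTML meta element


def _norm(name: Optional[str]) -> Optional[str]:
    # Charset names compare case-insensitively; aliases are NOT resolved here.
    return name.strip().lower() if name else None


def effective_encoding(labels: Labels) -> Optional[str]:
    """Return the first label that is present, outermost layer first."""
    for value in labels:
        if _norm(value):
            return _norm(value)
    return None  # unlabelled: the default is left to the caller


def mismatches(labels: Labels) -> List[str]:
    """List every label that disagrees with the effective encoding."""
    chosen = effective_encoding(labels)
    return [
        f"{field}={value}"
        for field, value in labels._asdict().items()
        if _norm(value) and _norm(value) != chosen
    ]


if __name__ == "__main__":
    example = Labels("Shift_JIS", "shift_jis", "EUC-JP")
    print(effective_encoding(example))
    print(mismatches(example))
```

Whether a non-empty result from `mismatches` should be treated as a warning or as a fatal error is a policy choice that the sketch leaves open.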
I think the XML SIG and WG pretty much all had consensus on the RFC at the end, in full knowledge of XML 1.0. But I think many of us came out of it thinking that it is safer to use application/xml.

In particular, I think that a mismatch between the XML encoding declaration and the MIME charset (and the XHTML meta tag) should be some kind of weak Reportable User Error: people who don't want to accept transcoded text which has been mislabelled should have some kind of user option to report the error or abort.

application/xml for safety
text/xml for reach

Rick Jelliffe
Academia Sinica Computing Centre
Taipei, Taiwan

From simonstl at simonstl.com Tue Apr 6 16:46:55 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:11:03 2004
Subject: Webdeveloper article on MS-XML in Office 2000....
In-Reply-To:
Message-ID: <199904061446.KAA13790@hesketh.net>

At 07:23 AM 4/6/99 -0500, Buss, Jason A wrote:
>No one (that I can think
>of) WANTS Microsoft to implement their own 'flavor' of XML. The whole point
>of XML (I gathered) was so that anyone's browser (or application) could open
>and read ranting.xml without fear of not having the "right" software to read
>the document.

As far as I've seen (and I've avoided Office 2000 as much as possible) MS-XML is just another XML-based format. I think they combined the worst of HTML with the worst of XML in the particular way they implemented it, but they have their own way of doing things.

The hardest issue I see with MS-XML is convincing people that there is more to XML than this particular implementation.
Getting them to move from bloated documents full of extra junk to more streamlined XML documents taking advantage of style sheets (not just style elements) and meaningful content structures will probably become more difficult as a result of this product.

And that, I think, is exactly how certain companies want it.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

From keshlam at us.ibm.com Tue Apr 6 16:59:40 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun 7 17:11:03 2004
Subject: DOM questions
Message-ID: <8525674B.00523315.00@D51MTA03.pok.ibm.com>

>Can an attribute have more than one entity reference as a child?

Sure. Consider

Note that parsers are allowed to expand entity references by replacing them with their values, which makes round-tripping impossible, but makes accessing the content easier -- pick the parser, or parser settings, that suit your needs.

>What should a DOM implementation behave like if a DOM user appends to
>an Attr node an unexpanded entity reference that contains Element nodes
>as child? In other words is it true that if an entity
>reference is appended to an Attr node, it can contain only Text nodes as
>children?

A parsed entity, in other words. I _think_ the answer is that parsed entities aren't acceptable as Attr values per XML specs, and while I don't think the DOM spec explicitly protects you against this I don't think implementations are required to do anything useful either.

>Can an Attr node have non null value and, at the same time,
>no Text children?

Sure.
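On the earlier point that a parser may replace entity references with their values: a minimal sketch using Python's standard `xml.dom.minidom`, which builds on expat and expands internal entities in attribute values while parsing. The document, the entity names and the `attribute_value` helper are made up for illustration, and other DOM implementations may instead keep EntityReference nodes.

```python
from xml.dom import minidom

# Hypothetical document: two internal entities referenced from one attribute.
SOURCE = """<!DOCTYPE greeting [
  <!ENTITY first "Hello">
  <!ENTITY second "world">
]>
<greeting text="&first;, &second;"/>"""


def attribute_value(xml_text: str, name: str) -> str:
    """Parse xml_text and return the named attribute of the root element."""
    doc = minidom.parseString(xml_text)
    return doc.documentElement.getAttribute(name)


if __name__ == "__main__":
    # The references are substituted during parsing, so the original
    # "&first;, &second;" spelling cannot be recovered from the DOM.
    print(attribute_value(SOURCE, "text"))
```

With this parser the original spelling of the attribute is gone after parsing, which is the round-tripping cost mentioned above.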
>When cloning an Attr node with deep set to false, does the cloned Attr
>node retain the attribute value?

No. The definition of shallow clone is that children are not copied. However, shallow-cloning an Element node will DEEP-clone the attributes on that element.

(Now, if your question is whether there is such a thing as a shallow clone of an entity reference... I think that may be implementation dependent. My DOM keeps references in synch with their definitions, so shallow or deep clone of EntityReference wind up being the same thing.)

>So, if the original attribute had its value split among several Text
>child nodes, the cloned attribute will contain only one text child holding the attribute value?

If you do a deep clone, you will copy _all_ the children, exactly as they are. If you do a shallow clone, you will copy _none_ of the children.

There is currently no Attr.normalize() operation. I've lobbied for that to be introduced in DOM Level 2.

>What should the getOwnerDocument method return for an Entity node
>(or for a Notation or for a DocumentType), as no createX method is
>specified in the Document interface?

DTD support is essentially incomplete in DOM Level 1. Don't expect it to be both portable and useful until Level 2, if then. Most of us are either advising folks not to try, providing nonportable extensions to work around the gap (such as the factory methods which you noticed are missing), or both.

As another illustration of the problem, note that DOM Level 1 specifies the behavior of default attributes, then provides nowhere to store the information describing those defaults.

(Basically, the whole DocumentType thing was deferred in the hope that schemas would appear relatively quickly. They haven't, and there still seems to be some fear that any stopgap solution written into the spec would be both incompatible with schemas and impossible to get rid of.)

>From the DOM specs, the getOwnerDocument method should return null for a
>Document node.

True.
Don't Ask. This bounced back and forth; the point-to-self folks lost the final volley. It's not a big deal either way.

______________________________________
Joe Kesselman / IBM Research
Unless stated otherwise, all opinions are solely those of the author.

From DuCharmR at moodys.com Tue Apr 6 17:05:30 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:11:03 2004
Subject: Webdeveloper article on MS-XML in Office 2000....
Message-ID: <84285D7CF8E9D2119B1100805FD40F9F2550F5@MDYNYCMSX1>

>Has anyone read the latest article on XML implementation in Office 2000?

I think that a better name than "MS-XML" would be "XRTF," since it looks like an XML app whose goal is to replace RTF. Any replacement for "rich" text "format" that conforms to a real DTD that we can all find somewhere is a Good Thing, but as Simon pointed out (I think), the problem with the article is that it doesn't put this MS-XML into perspective as one XML application among many out there.

Bob DuCharme
www.snee.com/bob
see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall.
From Mark.Birbeck at iedigital.net Tue Apr 6 17:08:52 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:11:03 2004
Subject: Webdeveloper article on MS-XML in Office 2000....
Message-ID:

Could look at it another way and say that now anyone can create a file that could be edited or viewed by MS tools. For example, you could have a database of information that is exported as XML and then pushed through one stylesheet to make it viewable by MS PowerPoint, and another to make it editable by MS Word. And of course there's no reason why you couldn't use another stylesheet to edit the relevant data in a Lotus spreadsheet. None of these applications need even exist on the web server. (I thought open file formats and structures were one of the goals of XML.)

BTW, it would save a lot of time if someone came up with a stylesheet that transforms everything into a conspiracy theory. I could then save space on my hard-drive by just storing the original news item, and then just use the stylesheet to 'view' the conspiracy theory when I needed to. ;)

Regards,

Mark

> -----Original Message-----
> From: Simon St.Laurent
> Sent: 06 April 1999 15:50
> To: Buss, Jason A; 'xml-dev@ic.ac.uk'
> Subject: Re: Webdeveloper article on MS-XML in Office 2000....
>
> At 07:23 AM 4/6/99 -0500, Buss, Jason A wrote:
> >No one (that I can think of) WANTS Microsoft to implement their own 'flavor' of XML. The whole point of XML (I gathered) was so that anyone's browser (or application) could open and read ranting.xml without fear of not having the "right" software to read the document.
>
> As far as I've seen (and I've avoided Office 2000 as much as possible) MS-XML is just another XML-based format. I think they combined the worst of HTML with the worst of XML in the particular way they implemented it, but they have their own way of doing things.
>
> The hardest issue I see with MS-XML is convincing people that there is more to XML than this particular implementation. Getting them to move from bloated documents full of extra junk to more streamlined XML documents taking advantage of style sheets (not just style elements) and meaningful content structures will probably become more difficult as a result of this product.
>
> And that, I think, is exactly how certain companies want it.
>
> Simon St.Laurent
> XML: A Primer
> Sharing Bandwidth / Cookies
> http://www.simonstl.com
From richard at cogsci.ed.ac.uk Tue Apr 6 17:09:44 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:11:03 2004
Subject: XML Torture Test: Parsers Fail
In-Reply-To: Elliotte Rusty Harold's message of Mon, 5 Apr 1999 10:06:39 -0500
Message-ID: <199904061509.QAA14986@stevenson.cogsci.ed.ac.uk>

> However RXP
> reports some warnings that do not appear to be errors, and missed some
> problems involving the lack of encoding declarations in the text
> declarations in an earlier version that xml4j 2.0.4 (but not 1.1.14) picked
> up.

These have now been fixed.

Not complaining about the lack of encoding declaration is indeed a bug in RXP, unless it's a bug in the XML standard :-) I hadn't noticed that the encoding declaration is required in a text declaration although it's optional in an XML declaration. (Unlike the version information, which is the other way round.) Can someone tell me the rationale for this?

-- Richard

From simonstl at simonstl.com Tue Apr 6 17:36:12 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:11:03 2004
Subject: Webdeveloper article on MS-XML in Office 2000....
In-Reply-To:
Message-ID: <199904061535.LAA15244@hesketh.net>

At 04:09 PM 4/6/99 +0100, Mark Birbeck wrote:
>Could look at it another way and say that now anyone can create a file
>that could be edited or viewed by MS tools. For example, you could have
>a database of information that is exported as XML and then pushed
>through one stylesheet to make it viewable by MS PowerPoint, and another
>to make it editable by MS Word. And of course there's no reason why you
>couldn't use another stylesheet to edit the relevant data in a Lotus
>spreadsheet. None of these applications need even exist on the web
>server. (I thought open file formats and structures were one of the goals
>of XML.)

You certainly _can_ do it, but it'll be a pain in the neck, given the way they seem to have written their code. XSL might ease the pain of converting my 'meaningful' XML into their 'MS-XML' XML, but I don't think it'll be pretty or exciting.

Microsoft seems to have asked "what's the easiest way to dump our application structures into XML?" rather than "how can we best represent this information in XML?" It's their choice, certainly - but it's also my choice not to be limited by their decision and to encourage others to look for better alternatives.

>BTW, it would save a lot of time if someone came up with a stylesheet
>that transforms everything into a conspiracy theory. I could then save
>space on my hard-drive by just storing the original news item, and then
>just use the stylesheet to 'view' the conspiracy theory when I needed
>to. ;)

Corporations are conspiracies to make a profit. Where's the surprise there? It's just that certain corporations have found ways to make profits that may not necessarily be to the benefit of the larger community.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com
From martind at netfolder.com Tue Apr 6 17:52:00 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:03 2004
Subject: Webdeveloper article on MS-XML in Office 2000....
In-Reply-To:
Message-ID: <001401be803f$8d1dfbc0$f27e8bcf@total.net>

Hi Jason,

I think that this was obvious since the beginning. XML is like SGML: it lets you create your _own_ language. So nothing prevents Corel from creating its own language, Microsoft from doing so, etc... HTML is different; it is an application or domain-specific language. Even if imperfect, it is more standard because the vocabulary is already created and defined. But XHTML opens some doors....

I could say however that even if Microsoft is proprietary, even if Corel XML could also be proprietary, it is an improvement on what we had. Why? Now I can use a style or transformation language to convert, manipulate, display these documents. So, I can use DSSSL, XSL, omnimark, etc.. languages to manipulate the HTML/XML output into something else. This is, in my own view and opinion, an improvement on what we had. I never expected Microsoft to use an XML standard (is there any, anyway?) for its office documents. I am too old a monkey to believe in Santa Claus :-)))

But now office documents can be manipulated with style and transformation languages, and this is good news to me because we can now transform office documents into any desired document architecture and still benefit from the user-friendly interface Office products have (and so do Corel products). I know by experience that it is hard to convince end users to use other tools than the ones they like.
And they like the way word processors work.

Guys, why don't we take this opportunity to show WebReview people that style and transformation languages like DSSSL, PERL_XML, XSL could be used to obtain what we want. This would be more constructive than playing the national sport: Microsoft bashing, sit, sip a beer and do nothing. Come on, this is in fact an opportunity; we can show that going from the previous format to HTML/XML is a new opportunity for text manipulation and transformation. But I know, this means: stop putting the blame on someone else's shoulders, stop drinking beer until the job is finished and finally _do_ something. A simple sentence: _just_ _do_ _it_!

Anyway, thanks for the info, even if this info does not help us in any way. Except to give more fuel to the government attorney. Don't worry, Microsoft cannot do whatever they want anymore. But we have to use our brains more too. And I know criticism is easier than _doing_ things.

I am playing a bit now to create DSSSL scripts that transform a Word output document into a different document architecture. In the process, I learned a lot, especially that it is now a lot easier than when the format was Word's proprietary format. If Corel and Lotus do the same, it will be easier to convert a document created in one editor to another.

(PS: this is not aimed at you, Jason; you just awoke a sleeping monster.)

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Buss, Jason A
Sent: Tuesday, April 06, 1999 8:24 AM
To: 'xml-dev@ic.ac.uk'
Subject: Webdeveloper article on MS-XML in Office 2000....

Has anyone read the latest article on XML implementation in Office 2000?

http://webreview.com/wr/pub/1999/04/02/edge/index.html?wwwrrr_19990402.txt

What does the STANDARDIZATION of XML have to do with enforcing a software standard?
If Lotus and Corel and all these other office suite vendors adopt XML as the standard reads, and Microsoft maims XML implementation so that an XML doc saved in Office 2000 can't be read by other vendors' applications, does it not become clear who is in the wrong here?

No one (that I can think of) WANTS Microsoft to implement their own 'flavor' of XML. The whole point of XML (I gathered) was so that anyone's browser (or application) could open and read ranting.xml without fear of not having the "right" software to read the document.

I have Office (I have no choice, since it's the "standard") but when I have to share something, if I can't fit it in Outlook (another "standard") then I attach a .txt file. I don't have to worry about anyone's application not supporting that. If it's too big, or needs graphics, then I post it on the web or intranet, in HTML 3.2 (no DHTML or stinkin' BLINK tags) and forget about it, cuz I know no one should have trouble reading it.

Just like the guy in this article. Why not apply the same logic to his documents that he does to his webpages? Then he doesn't have to chastise anyone (like anyone using Linux) about not using Word.

Ok, I feel a bit better now....

Jason A. Buss
Single Engine Technical Publications
Cessna Aircraft Co.
jabuss@cessna.textron.com

"I don't dislike Microsoft products... I still use Flight Simulator religiously..."
From DuCharmR at moodys.com Tue Apr 6 18:09:50 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:11:03 2004
Subject: SUMMARY: regexp lib for java xml development?
Message-ID: <84285D7CF8E9D2119B1100805FD40F9F2550F6@MDYNYCMSX1>

Many thanks to everyone who made suggestions.

http://meurrens.ml.org/ip-Links/java/regex/ takes advantage of too many annoying web page tricks, but it does offer a comprehensive list of Java regexp packages and is more up-to-date than http://home.worldcom.ch/~jmlugrin/RegExSummary.html (last updated 5/97), which I had been working from.

The gnu package at http://www.cacas.org/~wes/java/ is small, lean, and I got it to work, so I'll probably continue with that.

I originally mentioned that I couldn't get the oroinc one to work. According to the top of http://www.oroinc.com/downloads, their main distribution doesn't work with Java 1.2. The same page provides a link to a version that does. Several other packages out there build on this one.

The evaluation copy of http://javaregex.com/ outputs a message about being an evaluation copy, and in my "lazy evaluation" mode, I was just redirecting my output to a file in which I didn't need that message. If I didn't see anything else I liked, my next planned step was to create a proper output stream for my program's output, at which point I wouldn't have this problem anymore. $10 certainly seems reasonable.

Bob DuCharme
www.snee.com/bob
see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall.
From chris at w3.org Tue Apr 6 20:44:59 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:03 2004
Subject: IE5.0 does not conform to RFC2376
References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org> <3701411C.784A1D4A@eng.sun.com> <37076818.54025D7B@w3.org> <3707C75D.C303FC59@eng.sun.com> <370807CC.CC9F0E1D@w3.org> <3708314B.B98240BE@eng.sun.com>
Message-ID: <370A2606.EB895110@w3.org>

David Brownell wrote:

> > > > > > What this RFC appears to do is remove author control over correctly
> > > > > > labelling the encoding, and ensure that most if not all XML documents
> > > > > > get incorrectly labelled as US-ASCII.
> > > > >
> > > > > Not at all. The best default MIME content type for all web
> > > > > servers is "application/xml".
> > > >
> > > > Why? Do you consider anything not written in US-ASCII as a text
> > > > document? I think the Unicode Consortium would disagree with you there.
> > >
> > > No, and that's not what I said:
> >
> > But it is the implication of your argument.
>
> How could it imply that?

Because you seemed to be advising not using text/xml for anything not in US-ASCII.

> I didn't even talk about what "text" is,
> only about what MIME guarantees. And MIME only talks about what some
> specific content/media type categories mean, not about what "text" is.

But it talks about what text/* is ....

> See RFC 2046 and the discussion in section 4.1.2 for further information.
> It says eight bit or multibyte encoded "text/*" "MUST" use a "charset=..."
> property, which you seem to dislike; perhaps you were unaware that MIME
> has fundamental constraints in this area.

MIME actually need not have those constraints; *email* has those constraints (although increasingly it does not, in practice). HTTP is always 8-bit clean. I agree that the MIME RFCs have steadfastly tried to pretend that MIME is an email-only thing.

Individual text/whatever registrations can override the generic methods of the text/* class, as for example the text/html registration does.

> RFC 2376 is being compatible
> with this fundamental Internet standard, which IMHO is the right idea.

Whilst making it incompatible with the fundamental W3C Recommendation, which is IMHO the wrong idea.

> > > For a single world-wide default; that's easily understood by overworked,
> > > underpaid, often untrained sysadmins; and hence is NOT error prone (!!),
> > > there's a simple answer that's guaranteed to work right everywhere that
> > > pays more than lip service to industry standards, and hence is "best".
> > > Namely, that servers report XML documents as "application/xml".
>
> That requires _no conclusions_ about what is or is not "text".
> It only says that encoded text is most likely to be dealt with in
> the correct way if people label XML text as "application/xml".

You have asserted this several times but not actually demonstrated it. I pointed out that the fundamental constraint on correct handling is whether an application understands a particular encoding, not how that encoding labelling is transmitted (although a method that is persistent across local copies is preferable to one that is not).

So "guaranteed to work right everywhere" is not, in fact, a guarantee at all.

--
Chris
From chris at w3.org Tue Apr 6 20:45:30 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:04 2004
Subject: XML encoding labels and RFC2376 (was Re: IE5.0 does not conform to RFC2376)
References: <001c01be7f8c$3c5bf100$4ef96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <370A32E2.C3404352@w3.org>

Rick Jelliffe wrote:
>
> From: Chris Lilley
> >In other words, automatic detection based on unlabelled content.
> >I believe that this is a bad thing, because there is always the
> >possibility (quite high) of getting it wrong.
> >
> >The encoding declaration, on the other hand, is not autodetection in
> >that sense, it is a label. A very small amount of autodetection has to
> >be done in order to be sure that the label has been read, that is all
> >(ie, is this UTF-16 or is this an encoding where ASCII is represented
> >as ASCII).
>
> In academic material, this is called "codeset announcement"

Yes, and it is also called a "designating sequence" in ISO-2022 but hey, I didn't want to go there.

> The term "autodetection" does give people the idea
> that guessing is involved.

Yes. That was the sense in which I was using it. I agree that this is a bad concept to promote.

> This is important, because if developers think that autodetection
> means guessing rather than codeset announcement, they may be tempted to
> guess encodings without alerting users that something seems strange:

Right, and this would be real bad.
In contrast, the XML encoding declaration is a real declaration, just like

Message-ID: <370A5B4A.E34F029@w3.org>

Rick Jelliffe wrote:
>
> From: Chris Lilley
>
> >An alternative method for achieving the same result is to use a filter
> >(this can be done in Apache and in Jigsaw) which automatically emits the
> >correct charset parameter based on reading the encoding declaration in
> >the XML instance.
>
> I think this is the approach that, ultimately, we all are hoping will be
> deployed.
>
> We have a variant of this at our site: when serving XHTML,

Well, and good luck to you making it conform to all these requirements *and* be processed correctly (for a given value of correct) by today's HTML "implementations".

> one must
>
> * make sure the HTML meta tag is correct (for HTML conformance)
> * make sure the XML encoding is correct (for XML conformance)
> * make sure the MIME charset is correct (for MIME conformance)
>
> Three chances to get something wrong! (And not forgetting that some
> HTML editors push the meta tags or PI around, so you may end up with
> duplicated tags relating to character set inside the document.)

So, yes. One needs to be made the master and the others derived from it, so that any derived duplicates can be deleted.

> Given that an XML processor may transcode the document without knowing
> the meanings of the elements (i.e., that the meta tag means something),
> the XML encoding has to have priority over the HTML meta tag value.

Yes

> And
> given that proxies can transcode text/* files without knowing what
> kind of text it is (i.e., that it is XML, and so has a label), the MIME
> header has to have priority over the XML header PI.

But, given that documents can be stored locally (this is, I think, still the 99% case for document authoring for example), then one can equally show that the MIME charset parameter has to be derived from the XML encoding declaration.
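Deriving the charset parameter from the document's own label could look roughly like the sketch below. It is not the Apache or Jigsaw filter mentioned in this thread; the function names are hypothetical, only the first bytes of the entity are examined, and UTF-32 and EBCDIC families are not handled. When no label is found, no charset is emitted at all, following the point made earlier that no charset parameter is better than a wrong one (RFC 2376 gives text/xml a us-ascii default in that case, which is the behaviour under dispute here).

```python
import re
from typing import Optional

# Matches the encoding pseudo-attribute of an XML declaration at the very start
# of the entity, for encodings in which ASCII characters are single ASCII bytes.
_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*(?:"[^"]*"|\'[^\']*\')'
    rb'\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(head: bytes) -> Optional[str]:
    """Return the encoding the document announces about itself, if any."""
    if head.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "UTF-16"               # byte order mark announces UTF-16
    has_utf8_bom = head.startswith(b"\xef\xbb\xbf")
    if has_utf8_bom:
        head = head[3:]               # look for the declaration after the BOM
    match = _DECL.match(head)
    if match:
        return match.group(1).decode("ascii")
    return "UTF-8" if has_utf8_bom else None


def content_type_for(head: bytes, media_type: str = "text/xml") -> str:
    """Build a Content-Type value whose charset mirrors the document's label."""
    encoding = declared_encoding(head)
    if encoding is None:
        return media_type             # unlabelled: send no charset parameter
    return f"{media_type}; charset={encoding}"


if __name__ == "__main__":
    sample = b'<?xml version="1.0" encoding="Shift_JIS"?><doc/>'
    print(content_type_for(sample))
```

Because the header is computed from the bytes on disk, a locally saved copy and the served copy carry the same label.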
Alternatively, transcode away but remember to alter the encoding declaration. (This was my original proposal, although now I think that auto-generating the MIME data from the document is the best approach.)

> I think that is the
> logical order: generic operations must be allowed.

I guess I think that loading and saving documents is a generic process too.

Quick question - how many transcoding proxies are on your current machine? How many on your server? Now, how many locally stored XML documents are on your current machine? How many on your server? Thought so.

> However, it is all spoiled if there are systems which corrupt the
> labels: for example by rewriting the charset parameter incorrectly. It
> is far better to send the XML file without a charset parameter than to
> send it with a wrong one.

Yes, that was also my point - given that XML 1.0 Rec already has an excellent description of how to read the encoding declaration, and given that (as has been pointed out) it already has that machinery to deal with application/xml, then use that declaration as the primary label. For consistency and robustness, I would make the server also send that information again as a MIME charset parameter (in the case of text/xml).

> >And if the charset parameter is present, then it should say the same
> >thing as the encoding declaration. The best way to ensure this is to
> >treat the XML encoding declaration as the primary metadata resource and
> >to programmatically derive the charset parameter from this; greater
> >robustness is at once achieved and also harmonisation of the MIME and
> >XML labelling.
>
> Yes, with the exception that the XML encoding PI could itself be derived
> from internal data (e.g. in XHTML).

Well, I will declare that one out of scope except to note a problem - if there are multiple incompatible META tags - a problem you pointed out - then which of them do you use?

> The problem isn't really "which should have precedence?"
(because all > systems will break somewhere, given the state of webservers and current > awareness) as much as it is "how can we move the Web towards safety and > interoperability, where the markup now available at each stage is made > available to the next?" I think the current XML media types for MIME > give the appropriate policy for charset preference (transcodability is > one property of text/* which application/* must not have), but, as > Chris is pointing out, mechanisms to set the MIME charset parameter from > XML (and to overwrite it on delivery too) have yet to be put into > place. I am working on that. Jigsaw is a nice system to prototype on, and Apache is a good target given its >50% market share. > >However, I will point out that it is the consensus of the XML 1.0 > >Recommendation that I am respecting - and that the RFC does not, by > >altering the meaning of the default encoding. It could have been > >harmonised with the XML REC; it was not. > > I think the XML SIG and WG pretty much all had consensus on the RFC at > the end, in full knowledge of XML 1.0. But I think many of us came out > of it thinking that it is safer to use application/xml. So, the text/xml registration could have said "do not use this media type for the following reasons". > In particular, I think that a mismatch between the XML encoding > declaration and the MIME charset (and the XHTML meta tag) should be some > kind of weak Reportable User Error: On the contrary, I think it should be a fatal error. You can't parse the document if you don't know what a character is. -- Chris xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Tue Apr 6 21:10:16 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:11:04 2004 Subject: XML Torture Test: Parsers Fail In-Reply-To: <004001be7f81$9cfce240$710215ac@opentext.com> References: Message-ID: At 12:30 PM -0400 4/5/99, Kar Yan Ng wrote: >Hello, it seems that the files refer to a bunch of external >dtd's. Could you put these into a zip file so that we don't >have to download all the dtd's one by one. > Such a file is now available from ftp://ftp.metalab.unc.edu/pub/languages/java/javafaq/players.tar.gz I've fixed the problem with the non-deterministic content model. I'll be updating that soon with a new file that actually uses all 1000+ entities, as opposed to the current version which defines 1000+ entities but only uses one. I'll also fix the problems with multiple definitions of some entities. After my most recent changes, I now feel reasonably confident that this file is in fact valid. xmlproc and Java Project X have both been reported as validating it. IBM's xml4j trips over the relative URLs used in entity references. I still don't know what IE5 and the DCXML parsers are tripping over, but there's definitely something. One parser failed simply because of the number of entities, and the resulting overflow of Solaris's maximum number of file descriptors per process.
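One way a resolver could stay clear of a per-process descriptor limit, sketched as an assumption about this kind of failure rather than a description of any parser named in this thread: read each external entity fully and close it before opening the next. `load_entities`, `demo`, and the file names are hypothetical stand-ins.

```python
import os
import tempfile

def load_entities(paths):
    """Read each entity file in turn, holding at most one descriptor open."""
    texts = {}
    for path in paths:
        # The with-block closes the file before the next one is opened,
        # so the entity count never approaches the descriptor limit.
        with open(path, encoding="utf-8") as handle:
            texts[os.path.basename(path)] = handle.read()
    return texts

def demo(count=1200):
    """Create `count` small stand-in entity files and load them all."""
    with tempfile.TemporaryDirectory() as folder:
        paths = []
        for index in range(count):
            path = os.path.join(folder, f"player{index}.xml")
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(f"<PLAYER id='{index}'/>")
            paths.append(path)
        return load_entities(paths)
```

The trade-off is memory: every replacement text is held as a string, which is usually cheaper than a thousand open streams.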
+-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ From b.laforge at jxml.com Tue Apr 6 21:24:05 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:11:04 2004 Subject: Webdeveloper article on MS-XML in Office 2000.... Message-ID: <000601be8062$a19e2980$46026982@thing1> >Corporations are conspiracies to make a profit. Where's the surprise >there? It's just that certain corporations have found ways to make profits >that may not necessarily be to the benefit of the larger community. Sounds like we need a new business model. Or a smarter community. Hey, isn't that what the web is supposed to enable??? :-) Bill xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mike at DataChannel.com Tue Apr 6 21:54:37 1999 From: mike at DataChannel.com (Mike Dierken) Date: Mon Jun 7 17:11:04 2004 Subject: XML Torture Test: Parsers Fail Message-ID: <8EAE75D3D142D211A45200A0C99B60236FD18C@ZEUS> The latest internal version of DCXML (from DataChannel) correctly handles this file and its external entities. The beta 2 version on our web site may have problems; I don't know one way or the other. Mike D DataChannel -----Original Message----- From: Elliotte Rusty Harold [mailto:elharo@metalab.unc.edu] Sent: Tuesday, April 06, 1999 12:07 PM To: xml-dev@ic.ac.uk Subject: Re: XML Torture Test: Parsers Fail At 12:30 PM -0400 4/5/99, Kar Yan Ng wrote: >Hello, it seems that the files refer to a bunch of external >dtd's. Could you put these into a zip file so that we don't >have to download all the dtd's one by one. > Such a file is now available from ftp://ftp.metalab.unc.edu/pub/languages/java/javafaq/players.tar.gz I've fixed the problem with the non-deterministic content model. I'll be updating that soon with a new file that actually uses all 1000+ entities, as opposed to the current version which defines 1000+ entities but only uses one. I'll also fix the problems with multiple definitions of some entities. After my most recent changes, I now feel reasonably confident that this file is in fact valid. xmlproc and Java Project X have both been reported as validating it. IBM's xml4j trips over the relative URLs used in entity references. I still don't know what IE5 and the DCXML parsers are tripping over, but there's definitely something.
One parser failed simply because of the number of entities, and the resulting overflow of Solaris's maximum number of file descriptors per process. +-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andyclar at us.ibm.com Tue Apr 6 22:00:03 1999 From: andyclar at us.ibm.com (andyclar@us.ibm.com) Date: Mon Jun 7 17:11:04 2004 Subject: XML Torture Test: Parsers Fail Message-ID: <8725674B.006DB985.00@d53mta06h.boulder.ibm.com> Elliotte Rusty Harold wrote: > > >>Without intending to do so, I have devised an XML document that exposes > >>many problems in almost all XML validating parsers and non-validating > >>parsers that resolve external entity references. You will find this > >>torture test at > > http://metalab.unc.edu/xml/examples/players/index.xml I tried recreating the failures of Elliotte's torture test using XML4J 2.0.4 and was not able to produce a failure. I tried the validating SAX, validating DOM, and TX-compatibility parsers and they all responded with no errors or warnings. After adding some further "validity" checks to our current internal build of the parser, the XML4J parser now reports the following: [Warning] index.xml:7:3: Element, "PLAYER", refers to undeclared element, "SV", in content model [Error] AllenWatson.xml:10:13: Element, "SV" is not declared in the DTD These additional checks will be in the next release of the parser. -- Andy Clark * IBM, JTC - Silicon Valley * andyclar@us.ibm.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at goon.stg.brown.edu Tue Apr 6 23:01:08 1999 From: richard at goon.stg.brown.edu (Richard Goerwitz) Date: Mon Jun 7 17:11:04 2004 Subject: XML Torture Test: Parsers Fail References: <8725674B.006DB985.00@d53mta06h.boulder.ibm.com> Message-ID: <370A75A8.C6DE98F8@goon.stg.brown.edu> andyclar@us.ibm.com wrote: > I tried recreating the failures of Elliote's torture test using > XML4J 2.0.4 and was not able to produce a failure. Although it's not true in your case, many failures may have nothing to do with the parsers per se, but rather with system resources. Validating ERH's files may require opening enough file descriptors that either the process hits its own limit - or the system-wide maximum gets reached. -- Richard Goerwitz PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0 For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andyclar at us.ibm.com Tue Apr 6 23:45:44 1999 From: andyclar at us.ibm.com (andyclar@us.ibm.com) Date: Mon Jun 7 17:11:04 2004 Subject: Refactoring SAX 1.0 Message-ID: <8725674B.0077674A.00@d53mta06h.boulder.ibm.com> In working on the IBM XML Parser in Java, we have found that a number of classes from the SAX 1.0 API were useful to all of our XML parsers, regardless of the higher level parser API. For example, entity resolution, error handling, and parse location are common to all XML parsers. Even parsing a document from an input source is the same. We think that it would be extremely beneficial to XML parser users to break out the common classes into a separate package. In this way, users have consistent access to the XML parser and basic parser features even when the results of parsing are different. For example, parsing and error handling are common whether the result of the parse is an event stream, tree structure, JavaBean, transaction process, etc. Since there is ongoing work to define the SAX2 interfaces, it seems that now would be the perfect time to address whether refactoring the SAX 1.0 classes makes sense. Have other users encountered these issues and is there interest in refactoring the SAX 1.0 classes in order to take advantage of the commonality? -- Andy Clark * IBM, JTC - Silicon Valley * andyclar@us.ibm.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From srn at techno.com Tue Apr 6 23:48:50 1999 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:11:05 2004 Subject: Metastructures 1999 Message-ID: <199904062147.QAA01662@bruno.techno.com> GCA Metastructures 1999 Call for Paper and Tutorial Proposals August 17-18, 1999 Co-chairs: Carla Corkern and Steve Newcomb The Graphic Communications Association (GCA) invites you to participate in the sixth annual Metastructures conference, which is one event in a set that includes the perennial XML Developers' Conference and various OASIS functions. It's our community's relaxed, summertime opportunity to get away and talk plainly about things that really matter to information owners and users, and their advisers and system integrators. Paper and tutorial proposals are due on or before Friday, May 21, 1999. They should be addressed to meta99@gca.org For more information: http://www.gca.org/conf/meta99 -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. srn@techno.com http://www.techno.com ftp.techno.com voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) 3615 Tanner Lane Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Wed Apr 7 00:56:34 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:11:05 2004 Subject: XML Torture Test: Parsers Fail References: <2F2DC5CE035DD1118C8E00805FFE354C0F3626A9@RED-MSG-56> Message-ID: <370A8FF1.5EB0A707@eng.sun.com> Chris, these aren't errors ... unless there are references to those entities (&baseball; and &season;) in the document, which is not currently done. If IE5 is treating those as errors, it shouldn't. - Dave Chris Lovett wrote: > > The problem appears to be in braves.dtd. You have the following: > > > > > and these DTD's exist - so you have general parsed entities pointing to DTD > information which is not right. > > Once these two lines are removed from braves.dtd everything loads fine in > IE5. > > -----Original Message----- > From: David Brownell [mailto:db@eng.sun.com] > Sent: Monday, April 05, 1999 11:27 AM > To: Elliotte Rusty Harold > Cc: xml-dev@ic.ac.uk > Subject: Re: XML Torture Test: Parsers Fail > > For some reason I've seen three followups to this note, but > not the original note ... > > Elliotte Rusty Harold wrote: > > > > >>Without intending to do so, I have devised an XML document that exposes > > >>many problems in almost all XML validating parsers and non-validating > > >>parsers that resolve external entity references. You will find this > > >>torture test at > > > > http://metalab.unc.edu/xml/examples/players/index.xml > > I ran Sun's parsers and they warned of a variety of redefined > entities ... for example, "&AaronSmall;" defined in two > files, both "athletics.dtd" and "diamondbacks.dtd". (See > the list below.) 
> > I did notice that this relies on correct interpretation of > relative URIs, which I know have been handled incorrectly > by at least two other "validating" parsers (perhaps not in > their current releases though). > > - Dave > > ** Warning, line 1, uri > http://metalab.unc.edu/xml/examples/players/diamondbacks.dtd > Using original entity definition for "&AaronSmall;". > ** Warning, line 21, uri > http://metalab.unc.edu/xml/examples/players/indians.dtd > Using original entity definition for "&JimPoole;". > ** Warning, line 16, uri > http://metalab.unc.edu/xml/examples/players/mariners.dtd > Using original entity definition for "&GlenallenHill;". > ** Warning, line 1, uri > http://metalab.unc.edu/xml/examples/players/marlins.dtd > Using original entity definition for "&AlexGonzalez;". > ** Warning, line 20, uri > http://metalab.unc.edu/xml/examples/players/padres.dtd > Using original entity definition for "&KevinBrown;". > ** Warning, line 11, uri > http://metalab.unc.edu/xml/examples/players/pirates.dtd > Using original entity definition for "&FreddyGarcia;". > ** Warning, line 9, uri http://metalab.unc.edu/xml/examples/players/reds.dtd > Using original entity definition for "&DennisReyes;". > ** Warning, line 2, uri > http://metalab.unc.edu/xml/examples/players/rockies.dtd > Using original entity definition for "&BobbyJones;". > ** Warning, line 33, uri > http://metalab.unc.edu/xml/examples/players/tigers.dtd > Using original entity definition for "&ScottSanders;". > ** Warning, line 22, uri > http://metalab.unc.edu/xml/examples/players/whitesox.dtd > Using original entity definition for "&MarkJohnson;". > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Wed Apr 7 01:34:24 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:11:05 2004 Subject: IE5.0 does not conform to RFC2376 References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org> <3701411C.784A1D4A@eng.sun.com> <37076818.54025D7B@w3.org> <3707C75D.C303FC59@eng.sun.com> <370807CC.CC9F0E1D@w3.org> <3708314B.B98240BE@eng.sun.com> <370A2606.EB895110@w3.org> Message-ID: <370A98D5.F2BFF4CF@eng.sun.com> Chris Lilley wrote: > > > See RFC 2046 and the discussion in section 4.1.2 for further information. > > It says eight bit or multibyte encoded "text/*" "MUST" use a "charset=..." > > property, which you seem to dislike; perhaps you were unaware that MIME > > has fundamental constraints in this area. > > MIME actually need not have those constraints; Debatable, though beside the point: it _does_ have them, and always has. Did so since before the web existed. > Individual text/whatever registrations can overide the generic methods > of the text/* class, as for example the text/html registration does. No, they need to be compatible with "text/plain" per the RFCs. Remember that old agents must be able to handle new MIME types, and if they do not understand the subtype ("whatever") they're allowed to treat all "text/whatever" like "text/plain". For example, by preserving only seven bits per character when there's no "charset=..." property. > > RFC 2376 is being compatible > > with this fundamental Internet standard, which IMHO is the right idea. 
> > Whilst making it incompatible with the fundamental W3C Recommendation, > which is IMHO the wrong idea. You've got it backwards: MIME predated HTTP and HTML (and W3C!) by a number of years, so it wasn't MIME which caused the confusion underlying this overlong discussion! The only incompatibility I know about is what someone wrote about what it means to transmit "text/html" over one specific transport medium (HTTP). > > > > For a single world-wide default; that's easily understood by overworked, > > > > underpaid, often untrained sysadmins; and hence is NOT error prone (!!), > > > > there's a simple answer that's guaranteed to work right everywhere that > > > > pays more than lip service to industry standards, and hence is "best". > > > > Namely, that servers report XML documents as "application/xml". > > > > That requires _no conclusions_ about what is or is not "text". > > It only says that encoded text is most likely to be dealt with in > > the correct way if people label XML text as "application/xml". For example, even if people remain confused by what the "text/*" registration means, there's much less confusion about what other ones mean, and you're less likely to see critical bits stripped out of such messages because the originator (server, etc) didn't label the type correctly. - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Wed Apr 7 01:53:49 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:11:05 2004 Subject: Refactoring SAX 1.0 References: <8725674B.0077674A.00@d53mta06h.boulder.ibm.com> Message-ID: <370A9D62.BF04ACFD@eng.sun.com> There aren't that many classes in SAX 1.0, and they can be used as-is without "refactoring" anything at all. And, importantly, without sacrificing compatibility. Or am I missing something in what you're suggesting? - Dave andyclar@us.ibm.com wrote: > > In working on the IBM XML Parser in Java, we have found that > a number of classes from the SAX 1.0 API were useful to all > of our XML parsers, regardless of the higher level parser > API. For example, entity resolution, error handling, and > parse location are common to all XML parsers. Even parsing > a document from an input source is the same. > > We think that it would be extremely beneficial to XML parser > users to break out the common classes into a separate package. > In this way, users have consistent access to the XML parser > and basic parser features even when the results of parsing > are different. For example, parsing and error handling are > common whether the result of the parse is an event stream, > tree structure, JavaBean, transaction process, etc. > > Since there is ongoing work to define the SAX2 interfaces, > it seems that now would be the perfect time to address whether > refactoring the SAX 1.0 classes makes sense. > > Have other users encountered these issues and is there > interest in refactoring the SAX 1.0 classes in order to take > advantage of the commonality? 
> > -- > Andy Clark * IBM, JTC - Silicon Valley * andyclar@us.ibm.com > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From srn at techno.com Wed Apr 7 01:56:00 1999 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:11:05 2004 Subject: Metastructures 1999 In-Reply-To: <14090.34773.868152.951396@localhost.localdomain> (message from David Megginson on Tue, 6 Apr 1999 18:17:21 -0400 (EDT)) References: <199904062147.QAA01662@bruno.techno.com> <14090.34773.868152.951396@localhost.localdomain> Message-ID: <199904062354.SAA01846@bruno.techno.com> GCA Metastructures 1999 Call for Paper and Tutorial Proposals August 17-18, 1999 I forgot to mention: it's in Montreal, Quebec, Canada again this year. Same hotel and everything. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Wed Apr 7 02:00:28 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:11:05 2004 Subject: Attribute-Value Normalisation References: <041201be7b91$7fd0b050$2096c9c2@mik-ppro.owl.co.uk> <00d101be7b9d$e63ea660$0300000a@cygnus.uwa.edu.au> Message-ID: <370A9EEB.DE7AE90B@eng.sun.com> James Tauber wrote: > > > > [...] > > > > > > Will this attribute be normalised if it contains any whitespace? > > If it is declared CDATA, the only whitespace normalization is that > carriage-returns, line-feeds and tabs are normalized to spaces (with a CR+LF > being normalized to only a single space) > > Only if it were *not* CDATA would multiple spaces be normalized to one. Also, an issue that matters to some folk, if you _need_ to have some whitespace (notably CR, LF, and TAB) not be normalized, you may put it in as a character reference. You _can_ have attribute values that have whitespace, but you need to work a bit at it ... :-) - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From colds at nwlink.com Wed Apr 7 03:52:14 1999 From: colds at nwlink.com (Chris Olds) Date: Mon Jun 7 17:11:05 2004 Subject: XML Torture Test: Parsers Fail Message-ID: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> I'm not so sure that IE5 is wrong in reporting an error (when unreferenced General Entities are DTD chunks). The XML REC says (in 4.3.2 "Well-Formed Parsed Entities") "An external general parsed entity is well-formed if it matches the production labeled extParsedEnt", which is an optional TextDecl [77] followed by 'content' [43]. Non-validating processors are not required to read external entities, but they are not forbidden to read them if they are not referenced. While I don't think this is necessarily the best choice, I think it is just that - an implementation choice. What have I missed? /cco xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From martind at netfolder.com Wed Apr 7 04:10:43 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:11:05 2004 Subject: About the DOM API, Message-ID: <000501be8095$63637f20$c2d42fd1@total.net> Hi, In our quest to find a usable way to interface modules through the DOM interface, here is what we found: On the Linux platform, the project most likely to succeed is ORBit (from Red Hat Labs). The project has enough funding and is progressing quite well. A further advantage is that the whole GTK+ project is based on CORBA objects implemented with the ORBit IDL compiler, which therefore share the same binary interface. If GNOME gains momentum on the Linux platform, we can say that ORBit will be the middleware of choice on Linux. Thus, we found that a C++ implementation that works on the Linux platform could be built from ORBit-generated code. Also, if Mozilla's XPCOM gets up to speed, the other alternative is XPCOM. XPCOM is very similar to DCOM and can, with a minimal amount of work, be derived from COM objects. The only modification is the C function that must be added to the project to get the class factory. Everything else is the same, so it is not expensive to support both XPCOM and COM. XPCOM could be very interesting for OS/2 platform support. What now remains is the Mac platform. Could anyone with Mac knowledge help? Conclusion: On Linux, an ORBit CORBA-based implementation has a better chance of working with other modules, because ORBit is the foundation for GTK+, itself the GNOME glue.
All ORBit modules have the same binary signature and can communicate with each other with very low overhead (near function-call speed). On Windows, a COM/XPCOM interface would likewise give the DOM interface a common binary signature. Regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com From elharo at metalab.unc.edu Wed Apr 7 05:37:32 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:11:05 2004 Subject: XML Torture Test: Parsers Fail In-Reply-To: <8725674B.006DB985.00@d53mta06h.boulder.ibm.com> Message-ID: At 1:58 PM -0600 4/6/99, andyclar@us.ibm.com wrote: >Elliotte Rusty Harold wrote: >> >> >>Without intending to do so, I have devised an XML document that exposes >> >>many problems in almost all XML validating parsers and non-validating >> >>parsers that resolve external entity references. You will find this >> >>torture test at >> >> http://metalab.unc.edu/xml/examples/players/index.xml > >I tried recreating the failures of Elliotte's torture test using >XML4J 2.0.4 and was not able to produce a failure. I tried the >validating SAX, validating DOM, and TX-compatibility parsers >and they all responded with no errors or warnings. > >After having done some additional "validity" checks in our >current internal build of the parser, the XML4J parser now >reports the following: > > [Warning] index.xml:7:3: Element, "PLAYER", refers to undeclared element, > "SV", in content model > [Error] AllenWatson.xml:10:13: Element, "SV" is not declared in the DTD > I've fixed the undeclared SV element in the latest version.
I introduced that problem while fixing the non-deterministic content model problem yesterday. XJParse 1.1.14 can now handle this document from the Web site but not when the files are loaded from the local hard drive where it still reports AllenWatson.xml not found. XJParse 2.0.4 seems to work from both the local hard drive and the Web site, but it's hard to tell since it doesn't report as much as XJParse 1.1.14. +-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Wed Apr 7 05:43:32 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:11:05 2004 Subject: XML Torture Test: Parsers Fail In-Reply-To: <370A8FF1.5EB0A707@eng.sun.com> References: <2F2DC5CE035DD1118C8E00805FFE354C0F3626A9@RED-MSG-56> Message-ID: At 3:51 PM -0700 4/6/99, David Brownell wrote: >Chris, these aren't errors ... unless there are references >to those entities (&baseball; and &season;) in the document, >which is not currently done. > >If IE5 is treating those as errors, it shouldn't. 
> >- Dave > > >Chris Lovett wrote: >> >> The problem appears to be in braves.dtd. You have the following: >> >> >> >> >> and these DTD's exist - so you have general parsed entities pointing to DTD >> information which is not right. >> >> Once these two lines are removed from braves.dtd everything loads fine in >> IE5. >> That does seem to be the problem. Once I fixed that, IE 5.0 could load the document from my local hard drive, but it still failed to load it from the Web site. I don't yet know why. I think what this whole mess is showing, given the widely varying problems with so many parsers, is that validation is not nearly as simple as it seems, especially when the validators are asked to handle large files. A couple of decades ago a lot of bugs were exposed in various compilers for various languages when the output of various program generators like lex and yacc were thrown at them. While these compilers could handle anything a human programmer was likely to write, they failed when faced with automatically generated code. The compilers made too many assumptions about what code looked like that weren't part of the language specs. I suspect we're seeing something like that here. These files and the DTDs containing the entity references were all created by a program that pulled data out of a database. Only the basic structure of the document was designed by hand. Pouring a database into a custom designed XML vocabulary is not unusual, but programmatically creating the entity references does seem to be unusual. I worry about what's going to happen when we start writing programs that not only generate the data and entity references but also the vocabulary. We're likely to uncover even more bugs and underlying assumptions about what XML files look like. This one document uncovered verifiable, repeatable problems in four separate independently developed parsers. What's interesting is that these were four completely different problems. 
We may be able to learn something from the more formal, verifiable approach to compiler design that's taken hold over the last 20 years. We need to think about a more formal specification of XML, and perhaps provably correct parsers. At the very least there needs to be more connection between the spec and validating parsers. The BNF grammar is straightforward (though at least one parser doesn't seem to be relying on it) but the validity constraints are a mess.

The various schema proposals may present an opportunity to fix this. We should consider very carefully whether a given schema grammar can be easily (preferably automatically) translated into a parser for schemas based on the grammar and documents based on particular schemas.

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML: Extensible Markup Language (IDG Books 1998)                   |
| http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/     |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/   |
| Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/      |
+----------------------------------+---------------------------------+

From murata at apsdc.ksp.fujixerox.co.jp Wed Apr 7 07:40:53 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:11:05 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <370A0194.C175B421@w3.org>
Message-ID: <199904070539.AA00211@archlute.apsdc.ksp.fujixerox.co.jp>

Chris,

> It's good to see a concrete proposal. On the other hand, relying on a
> complex convention of filename suffixes is problematic:

I understand your concern. However, Uchida-san's proposal is not an attempt to use convention instead of the charset parameter. It is intended to help provide the correct charset parameter. I agree that there are some side-effects which some people might object to.

> An alternative method for achieving the same result is to use a filter
> (this can be done in Apache and in Jigsaw) which automatically emits the
> correct charset parameter based on reading the encoding declaration in
> the XML instance. This can easily cache its results, and need not
> result in processing overhead on each request.

I strongly agree. This is the best approach. I sincerely hope that such an attempt will happen at W3C.

> > At *IETF*, the default of the charset parameter for text/HTML *is* 8859-1.
>
> Yes, which is different to the default for text/* - this demonstrates
> that it is possible to give a more specific rule for a particular
> registration.

Actually, in the case of HTTP MIME, the default of the charset parameter of text/* is always ISO-8859-1. In the case of real MIME, the default of the charset parameter of text/* is always US-ASCII. text/html is not an exception.
text/xml is an exception, since the default is always US-ASCII. This was recommended by the IESG.

> > It is going to be very difficult or
> > impossible, since HTTP and MIME people will disagree.
>
> I think you mean, HTTP and Mail(SMTP/IMAP/POP). MIME is used by both
> email and HTTP.

HTTP MIME is not quite the same as real MIME. There are many differences between the two.

> > There has been a lot of discussion about this issue. None of your arguments
> > are new to me. In fact, my original opinion was not so different from yours but
> > I have changed my mind during the discussion. For more about this, see the archive
> > of the XML SIG (around April and May of 1998).
>
> OK, I will check this out. I cannot of course discuss such material in
> this forum, however. Perhaps you could post your technical reasons for
> the change of direction here?

text/xml has to be consistent with HTTP and MIME. Autodetection or the use of META tags as the default of the charset parameter has been extensively discussed by HTTP people and MIME people. They strongly dissent.

> But, if it is not present,
> then the XML Rec says exactly what should happen;

Appendix F is non-normative. RFC2376 supersedes it, as intended by the XML WG. XML 1.0 clearly says:

"Rules for the relative priority of the internal label and the MIME-type label in an external header, for example, should be part of the RFC document defining the text/xml and application/xml MIME types. ... in particular, when the MIME types text/xml and application/xml are defined, the recommendations of the relevant RFC will supersede these rules."

By the way, now that RFC 2376 is published, XML 1.0 will be revised.

> careful wording which
> this RFC nullifies. Problems arise if an XML file is saved from the Web
> to a local filesystem, perhaps for further editing; the MIME charset
> information is lost. It could perhaps be stored in some way - but, there
> is already a standard way - the XML encoding declaration.
Since it is a standard way, RFC 2376 recommends that recipient programs rewrite encoding declarations.

> And if the charset parameter is present, then it should say the same
> thing as the encoding declaration.

This disallows code conversion by proxy servers. One could argue that proxy servers should rewrite encoding declarations. However, documents should not be rewritten for security reasons. Moreover, if we require different code conversion for different subtypes of text, there is not much hope for interoperability, especially because fallback to text/plain is required.

> The best way to ensure this is to
> treat the XML encoding declaration as the primary metadata resource and
> to programmatically derive the charset parameter from this; greater

If it is done when the document is stored in the WWW server, that is superb.

> However, I will point out that it is the consensus of the XML 1.0
> Recommendation that I am respecting - and that the RFC does not, by
> altering the meaning of the default encoding. It could have been
> harmonised with the XML REC; it was not.

RFC 2376 IS the consensus (it was not unanimous, though). It is based on really extensive discussion at the XML SIG and XML WG. My mail folder named text/xml has 687 e-mails ;-( Larry Masinter (the HTTP WG chair) and Martin Duerst (the I18N IG chair) were heavily involved. On the other hand, the appendix in XML 1.0 is merely informative and was meant to be replaced by the XML media type RFC.

Cheers,
Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clovett at microsoft.com Wed Apr 7 08:20:20 1999 From: clovett at microsoft.com (Chris Lovett) Date: Mon Jun 7 17:11:05 2004 Subject: XML Torture Test: Parsers Fail Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C0F3626A9@RED-MSG-56> The problem appears to be in braves.dtd. You have the following: and these DTD's exist - so you have general parsed entities pointing to DTD information which is not right. Once these two lines are removed from braves.dtd everything loads fine in IE5. -----Original Message----- From: David Brownell [mailto:db@eng.sun.com] Sent: Monday, April 05, 1999 11:27 AM To: Elliotte Rusty Harold Cc: xml-dev@ic.ac.uk Subject: Re: XML Torture Test: Parsers Fail For some reason I've seen three followups to this note, but not the original note ... Elliotte Rusty Harold wrote: > > >>Without intending to do so, I have devised an XML document that exposes > >>many problems in almost all XML validating parsers and non-validating > >>parsers that resolve external entity references. You will find this > >>torture test at > > http://metalab.unc.edu/xml/examples/players/index.xml I ran Sun's parsers and they warned of a variety of redefined entities ... for example, "&AaronSmall;" defined in two files, both "athletics.dtd" and "diamondbacks.dtd". (See the list below.) I did notice that this relies on correct interpretation of relative URIs, which I know have been handled incorrectly by at least two other "validating" parsers (perhaps not in their current releases though). 
- Dave

** Warning, line 1, uri http://metalab.unc.edu/xml/examples/players/diamondbacks.dtd
   Using original entity definition for "&AaronSmall;".
** Warning, line 21, uri http://metalab.unc.edu/xml/examples/players/indians.dtd
   Using original entity definition for "&JimPoole;".
** Warning, line 16, uri http://metalab.unc.edu/xml/examples/players/mariners.dtd
   Using original entity definition for "&GlenallenHill;".
** Warning, line 1, uri http://metalab.unc.edu/xml/examples/players/marlins.dtd
   Using original entity definition for "&AlexGonzalez;".
** Warning, line 20, uri http://metalab.unc.edu/xml/examples/players/padres.dtd
   Using original entity definition for "&KevinBrown;".
** Warning, line 11, uri http://metalab.unc.edu/xml/examples/players/pirates.dtd
   Using original entity definition for "&FreddyGarcia;".
** Warning, line 9, uri http://metalab.unc.edu/xml/examples/players/reds.dtd
   Using original entity definition for "&DennisReyes;".
** Warning, line 2, uri http://metalab.unc.edu/xml/examples/players/rockies.dtd
   Using original entity definition for "&BobbyJones;".
** Warning, line 33, uri http://metalab.unc.edu/xml/examples/players/tigers.dtd
   Using original entity definition for "&ScottSanders;".
** Warning, line 22, uri http://metalab.unc.edu/xml/examples/players/whitesox.dtd
   Using original entity definition for "&MarkJohnson;".

From richard at goon.stg.brown.edu Wed Apr 7 15:37:51 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:11:05 2004
Subject: XML Torture Test: Parsers Fail
References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com>
Message-ID: <370B5FA2.CCDE2159@goon.stg.brown.edu>

Chris Olds wrote:
>
> I'm not so sure that IE5 is wrong in reporting an error (when unreferenced
> General Entities are DTD chunks). The XML REC says (in 4.3.2 "Well-Formed
> Parsed Entities")
>
>> "An external general parsed entity is well-formed if it matches the
>> production labeled extParsedEnt", which is an optional TextDecl [77]
>> followed by 'content' [43]. Non-validating processors are not required to
>> read external entities, but they are not forbidden to read them if they are
>> not referenced.
>
> While I don't think this is necessarily the best choice, I think it is just
> that - an implementation choice.

I don't see anything in the spec that says "don't read and validate external parsed entities if they're not used." And in fact, the spec seems to say that, in order to be valid, they must (whether used or not) match certain productions in the grammar.

Someone please correct me if there is explicit language to the contrary. My feeling is that this is another one of those cases where the XML spec is typically interpreted in terms of SGML practice, but where nothing in the XML spec itself actually mandates such interpretation.

And in fact, one can make a good case for reading in and checking external entities even if they're not "used."
If you fail to do this, you can end up with a DTD that itself triggers errors when used with some documents, but not others. Worse yet, you can end up with a DTD that was thought to be valid, but which fails unexpectedly when used with a new document instance.

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From richard at cogsci.ed.ac.uk Wed Apr 7 16:02:22 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:11:06 2004
Subject: XML Torture Test: Parsers Fail
In-Reply-To: Richard L. Goerwitz's message of Wed, 07 Apr 1999 09:37:38 -0400
Message-ID: <199904071401.PAA26709@stevenson.cogsci.ed.ac.uk>

> I don't see anything in the spec that says "don't read and validate
> external parsed entities if they're not used." And in fact, the spec
> seems to say that, in order to be valid, they must (whether used or not)
> match certain productions in the grammar.

Surely the question a validating parser is supposed to answer is not whether the external parsed entities are valid, but whether the *document* is valid. The definition of "valid" is (section 2.8):

  An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it.

Presumably this implies well-formedness since otherwise we wouldn't have an "XML document". I don't think it is reasonable to say that external parsed entities defined but not referred to are part of the document.
I think section 5.1 supports this view when it says "validating processors must read and process the entire DTD and all external parsed entities referenced in the document". Though it is clearly useful to be able to check that all the entities mentioned in a DTD are valid, I don't think it should be the default behaviour of a validating parser. For example, there might be entities that only make sense when other entities are included or excluded. (A separate point: it seems to me that the definition of "valid" is incomplete, since there are validity constraints other than "complying with the constraints expressed in the DTD"; section 5.1 mentions this.) -- Richard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Wed Apr 7 16:11:08 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:11:06 2004 Subject: XML Torture Test: Parsers Fail In-Reply-To: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> Message-ID: >I'm not so sure that IE5 is wrong in reporting an error (when unreferenced >General Entities are DTD chunks). The XML REC says (in 4.3.2 "Well-Formed >Parsed Entities") >"An external general parsed entity is well-formed if it matches the >production labeled extParsedEnt", which is an optional TextDecl [77] >followed by 'content' [43]. Non-validating processors are not required to >read external entities, but they are not forbidden to read them if they are >not referenced. > I agree that IE5 can read the external entity if it feels like. However, the document is still well-formed because the entity is never referenced and is not part of the document. 
This document meets the criteria for well-formedness in Section 2.1; i.e.

  1. Taken as a whole, it matches the production labeled document.
  2. It meets all the well-formedness constraints.
  3. Each of the parsed entities which is referenced directly or indirectly within the document is well-formed.

#3 is the kicker here. The non-well-formed entity that causes the problem is never referenced. I'm not sure what indirectly referenced means though. Perhaps that provides some wiggle room.

The only other relevant instance of "indirect" I see in the spec is in the No Recursion well-formedness constraint in Section 4.1. This states that "A parsed entity must not contain a recursive reference to itself, either directly or indirectly". In this context an indirect reference seems to mean one that did not occur in the main document but that appears in one of the other external parsed entities that was included by a different entity reference. The annotated spec seems to support this interpretation, though the example given uses purely internal entities.

The word "indirect" also appears in these well-formedness constraints:

  Well-Formedness Constraint: No External Entity References
  Attribute values cannot contain direct or indirect entity references to external entities.

  Well-Formedness Constraint: No < in Attribute Values
  The replacement text of any entity referred to directly or indirectly in an attribute value (other than "&lt;") must not contain a <.

The annotated spec doesn't really address these two constraints in this way. It seems remotely possible that what's really meant is an unparsed entity, but if that's so why didn't the authors just say that? Furthermore, an unparsed entity has no reason not to contain these things. Again it seems that what is meant is simply an entity reference whose value uses another entity reference that violates the constraint.
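As an illustration only, and not something posted in the thread: a minimal sketch of the "indirect" reading above, assuming Python's standard xml.etree.ElementTree module (a wrapper around the non-validating expat parser). The entity names a and b, the element name r, and the helper parses() are hypothetical. Entity a refers to b, which refers back to a. Under the reading given here, declaring the pair should be harmless, and the No Recursion constraint should only bite once one of them is actually referenced.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal subset: "a" refers to "b", which refers back to "a".
# This is an indirect recursive reference in the sense of Section 4.1.
DTD = '<!DOCTYPE r [ <!ENTITY a "&b;"> <!ENTITY b "&a;"> ]>'


def parses(document: str) -> bool:
    """Return True if the parser reports no well-formedness error."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# The entities are declared but never referenced in the document.
unreferenced = parses(DTD + "<r/>")

# The entity is referenced, so the parser has to expand it.
referenced = parses(DTD + "<r>&a;</r>")

print("declared only:", unreferenced)
print("declared and referenced:", referenced)
```

This says nothing about what IE5 or any of the parsers discussed in the thread do; it only shows one non-validating parser treating the constraint as something that applies at the point of reference.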
In short, I think IE5 is definitely incorrect in not accepting a declaration of a malformed entity in the absence of an actual reference to that entity.

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML: Extensible Markup Language (IDG Books 1998)                   |
| http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/     |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/   |
| Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/      |
+----------------------------------+---------------------------------+

From elharo at metalab.unc.edu Wed Apr 7 16:29:28 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:11:06 2004
Subject: XML Torture Test: Parsers Fail
In-Reply-To: <370B5FA2.CCDE2159@goon.stg.brown.edu>
References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com>
Message-ID:

At 9:37 AM -0400 4/7/99, Richard L. Goerwitz wrote:
>And in fact, one can make a good case for reading in and checking external
>entities even if they're not "used." If you fail to do this, you can
>end up with a DTD that itself triggers errors when used with some documents,
>but not others. Worse yet, you can end up with a DTD that was
>thought to be valid, but which fails unexpectedly when used with a new
>document instance.
>

But it's not DTDs that are well-formed or valid. It's documents.
Nowhere in the XML specification is validity or well-formedness of a DTD, separate from a document, defined. And it's certainly true that whatever the DTD, you can attach documents to it that are non-well-formed and/or invalid (and which a parser will report as such). In other words, you can't validate a DTD alone, so there's no point in worrying about what happens if you try.

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML: Extensible Markup Language (IDG Books 1998)                   |
| http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/     |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/   |
| Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/      |
+----------------------------------+---------------------------------+

From david at megginson.com Wed Apr 7 16:32:01 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:06 2004
Subject: XML Torture Test: Parsers Fail
In-Reply-To: <370B5FA2.CCDE2159@goon.stg.brown.edu>
References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> <370B5FA2.CCDE2159@goon.stg.brown.edu>
Message-ID: <14091.27586.410933.846017@localhost.localdomain>

Richard L. Goerwitz writes:

> I don't see anything in the spec that says "don't read and validate
> external parsed entities if they're not used."
And in fact, the spec > seems to say that, in order to be valid, they must (whether used or not) > match certain productions in the grammar. You could check them for well-formedness (I guess), but you could not validate them out of context -- the contents of an external parsed general entity might be valid at one reference point and invalid at another. For external parsed parameter entities, again, you could check that the declarations are well-formed, but you cannot do much else out of context. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Wed Apr 7 16:56:47 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:11:06 2004 Subject: IE5.0 does not conform to RFC2376 Message-ID: <004201be80fe$cf79d830$4ff96d8c@NT.JELLIFFE.COM.AU> From: MURATA Makoto >Appendix F is non-normative. RFC2376 supercedes it, as intended by the >XML WG. XML 1.0 cleary says: > > "Rules for the relative priority of the internal label and the MIME-type > label in an external header, for example, should be part of the RFC document > defining the text/xml and application/xml MIME types. ... in particular, > when the MIME types text/xml and application/xml are defined, the recommendations > of the relevant RFC will supersede these rules." > On the other hand, appendix in XML 1.0 is merely informative and was meant >to be replaced by the XML media type RFC. As far as making the relative priorities explicit. 
I hope there is no intention to supersede the XML encoding declaration as the normative way in which documents, not being wrapped by some higher protocol which treats the text at some more generic level, announce their character set.

I think the encoding declaration has been a great success: witness the hundreds of character sets that it successfully supports. The only problems I have heard so far are:

* some people say that UTF-7 cannot be accommodated (I have not confirmed this is true);

* there are several 16-bit coded Unicode varieties, and so even 16-bit Unicode will need an encoding declaration: the BOM is enough for endian and width detection, but does not give information about whether surrogates are used;

* the early (mandatory) normalization suggested in the W3C Character Model draft means that it is possible that even Unicode has two flavours (unnormalized and normalized): the intention is that this difference in repertoire should not be reflected in any header (everyone should just normalize, and if you don't, things will break...hmmm)

* there is no way to support non-standard character sets (very commonly used here in Asia)

(I have a little proposal floated called "DrLove" at http://www.ascc.net/~ricko/drlove.htm, which addresses this issue a little bit: comments welcome on Document Resource Locations suggestion.)

Rick Jelliffe

From timm at channelpoint.com Wed Apr 7 17:11:03 1999
From: timm at channelpoint.com (Tim McCune)
Date: Mon Jun 7 17:11:06 2004
Subject: Reporting language?
Message-ID: <8A24EC12044FD21195E200600895E0B301636433@goat.channelpoint.com> I'm looking for an XML schema that can be used to describe data for use in a report. The idea is that by applying XSL, all of the data could be displayed in a tabular format, certain data could be extracted for different display, etc. It should also be structured so that a graphing utility could read in the values and display them in a graph. I've done some preliminary work on such a schema, but I'd rather not reinvent the wheel. Thanks. Tim McCune Software Engineer, ChannelPoint "Don't worry about the future. The real troubles in your life are apt to be things that never crossed your worried mind, the kind that blindside you at 4 pm on some idle Tuesday." -- Kurt Vonnegut xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at goon.stg.brown.edu Wed Apr 7 18:30:44 1999 From: richard at goon.stg.brown.edu (Richard L. Goerwitz) Date: Mon Jun 7 17:11:06 2004 Subject: XML Torture Test: Parsers Fail References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> <370B5FA2.CCDE2159@goon.stg.brown.edu> <14091.27586.410933.846017@localhost.localdomain> Message-ID: <370B8826.5F507A45@goon.stg.brown.edu> David Megginson wrote: > > I don't see anything in the spec that says "don't read and validate > > external parsed entities if they're not used." And in fact, the spec > > seems to say that, in order to be valid, they must (whether used or not) > > match certain productions in the grammar. > > You could check them for well-formedness (I guess), but you could not > validate them out of context I sympathize with this view. 
But you're making an implicit apology on behalf of the spec, which actually just says:

  1) The document entity is well-formed if it matches the production labeled document.
  2) An external general parsed entity is well-formed if it matches the production labeled extParsedEnt.
  3) An external parameter entity is well-formed if it matches the production labeled extPE.

There's no mincing words about "using" entities (in the sense of adding an entity reference to a spot where the reference will expand). All the spec says is that validity depends on entities matching certain productions in the grammar. It's a simple, static definition of how all the entities must be structured. It says nothing about operational questions like whether you have to wait to validate until the entity appears in a place where it will be expanded.

> You could check them for well-formedness (I guess), but you could not
> validate them out of context
^^^

Sure you could. But obviously an external entity, in this scenario, would come out invalid if you declared it at a point where parameter entities it uses are not yet declared. So just make sure you do that. It's what the spec says, right? ;-)

Also, parsers, when they check external entities, will have to make temporary copies of their parents' entity tables. Why? Because any given external entity may define more entities that it itself uses. (A typical case would be defining a parameter entity that later gets expanded to "INCLUDE"). So we have to keep a record of what's been defined. On the other hand, if the parent entity never references the external entity, we don't want definitions within the external entity leaking into the parent's tables. An exception to this is the top-level external DTD entity, which is always "used" and whose definitions we always want to leak back into the parent's tables.

If IE's parser interprets the spec the way it's written, it will have to do all of these things.
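To make the bookkeeping described above concrete, here is a rough sketch, offered as an illustration only. It is not STG's, IE's, or any other parser's actual code. The function name check_unreferenced_entity, the plain-dict entity table, and the sample entity names are all hypothetical. The one rule taken from XML 1.0 itself is that the first declaration of an entity is binding (Section 4.2).

```python
def check_unreferenced_entity(parent_table, declarations):
    """Scan declarations found in an external entity that is never referenced.

    The scan works on a temporary copy of the parent's entity table, so the
    entity can use what it defines without anything leaking back to the parent.
    Returns the scratch table, for inspection only.
    """
    scratch = dict(parent_table)  # temporary copy of the parent's table
    for name, replacement in declarations:
        # XML 1.0 Section 4.2: the first declaration of an entity is binding,
        # so a later redeclaration must not overwrite an existing entry.
        scratch.setdefault(name, replacement)
    return scratch


# Hypothetical parent table and declarations, for illustration.
parent = {"draft": "IGNORE"}
scratch = check_unreferenced_entity(
    parent, [("final", "INCLUDE"), ("draft", "INCLUDE")]
)
print(parent)   # the parent's table is untouched
print(scratch)  # the scratch table holds the merged view
```

For the top-level external DTD entity, which is always "used", a parser following this scheme would presumably merge into the parent's table directly rather than into a copy.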
I reiterate my belief that the XML standard was written with SGML practice in mind. If you know what SGML parsers typically do in such situations, you know immediately what the XML spec editors really meant to say. The question of whether what they actually _did_ say will work in practice is another matter. STG's parser, by the way, compromises between these two approaches. On the one hand, it does not insist that external entities validate at the point where they are declared. On the other hand, it still scans the entities, whether they are used or not, and emits error messages if it finds any obvious problems. This seems a reasonable approach. I'd guess (not having tested it myself) that it's what IE is doing as well. -- Richard Goerwitz PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0 For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Apr 7 18:56:34 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:11:06 2004 Subject: XML Torture Test: Parsers Fail In-Reply-To: <370B8826.5F507A45@goon.stg.brown.edu> References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> <370B5FA2.CCDE2159@goon.stg.brown.edu> <14091.27586.410933.846017@localhost.localdomain> <370B8826.5F507A45@goon.stg.brown.edu> Message-ID: <14091.36216.262969.842315@localhost.localdomain> Richard L. Goerwitz writes: > Also, parsers, when they check external entities, will have to make > temporary copies of their parents' entity tables. Why?
Because > any given external entities may define more entities that it itself > uses. (A typical case would be defining a parameter entity that > later gets expanded to "INCLUDE"). So we have to keep a record > of what's been defined. On the other hand, if the parent entity > never references the external entity, we don't want definitions > within the external entity leaking into the parent's tables. An > exception to this is the top-level external DTD entity, which is > always "used" and whose definitions we always want to leak back > into the parent's tables. This is a bizarre reading of the spec and certainly not what was intended. If you have found language that supports (or even doesn't specifically preclude) this interpretation, could you suggest what needs to be fixed in XML 1.1 to make it clearer? Thanks, and all the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Apr 7 19:06:19 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:11:06 2004 Subject: Validating Entities (was Re: XML Torture Test: Parsers Fail) In-Reply-To: <370B8826.5F507A45@goon.stg.brown.edu> References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> <370B5FA2.CCDE2159@goon.stg.brown.edu> <14091.27586.410933.846017@localhost.localdomain> <370B8826.5F507A45@goon.stg.brown.edu> Message-ID: <14091.36402.664143.504008@localhost.localdomain> Richard L. Goerwitz writes: > All the spec says is that validity depends on entities matching > certain productions in the grammar.
No, XML 1.0 says that *well-formedness* depends on entities matching certain productions in the grammar. The spec contains no concept of validity for individual entities. Here's what it does say: (from 2.1 Well-Formed XML Documents) A textual object is a well-formed XML document if: 1. Taken as a whole, it matches the production labeled document. 2. It meets all the well-formedness constraints given in this specification. 3. Each of the parsed entities which is referenced directly or indirectly within the document is well-formed. (from 2.8 Prolog and Document Type Declaration) An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it. (from 4. Physical Structures) An XML document may consist of one or many storage units. These are called entities; they all have content and are all (except for the document entity, see below, and the external DTD subset) identified by name. Each XML document has one entity called the document entity, which serves as the starting point for the XML processor and may contain the whole document. I don't think that there's much room for ambiguity here: 2.1 defines what a well-formed document is, 2.8 further defines clearly what it means for a well-formed document to be valid, and 4 makes it clear that multiple entities still count as a single document. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andyclar at us.ibm.com Wed Apr 7 19:10:48 1999 From: andyclar at us.ibm.com (andyclar@us.ibm.com) Date: Mon Jun 7 17:11:06 2004 Subject: XML Torture Test: Parsers Fail Message-ID: <8725674C.005E3B25.00@d53mta06h.boulder.ibm.com> Elliotte, > yesterday. XJParse 1.1.14 can now handle this document from the Web site > but not when the files are loaded from the local hard drive where it still > reports AllenWatson.xml not found. XJParse 2.0.4 seems to work from both > the local hard drive and the Web site, but it's hard to tell since it > doesn't report as much as XJParse 1.1.14. How are you supplying the filename to XJParse in version 1.1.14? You may have to use a fully qualified URI -- even though it's local. For example: file:///path/file.xml -- Andy Clark * IBM, JTC - Silicon Valley * andyclar@us.ibm.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gtn at eps.inso.com Wed Apr 7 19:51:31 1999 From: gtn at eps.inso.com (Gavin Thomas Nicol) Date: Mon Jun 7 17:11:06 2004 Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376) In-Reply-To: <002901be802d$24f65680$27f96d8c@NT.JELLIFFE.COM.AU> Message-ID: <001f01be811f$bb74b5f0$f7d45dc7@eps.inso.com> > >An alternative method for achieving the same result is to use a filter > >(this can be done in Apache and in Jigsaw) which automatically emits the > >correct charset parameter based on reading the encoding declaration in > >the XML instance. > > I think this is the approach that, ultimately, we all are > hoping will be deployed. I'm not keen on this approach, but it would be a step in the right direction. I have some servlets for doing this, and for also handling the *.mim type. I still dislike the encoding information in the PI.... as you noted there are 3 levels at which mistakes can be made, and the PI means that you might have to "fix" the document in the face of transcoding. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ksall at cen.com Wed Apr 7 20:14:01 1999 From: ksall at cen.com (Sall, Ken) Date: Mon Jun 7 17:11:06 2004 Subject: State Machines; IBM's XMI? 
Message-ID: <7847B57C7C96D2119DBE00A0C96F64B6206E19@cen1.cen.com> Hello, Does anyone know of efforts to represent state machines (FSA) in XML? From their DTD, it would seem that IBM's XMI (XML Metadata Interchange) does, but a quick search of their 2 PDF docs didn't reveal "state machine", just "state". http://www.software.ibm.com/ad/features/xmi.html TIA - Ken Sall ksall@cen.com, kensall@home.com - XML at Web Developers Virtual Lib http://WDVL.com/Authoring/Languages/XML/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Wed Apr 7 20:15:27 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:11:06 2004 Subject: XML Torture Test: Parsers Fail References: <2F2DC5CE035DD1118C8E00805FFE354C0F3626A9@RED-MSG-56> Message-ID: <370B9F84.D1AAF86A@eng.sun.com> Elliotte Rusty Harold wrote: > > I think what this whole mess is showing, given the widely varying problems > with so many parsers, is that validation is not nearly as simple as it > seems, especially when the validators are asked to handle large files. Actually, the IE5 bug doesn't relate to validation ... :-) I think this really reflects a general need for better conformance testing. I know that Sun has invested a considerable amount of effort in complying with the XML 1.0 specification. When we run other parsers through our test suite the results don't always look very good at all. That is one of the problems that led us to write our own XML processor: the others did not conform well enough to the specification, which quickly leads to interoperability problems, undermining XML as an open platform-level standard.
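A conformance check of this kind can be sketched as a small table-driven harness. The one below uses Python's bundled expat bindings as a stand-in non-validating processor; the three test documents and the helper name are made up for illustration and are not taken from any of the suites discussed in this thread:

```python
import xml.parsers.expat


def is_well_formed(text):
    """Return True if expat accepts the document, False on a parse error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


# (description, document, expected well-formedness); all hypothetical cases
CASES = [
    ("minimal document", "<doc>hello</doc>", True),
    ("mismatched tags", "<doc><a></doc>", False),
    (
        "unreferenced external entity that does not exist",
        '<!DOCTYPE doc [<!ENTITY unused SYSTEM "no-such-file.xml">]>'
        "<doc>hello</doc>",
        True,
    ),
]

# Collect the names of cases where the parser disagrees with the expectation.
failures = [name for name, doc, expected in CASES
            if is_well_formed(doc) != expected]
print("failures:", failures)
```

The third case mirrors the point argued elsewhere in this thread: an external entity that is declared but never referenced is not part of the document, so a processor should not complain about it. A real suite would carry many more cases plus expected output for the documents that do parse.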
At this time I know of four sets of test cases for evaluating conformance with the XML 1.0 specification. The OASIS working group on XML conformance is integrating these: - Of course, James Clark's XMLTEST suite ... good basic coverage for well-formedness, and some output tests; - Fuji Xerox test cases ... for vendors who want to handle the Japanese market, some characters and encodings are key; - Sun's test cases ... primarily to test validation, but there are a bunch of other cases that are covered; - OASIS tests ... good coverage of a number of grammar rules, not yet generally available. Does anyone know of any other generally available sets of test cases? Or have any test cases they'd like to make generally available? If so, please contact me! - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Wed Apr 7 20:20:28 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:11:06 2004 Subject: XML Torture Test: Parsers Fail References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> Message-ID: <370BA00E.6CB239A@eng.sun.com> Chris Olds wrote: > > I'm not so sure that IE5 is wrong in reporting an error (when unreferenced > General Entities are DTD chunks). I am ... :-) As noted in other postings, WF-ness is a document characteristic, and the document doesn't "include" unreferenced entities so it doesn't matter what they seem to look like, or even whether they exist or not. - Dave xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From robin at isogen.com Wed Apr 7 20:33:15 1999 From: robin at isogen.com (Robin Cover) Date: Mon Jun 7 17:11:06 2004 Subject: Megginson and XMLNews Message-ID: Congratulations to David Megginson for leadership and tangible results in the 'XMLNews' application. This looks quite promising. I have created some references in: http://www.oasis-open.org/cover/xmlnewsORG.html but most of the referenced documents are (canonically) at the XMLNews Web site: http://www.xmlnews.org/ See now: http://www.xmlnews.org/press/19990407-01.html [April 07, 1999] "XMLNews Initiative Announced. Corel and WavePhore Support XMLNews in New Products." - "David Megginson, principal of Megginson Technologies, today announced a new initiative for news information delivery over the Internet. XMLNews uses the popular Extensible Markup Language (XML) to enable exchange of news information and metadata across different platforms and system configurations. Dr. Megginson, who chairs the World Wide Web Consortium's (W3C) XML Information Set Working Group and maintains the widely-implemented Simple API for XML (SAX), said that XMLNews brings together existing Web and Industry standards into a single package. 'XMLNews is good news for everyone in the industry,' he said. 'With a single standard format for all feeds, XMLNews will make it easier to share news all along the distribution chain, from reporters in the field and international press agencies to end-users such as news portals and corporate intranets'." -robin cover SGML/XML Web Page http://www.oasis-open.org/cover/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rwbl70 at email.sps.mot.com Wed Apr 7 20:41:06 1999 From: rwbl70 at email.sps.mot.com (Jim Holt) Date: Mon Jun 7 17:11:07 2004 Subject: State Machines; IBM's XMI? References: <7847B57C7C96D2119DBE00A0C96F64B6206E19@cen1.cen.com> Message-ID: <370BA627.B75620EB@email.sps.mot.com> Yes, several people are working on this with XMI. Check out (http://www.ics.uci.edu/pub/arch/uml/) This is a UML tool that reads/writes XMI. There are examples of designs with state machines. There is also an XMI mailing list: For subscription details and a HTML archive of the list, please visit http://www.dstc.edu.au/mof/MailingLists.html -jim "Sall, Ken" wrote: > Hello, > > Does anyone know of efforts to represent state machines (FSA) in XML? From > their DTD, it would seem that IBM's XMI (XML Metadata Interchange) does, but > a quick search of their 2 PDF docs didn't reveal "state machine", just > "state". > > http://www.software.ibm.com/ad/features/xmi.html > > TIA > - Ken Sall ksall@cen.com, kensall@home.com > - XML at Web Developers Virtual Lib > http://WDVL.com/Authoring/Languages/XML/ > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) -- +-------------------------------------------------------------------+ | Jim Holt Motorola | | jholt@lakewood.sps.mot.com M-Core Technology Center | | (512) 342-6524 Lakewood 7600C, MD: TX77-F51 | | | | "640K ought to be enough for anybody." -- Bill Gates, 1981 | +-------------------------------------------------------------------+ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From martind at netfolder.com Wed Apr 7 20:56:14 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:11:07 2004 Subject: XML Torture Test: Parsers Fail In-Reply-To: <370B9F84.D1AAF86A@eng.sun.com> Message-ID: <001101be8122$1c159440$3d978bcf@total.net> Hi David, First thank you for the useful info. Second: Do you have any link for: Sun's test case. Add to the list Rick Jelliffe test suite for chinese encoding. I have lost the link, but I am sure Rick will be happy to provide the link. 
Regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DuCharmR at moodys.com Wed Apr 7 21:13:05 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:11:07 2004 Subject: State Machines; IBM's XMI? Message-ID: <84285D7CF8E9D2119B1100805FD40F9F25510A@MDYNYCMSX1> XMI is for documents that represent UML information. Of the various categories of diagrams that make up UML, state diagrams look a lot like FSA diagrams, which probably inspired some of their notation, but they are used to diagram the relationship between the possible states of a given class of objects. A more general-purpose FSA DTD shouldn't be hard to develop. Bob DuCharme www.snee.com/bob see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at goon.stg.brown.edu Wed Apr 7 21:17:32 1999 From: richard at goon.stg.brown.edu (Richard L. Goerwitz) Date: Mon Jun 7 17:11:07 2004 Subject: Validating Entities (was Re: XML Torture Test: Parsers Fail) References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> <370B5FA2.CCDE2159@goon.stg.brown.edu> <14091.27586.410933.846017@localhost.localdomain> <370B8826.5F507A45@goon.stg.brown.edu> <14091.36402.664143.504008@localhost.localdomain> Message-ID: <370BAF3C.D54E5C3C@goon.stg.brown.edu> David Megginson wrote: > 3. Each of the parsed entities which is referenced directly or > indirectly within the document is well-formed If I've seemed harsh, then forgive me. I have a great deal of respect for your views, and I don't think you're wrong here per se. While I agree with what you've inferred about the standard, I'm not at all certain that the standard itself forces your interpretation. In the above case, for example, the standard is talking about well-formed documents as if all parsed entities must be read in if used in the document. In fact, this is not a requirement. The whole reason parameter entities, e.g., are not supposed to be used inside markup in the internal DTD subset is that this allows us to bypass them if you're not validating. (Incidentally, does it bother anyone else that you can have valid documents that aren't well-formed? Imagine an external entity used inside an attribute value? If declared in such a way that a non-validating parser doesn't realize it's external, then the validating parser will reject it as an error (can't have external entities in this context).
There are other such cases, although this is the main one that comes to mind.) My general point is that the question of what you do while validating is not simply a superset of what you do when just parsing with well-formedness in mind. You process documents in somewhat different ways depending on which of these two alternatives you've chosen. And so the question of what context an external entity should be checked in, if validating, is not clearly answered from the spec without exegesis, and I would argue, background knowledge. Anyway, even if I grant that it says what you want it to, then the point should still be made that it does so in a way that's not easy to interpret or understand. The fact that the writers of IE's parser apparently got it wrong is therefore not at all unexpected. -- Richard Goerwitz PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0 For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at eng.sun.com Wed Apr 7 21:46:50 1999 From: db at eng.sun.com (David Brownell) Date: Mon Jun 7 17:11:07 2004 Subject: XML Torture Test: Parsers Fail References: <001101be8122$1c159440$3d978bcf@total.net> Message-ID: <370BB4F3.47CA73DE@eng.sun.com> Didier PH Martin wrote: > > Hi David, > > First thank you for the useful info. > > Second: Do you have any link for: > Sun's test case. For the moment: http://java.sun.com/people/db/suntest.zip > Add to the list Rick Jelliffe test suite for chinese encoding. I have lost > the link, but I am sure Rick will be happy to provide the link. Good reminder; thanks!
- Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Wed Apr 7 21:53:34 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:11:07 2004 Subject: XML Torture Test: Parsers Fail In-Reply-To: <8725674C.005E3B25.00@d53mta06h.boulder.ibm.com> Message-ID: At 11:09 AM -0600 4/7/99, andyclar@us.ibm.com wrote: >Elliotte, > >> yesterday. XJParse 1.1.14 can now handle this document from the Web site >> but not when the files are loaded from the local hard drive where it >still >> reports AllenWatson.xml not found. XJParse 2.0.4 seems to work from both >> the local hard drive and the Web site, but it's hard to tell since it >> doesn't report as much as XJParse 1.1.14.
> >How are you supplying the filename to XJParse in version 1.1.14? You may >have to use a fully qualified URI -- even though it's local. For example: > > file:///path/file.xml > No, I just checked and that still fails in exactly the same way using XJParse 1.1.14. For what it's worth this is on Windows NT. It would not be out of the question that there are some implicit Unix assumptions in the source code that might be causing problems. Most of one chapter in my latest book is dedicated to dealing with exactly those sorts of problems, because they seem to be extremely common. The xml4j source code isn't available, is it? +-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Apr 7 22:08:16 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:11:07 2004 Subject: XML Test Suites (was Re: XML Torture Test: Parsers Fail) In-Reply-To: <370B9F84.D1AAF86A@eng.sun.com> References: <2F2DC5CE035DD1118C8E00805FFE354C0F3626A9@RED-MSG-56> <370B9F84.D1AAF86A@eng.sun.com> Message-ID: <14091.47403.254761.669863@localhost.localdomain> David Brownell writes: > Does anyone know of any other generally available sets of test > cases? Or have any test cases they'd like to make generally > available? If so, please contact me! This isn't a test suite per se, but my XML Heart of Darkness has been an old standby for parser writers, because it tests the retrieval of large parsed entities (with text declarations) over HTTP. You can find it at http://home.sprynet.com/~dmeggins/texts/darkness/darkness.xml All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Apr 7 22:11:53 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:11:07 2004 Subject: Validating Entities (was Re: XML Torture Test: Parsers Fail) In-Reply-To: <370BAF3C.D54E5C3C@goon.stg.brown.edu> References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> <370B5FA2.CCDE2159@goon.stg.brown.edu> <14091.27586.410933.846017@localhost.localdomain> <370B8826.5F507A45@goon.stg.brown.edu> <14091.36402.664143.504008@localhost.localdomain> <370BAF3C.D54E5C3C@goon.stg.brown.edu> Message-ID: <14091.47957.492902.550776@localhost.localdomain> Richard L. Goerwitz writes: > (Incidentally, does it bother anyone else that you can have valid > documents that aren't well-formed? Imagine an external entity used inside > an attribute value? If declared in such a way that a non-validating > parser doesn't realize it's external, then the validating parser will > reject it as an error (can't have external entities in this context). > There are other such cases, although this is the main one that comes > to mind.) This seems to be an example of a well-formed document that's not valid, not of a valid document that's not well-formed. Can you elaborate? Thanks, and all the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
From db at eng.sun.com Wed Apr 7 22:19:17 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:07 2004
Subject: Validating Entities (was Re: XML Torture Test: Parsers Fail)
References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> <370B5FA2.CCDE2159@goon.stg.brown.edu> <14091.27586.410933.846017@localhost.localdomain> <370B8826.5F507A45@goon.stg.brown.edu> <14091.36402.664143.504008@localhost.localdomain> <370BAF3C.D54E5C3C@goon.stg.brown.edu>
Message-ID: <370BBC96.FDBACC3C@eng.sun.com>

"Richard L. Goerwitz" wrote:
>
> (Incidentally, does it bother anyone else that you can have valid
> documents that aren't well-formed?

No, because it can never happen ... :-)

I think you meant to ask "does it bother anyone that some nonvalidating parsers may accept documents that are not well-formed?" on the grounds that detecting the WF-ness error requires reading an external entity that's referenced (!) but is not included.

> Imagine an external entity used inside
> an attribute value? If declared in such a way that a non-validating
> parser doesn't realize it's external, then the validating parser will
> reject it as an error (can't have external entities in this context).

That document wouldn't be valid since it's not well formed ... but there can be nonvalidating parsers that do not recognize that WF-ness error.

> My general point is that the question of what you do while validating is
> not simply a superset of what you do when just parsing with
> well-formedness in mind.

That doesn't match my understanding of the spec.
Could you give an example of a case where a validating processor does something other than report something that a nonvalidating one won't?

- Dave

From tbray at textuality.com Wed Apr 7 22:44:00 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:11:07 2004
Subject: Validating Entities (was Re: XML Torture Test: Parsers Fail)
Message-ID: <3.0.32.19990407134059.00bc76b0@pop.intergate.bc.ca>

At 03:17 PM 4/7/99 -0400, Richard L. Goerwitz wrote:

I'm with David Megginson here - you really have to stand on one leg and not think of the word "rhinoceros" to see the XML spec as mandating the checking of unreferenced entities.

>(Incidentally, does it bother anyone else that you can have valid
>documents that aren't well-formed?

No you can't. It's not an XML document if it's not well-formed, and validity is a property of XML documents.

-Tim
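[Editorial sketch, not part of the thread.] The distinction drawn above can be tried out with any non-validating parser. The sketch below assumes Python's standard xml.etree.ElementTree, which reads the internal DTD subset but does not validate against it. Both sample documents and the helper name is_well_formed are made up for illustration. It only shows that a document can be well-formed without being valid; checking validity itself would need a validating parser, which the Python standard library does not provide.

```python
import xml.etree.ElementTree as ET

# Well-formed, but not valid: the DTD says <doc> must contain exactly one <a/>,
# yet the instance contains an undeclared <b/> instead.
WELL_FORMED_BUT_INVALID = """<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ELEMENT doc (a)>
  <!ELEMENT a EMPTY>
]>
<doc><b/></doc>"""

# Not well-formed: <b> is never closed, so this is not an XML document at all
# and the question of validity does not even arise.
NOT_WELL_FORMED = "<doc><b></doc>"


def is_well_formed(text):
    """Return True if a non-validating parse succeeds, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for name, text in [("well-formed but invalid", WELL_FORMED_BUT_INVALID),
                       ("not well-formed", NOT_WELL_FORMED)]:
        print(name, "->", is_well_formed(text))
```

The non-validating parser accepts the first document without complaint and rejects the second, which is consistent with validity being an additional property of documents that are already well-formed.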
From DuCharmR at moodys.com Wed Apr 7 22:49:47 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:11:07 2004
Subject: Megginson and XMLNews
Message-ID: <84285D7CF8E9D2119B1100805FD40F9F25510B@MDYNYCMSX1>

So Robin, when will we see feeds of http://www.oasis-open.org/cover/sgmlnew.html tagged to conform to xmlnews-story.dtd?

Bob DuCharme www.snee.com/bob
"The elements be kind to thee, and make thy spirits all of comfort!" Anthony and Cleopatra, III ii

From david at megginson.com Wed Apr 7 23:13:08 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:07 2004
Subject: Megginson and XMLNews
In-Reply-To: <84285D7CF8E9D2119B1100805FD40F9F25510B@MDYNYCMSX1>
References: <84285D7CF8E9D2119B1100805FD40F9F25510B@MDYNYCMSX1>
Message-ID: <14091.51440.961191.507453@localhost.localdomain>

DuCharme, Robert writes:

> So Robin, when will we see feeds of
> http://www.oasis-open.org/cover/sgmlnew.html tagged to conform to
> xmlnews-story.dtd?

You know, this isn't such a bad idea -- I wonder if OASIS would allow Robin's work to be sent out as a free newsfeed, like PR Newswire?
If they would, I will personally (here and in public view) volunteer to write, document, support, and maintain production-grade Perl scripts to add the appropriate XMLNews markup.

Robin's news is of a very high quality, and it would be nice if it could have some new distribution channels.

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/

From tbray at textuality.com Wed Apr 7 23:38:06 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:11:07 2004
Subject: Megginson and XMLNews
Message-ID: <3.0.32.19990407143650.00cdd450@pop.intergate.bc.ca>

At 05:12 PM 4/7/99 -0400, David Megginson wrote:
>DuCharme, Robert writes:
>
> > So Robin, when will we see feeds of
> > http://www.oasis-open.org/cover/sgmlnew.html tagged to conform to
> > xmlnews-story.dtd?
>
>You know, this isn't such a bad idea -- I wonder if OASIS would allow
>Robin's work to be sent out as a free newsfeed, like PR Newswire? If
>they would, I will personally (here and in public view) volunteer to
>write, document, support, and maintain production-grade Perl scripts
>to add the appropriate XMLNews markup.

And I will volunteer staging & infrastructure to be provided by XML.com, should such a feed become available. (I would have volunteered to help with the script-ware too, but since David charged heroically into the breach, I wouldn't dream of getting in his way).

-Tim
From andyclar at us.ibm.com Thu Apr 8 00:47:15 1999
From: andyclar at us.ibm.com (andyclar@us.ibm.com)
Date: Mon Jun 7 17:11:07 2004
Subject: XML Torture Test: Parsers Fail
Message-ID: <8725674C.007D09A0.00@d53mta06h.boulder.ibm.com>

Elliotte,

> because they seem to be extremely common. The xml4j source code isn't
> available, is it?

Both versions of the XML4J parser come with full source code.

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyclar@us.ibm.com

From andyclar at us.ibm.com Thu Apr 8 00:54:56 1999
From: andyclar at us.ibm.com (andyclar@us.ibm.com)
Date: Mon Jun 7 17:11:08 2004
Subject: Refactoring SAX 1.0
Message-ID: <8725674C.007DBBB6.00@d53mta06h.boulder.ibm.com>

David,

David Brownell wrote:
> There aren't that many classes in SAX 1.0, and they can be used
> as-is without "refactoring" anything at all. And, importantly,
> without sacrificing compatibility.
>
> Or am I missing something in what you're suggesting?

Slightly. Consider the case of an XML parser implementing org.xml.sax.Parser. Should a DOM parser have methods to register stream based handlers?
Yet, besides the handler registration, DOM parsers would benefit from a standard programmatic way of initiating a parse, resolving entities, and handling errors. And the factoring would not have to sacrifice compatibility.

I'm not completely caught up on the SAX2 discussion but I seem to recall talk about new interfaces/packages. I thought that if that work is going to be done, we could refactor the general SAX interfaces and classes at the same time.

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyclar@us.ibm.com

From oxenberg at ma.ultranet.com Thu Apr 8 01:25:44 1999
From: oxenberg at ma.ultranet.com (Phil Oxenberg)
Date: Mon Jun 7 17:11:08 2004
Subject: external DTD and msxml parser
Message-ID: <370BE96C.9DD85252@ma.ultranet.com>

I would have thought this was correct, but I get an exception saying the DTD cannot be found. Both the XML file and the DTD reside in the same folder.

Thanks

Phil
From stark at uplanet.com Thu Apr 8 01:49:41 1999
From: stark at uplanet.com (Peter Stark)
Date: Mon Jun 7 17:11:08 2004
Subject: PITarget uniqueness
Message-ID: <001701be8151$451f3600$7ac2c6c3@uplanet.com>

How do I guarantee that the target names of my processing instructions (PIs) are unique? It's probably out of the scope of the XML spec to specify, but are there any general guidelines for naming PIs?

I am layering XML directly on top of UDP and TCP, and use PIs to send instructions to the application (identified by the PITarget). The alternative would have been to invent a new HTTP-level protocol, or use HTTP, but for the purpose of what I am doing, it felt like overkill. I have read the Extensible Protocol [draft-harding-extensible-protocol-00] draft.

Any opinions about this usage of PIs?

Thanks,
Peter

Peter Stark
stark@uplanet.com
http://www.uplanet.com
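[Editorial sketch, not part of the thread.] One way to picture the question is a receiver that only acts on PIs whose target carries a prefix it owns and ignores everything else. The sketch below assumes Python's standard xml.sax module. The reversed-domain prefix "com.example.", the sample message, and the names PICollector and collect_instructions are all hypothetical; prefixing targets this way is an illustrative convention only, not something the XML spec mandates. The spec itself only reserves targets beginning with "xml" (in any case) and notes that notations may be used to declare PI targets formally.

```python
import xml.sax


class PICollector(xml.sax.ContentHandler):
    """Collect processing instructions, split by whether we own the target."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.accepted = []  # (target, data) pairs addressed to this application
        self.ignored = []   # targets that belong to some other application

    def processingInstruction(self, target, data):
        # SAX reports every PI with its target name and its raw data string.
        if target.startswith(self.prefix):
            self.accepted.append((target, data))
        else:
            self.ignored.append(target)


def collect_instructions(document, prefix):
    """Parse `document` (bytes) and return (accepted, ignored) PI lists."""
    handler = PICollector(prefix)
    xml.sax.parseString(document, handler)
    return handler.accepted, handler.ignored


# Hypothetical message: one PI for "our" application, one for somebody else's.
SAMPLE = b'<msg><?com.example.chat join room="xml-dev"?><?other-app ping?></msg>'

if __name__ == "__main__":
    accepted, ignored = collect_instructions(SAMPLE, "com.example.")
    print("accepted:", accepted)
    print("ignored:", ignored)
```

Dots are legal name characters in XML, so a target such as com.example.chat is a well-formed PITarget. Whether a prefix scheme like this is enough to avoid clashes in practice depends on everyone involved following it.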
From wunder at infoseek.com Thu Apr 8 01:50:46 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:11:08 2004
Subject: Megginson and XMLNews
In-Reply-To:
Message-ID: <3.0.5.32.19990407163917.00bd2a60@corp>

At 01:32 PM 4/7/99 -0500, Robin Cover wrote:
>Congratulations to David Megginson for leadership and tangible
>results in the 'XMLNews' application. This looks quite
>promising.

Since the e-mail links at xmlnews.org are "not yet active", and real-world DTDs are generally interesting, I'll post comments and questions here. An acceptable answer for most of these would be "to be compatible with NITF", but it would be nice to hear the rationale. These are all about the xmlnews-story DTD.

Why instead of an xml:lang attribute?

The ISO 8601 subset for is a different subset than the web profile of ISO 8601 recommended by the W3C. Any chance of changing to the W3C profile?

The element does not offer the date in a parseable format. #PCDATA is fine for the printed version of the date, but it also should be given in an ISO 8601 form (see above), and if I get to choose, I'd rather see it as an element than as an attribute.

Why is misspelled? , too?

Is there some reason why #FIXED wasn't used to make follow the XLink draft? That is: etc. Not necessary, of course, and XLink is a draft, and it introduces namespaces, ...

Since doesn't contain the thing it is a pronunciation of, should it always follow that thing? And should that be noted in the spec?

is an unusual term for "author" or "creator", even for a profession that routinely uses "slug".

A would be nice, though it looks like I'll be able to reliably extract the lead paragraph for single stories. Things get trickier for news summaries, since the first is a summary of some other story, not of the current document.

is an excellent thing to include in a DTD for web use. XML docs tend to be missing random bits of necessary HTML functionality.

is a good convention for others to follow. Now we need a convention for the robots meta tag ...

Overall, the DTD looks good, and with the exception of , it will be trivial to map it into our search engine. The date issue is important, though, because people do want to sort news by date. In fact, that is almost the only kind of content that people do want to sort by date.

Finally, I'd really appreciate a source of sample stories, so we can add this to our test suite.

wunder
-- 
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946

From richard at goon.stg.brown.edu Thu Apr 8 01:52:22 1999
From: richard at goon.stg.brown.edu (Richard Goerwitz)
Date: Mon Jun 7 17:11:08 2004
Subject: Validating Entities (was Re: XML Torture Test: Parsers Fail)
References: <3.0.32.19990407134059.00bc76b0@pop.intergate.bc.ca>
Message-ID: <370BEF37.9DF0FFAC@goon.stg.brown.edu>

Tim Bray wrote:

> I'm with David Megginson here - you really have to stand on one
> leg and not think of the word "rhinoceros" to see the XML spec as
> mandating the checking of unreferenced entities.
You're not going to see me arguing that you intended it to be read the rhino way. Nor are you going to see me claim that this was anybody else's actual intent.

My contention is that if the spec actually says that entities may only be checked by a validating parser if used, it does so in a way that requires exegesis, at least for someone coming at the spec fresh, with little background in SGML, and without any foreknowledge of where you are headed.

It may sound ludicrous, but I remember it taking me several readings of the spec, a look at some rather complex XML documents, and some preliminary implementation work, to realize that, for validating parsers, unreferenced entities must be left unchecked.

If this is what's happened in IE, then yes, it's somewhat mysterious how they could have gotten as far as they did without realizing there were problems. But although DM has given me reason to think the spec is clearer than I have maintained, the fact is that I understand how they might have initially gone astray.

> > (Incidentally, does it bother anyone else that you can have valid
> > documents that aren't well-formed?

It's probably not worth explaining what I meant to say here.

-- 
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu
From anhai at cs.washington.edu Thu Apr 8 04:43:28 1999
From: anhai at cs.washington.edu (AnHai Doan)
Date: Mon Jun 7 17:11:08 2004
Subject: [HELP] Finding XML data on the Web
Message-ID:

Hi,

I'm doing a research project on data integration, and need a lot of XML data, the kind we would see in a typical real-world application. I have searched the Web but found very few XML documents, which are either for very small applications, or are interesting but irrelevant (such as Shakespeare's works wrapped in XML). It seems like right now there is no "real" data in XML format available on the Web.

If you know of any source of significant XML data on the Web, could you please give me some pointers? I would also be interested in finding some DTDs out there for commercial domains such as buying and selling software products, cars, books, CDs, etc. Lastly, I would also be interested in sources that provide data in the native format (that is, data not wrapped in HTML).

I greatly appreciate your time and help.

Best,
AnHai.

From simpson at polaris.net Thu Apr 8 05:24:41 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun 7 17:11:08 2004
Subject: [HELP] Finding XML data on the Web
In-Reply-To:
Message-ID: <3.0.5.32.19990407232333.021388e0@nexus.polaris.net>

At 07:43 PM 4/7/99 -0700, AnHai Doan wrote:
>I have searched the Web but
>found very few XML documents, which are either for very
>small applications, or are interesting, but irrelevant (such
>as Shakespeare work being wrapped in XML). It seems like
>right now there is no "real" data in XML format that are
>available on the Web.

A few months ago, Dave Winer posted the following URL on xml-dev:

http://www.hotbot.com/text/default.asp?SM=MC&MT=&search=SEARCH&DC=100&DE=0&AM0=MC&AT0=words&AW0=&AM1=MN&AT1=words&AW1=&savenummod=2&date=WH&DV=0&DR=newer&DM=1&DD=1&DY=98&FSU=1&FS=.xml&RD=AN&Domain=&RG=all&PS=A&PD=&_v=2&OPs=MDRTP&NUMMOD=2

(You'll need to reformat it from this e-mail message to get it all on one line in the browser window's URL field.)

It runs a HotBot search for Web sites serving XML documents. When I ran it just now, it came up with 800+ hits. Quality and relevance to your research may of course vary, but it may be useful as a starting point.

==========================================================
John E. Simpson       | The secret of eternal youth
simpson@polaris.net   | is arrested development.
http://www.flixml.org | -- Alice Roosevelt Longworth
From murata at apsdc.ksp.fujixerox.co.jp Thu Apr 8 07:34:58 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:11:08 2004
Subject: IE5.0 does not conform to RFC2376
Message-ID: <199904080534.AA00233@archlute.apsdc.ksp.fujixerox.co.jp>

>As far as making the relative priorities explicit. I hope there is no
>intention to supersede the XML encoding declaration as the normative way
>in which documents, not being wrapped by some higher protocol which
>treats the text at some more generic level, announce their character
>set.

As far as I know, no change is planned.

For a historical and unfortunate reason, determination of the charset of WWW pages has become an extremely hard problem. Since XML is not only for human-readable documents but also for programs and database systems, failure to determine the charset leads to devastating results such as a corrupted database.

XML 1.0 and RFC 2376 do not provide a perfect solution, and I am not quite happy (probably, nobody is completely happy). However, after loooong discussion, we have painfully learned that there are no alternatives on which a majority of HTTP/MIME/XML people would agree. I sincerely hope that people will follow these two specifications and that we will have better interoperability.

Cheers,

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp
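[Editorial sketch, not part of the thread.] The "relative priorities" under discussion can be written down as a small decision function. The sketch below is one reading of RFC 2376, stated as an assumption rather than a normative summary: a charset parameter on the MIME type wins when present; text/xml without a charset parameter is treated as us-ascii; application/xml without a charset parameter falls back to the rules of the XML spec itself. The function name effective_charset is made up, and the fallback is simplified: real XML autodetection also looks at the byte order mark and the first bytes of the entity, which this sketch ignores by assuming UTF-8 whenever there is no encoding declaration.

```python
def effective_charset(media_type, charset_param=None, declared_encoding=None):
    """Pick the charset for an XML entity received over MIME/HTTP.

    media_type        -- "text/xml" or "application/xml"
    charset_param     -- value of the MIME charset parameter, or None if absent
    declared_encoding -- encoding from the XML declaration, or None if absent
    """
    media_type = media_type.strip().lower()
    if media_type not in ("text/xml", "application/xml"):
        raise ValueError("not an XML media type covered by this sketch")

    # 1. An explicit charset parameter is authoritative for both media types.
    if charset_param:
        return charset_param.lower()

    # 2. text/xml with no charset parameter defaults to us-ascii,
    #    regardless of what the XML declaration inside the entity says.
    if media_type == "text/xml":
        return "us-ascii"

    # 3. application/xml with no charset parameter: defer to the XML spec.
    #    Simplification (assumption): no BOM sniffing, UTF-8 if undeclared.
    return (declared_encoding or "utf-8").lower()


if __name__ == "__main__":
    print(effective_charset("text/xml", None, "Shift_JIS"))
    print(effective_charset("application/xml", None, "Shift_JIS"))
    print(effective_charset("text/xml", "ISO-8859-1", "Shift_JIS"))
```

The first call shows why the text/xml default is painful in practice: the encoding declaration inside the document is overridden by a transport-level default when the server sends no charset parameter.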
From om at lgsi.co.in Thu Apr 8 10:50:39 1999
From: om at lgsi.co.in (Om Band)
Date: Mon Jun 7 17:11:08 2004
Subject: Dynamic change in contents of XML?
Message-ID: <370C6C23.842970FB@lgsi.co.in>

Hi,

In our project we want an XML page which has two select boxes (combo boxes), the first with categories and the other with its subcategories. What we want is that, according to the category we select from the first combo box, the subcategories displayed in the other combo box should change dynamically, and on the client side only!

Is this possible with XML, perhaps with the help of a script? If not, are menus and sub-menus possible with XML (like "START" in Windows)?

Regards,
-Om

From costello at mitre.org Thu Apr 8 12:28:18 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun 7 17:11:08 2004
Subject: RDF Question: about syntax of rdf container objects (Bag, Alt, Seq)
Message-ID: <370C84FF.BC94CF98@mitre.org>

In section 3 of the RDF Model & Syntax spec it talks about containers, e.g., rdf:Bag, rdf:Alt, and rdf:Seq. It gives an example where the model and syntax are shown for the following statement:

"The students in course 6.001 are Amy, Tim, John, Mary, and Sue."
The model for this statement shows a resource, /courses/6.001, having a property, students, whose value is an anonymous resource (i.e., a resource with no identifier). The anonymous resource has an rdf:type property whose value is rdf:Bag. It has a property rdf:_1 whose value is /Students/Amy. It has a property rdf:_2 whose value is /Students/Tim, etc.

The spec shows the syntax for this model as:

This confuses me. It does not seem to faithfully represent the model. Recall that the model says that resource, /courses/6.001, has a property, students, whose value is an *anonymous resource*. This syntax does not seem to be expressing that. This syntax says that the value is an rdf:Bag, not an anonymous resource.

Here's how I would write the syntax:

The way I read my version, the resource, /courses/6.001, has a property, students, whose value is an anonymous resource. The anonymous resource has a type property whose value is rdf:Bag, and so on. Isn't this a more faithful representation of the model?

I must be not understanding something about container objects. Would someone please explain this to me?

/Roger

From Daniel.Brickley at bristol.ac.uk Thu Apr 8 12:48:18 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun 7 17:11:08 2004
Subject: RDF Question: about syntax of rdf container objects (Bag, Alt, Seq)
In-Reply-To: <370C84FF.BC94CF98@mitre.org>
Message-ID:

(cc'd to www-rdf-comments; please trim from any followups on xml-dev)

On Thu, 8 Apr 1999, Roger L. Costello wrote:

> In section 3 of the RDF Model & Syntax spec it talks about containers,
> e.g., rdf:Bag, rdf:Alt, and rdf:Seq. It gives an example where the
> model and syntax is shown for the following statement:
>
> "The students in course 6.001 are Amy, Tim, John, Mary, and Sue."
>
> The model for this statement shows a resource, /courses/6.001, having a
> property, students, whose value is an anonymous resource (i.e., a
> resource with no identifier). The anonymous resource has an rdf:type
> property whose value is rdf:Bag. It has a property rdf:_1 whose value
> is /Students/Amy. It has a property rdf:_2 whose value is
> /Students/Tim, etc.
>
> The spec shows the syntax for this model as:
>
> This confuses me. It does not seem to faithfully represent the model.
> Recall that the model says that resource, /courses/6.001, has a
> property, students, whose value is an *anonymous resource*. This syntax
> does not seem to be expressing that. This syntax says that the value is
> an rdf:Bag, not an anonymous resource.
[...]

The value is *both* an rdf:Bag and an anonymous resource. Just because we don't have an identifier for the resource, it doesn't mean we can't know other properties of it (such as its type, i.e. the class of which it is a member). The syntax tells us that it is a Bag, but doesn't give a URI or ID to represent that Bag.

Having an rdf:type property pointing to rdf:Bag is just RDF's way of telling us that the anonymous resource "is A" rdf:Bag. I suspect "type" rather than "isA" was used to name this relation/property since "isA" is sometimes used ambiguously, i.e. also used as a name for the "subClassOf" relation.

Hope this helps,

Dan

-- 
Daniel.Brickley@bristol.ac.uk
Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/
University of Bristol, Bristol BS8 1TN, UK. phone:+44(0)117-9287096
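[Editorial sketch, not part of the thread.] The point that the value is both anonymous and a Bag is easier to see when the model is written as plain statements. The sketch below uses ordinary Python tuples rather than any RDF library. The label "_:bag1" is a made-up local handle for the anonymous resource (it has no URI, which is exactly what "anonymous" means here), and only the two members spelled out in the message above are listed; the remaining students would follow the same pattern.

```python
# A made-up local label for the resource that has no identifier of its own.
BAG = "_:bag1"

# The model as (subject, property, value) statements.
TRIPLES = [
    ("/courses/6.001", "students", BAG),
    (BAG, "rdf:type", "rdf:Bag"),      # the anonymous resource "is a" Bag
    (BAG, "rdf:_1", "/Students/Amy"),
    (BAG, "rdf:_2", "/Students/Tim"),
]


def values(triples, subject, prop):
    """Return every value of `prop` on `subject`, in statement order."""
    return [o for s, p, o in triples if s == subject and p == prop]


def members(triples, container):
    """Return container members ordered by their rdf:_n ordinal."""
    numbered = []
    for s, p, o in triples:
        if s == container and p.startswith("rdf:_") and p[5:].isdigit():
            numbered.append((int(p[5:]), o))
    return [o for _, o in sorted(numbered)]


if __name__ == "__main__":
    (value,) = values(TRIPLES, "/courses/6.001", "students")
    print("value of students:", value)
    print("its type:", values(TRIPLES, value, "rdf:type"))
    print("its members:", members(TRIPLES, value))
```

Seen this way, writing the value as a typed container element in the XML syntax and writing an explicit rdf:type property are two spellings of the same set of statements: one node, no identifier, with a type and numbered members.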
From chris at w3.org Thu Apr 8 13:24:42 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:08 2004
Subject: SUMMARY: XML Validation Issues (was: several threads)
References:
Message-ID: <370A9AE7.BCE60202@w3.org>

Jelks Cabaniss wrote:
>
> Chris Lilley wrote:
>
> > I don't sense consensus yet on whether client-side validation is always
> > desirable; it clearly is in some cases and clearly adds little in other
> > cases.
>
> Wouldn't it depend on what the client is?

Yes. Which is why I wrote that I don't sense consensus on this - there are arguments both for and against; for requiring validation, for never requiring it, etc.

> The creation of XML 1.0 (as opposed
> to just well-formed SGML) made validation optional; didn't the designers have
> browsers in mind when they made this decision; in fact wasn't it the MAIN reason
> they made such a decision?

Probably, you would need to ask them.

> > The assertion has been made that client-side validation is a performance
> > load, compared to just parsing the dtd looking for fixed attributes etc;
> > but no performance figures were made available. If someone has a parser
> > they could instrument and provide some actual measurements on real-world
> > data, that would help.
>
> Assuming that validation were equally as fast, I still don't think that makes a
> case for *forcing web browsers* to do what XML 1.0 says is optional.

True, but the assertion was made that validation should never be required because of the performance load (compared to parsing the dtd including external subsets, but not validating).
Which implies, if there were negligible performance load, then would validation be a desirable thing?

> In another message:
>
> > My feeling is that there are three classes of implementation, that
> > should all have names:
> >
> > minimal well-formed - never tries to follow external entities
> > full well-formed - always tries to follow external entities
> > full validating - always tries to follow external entities and validates
>
> Agreed.

Currently, the first two are not adequately distinguished, it seems. And, it seems that there are a lot of implementations that fall into the second class - perhaps that is even the majority class.

> > and it should be possible to always derive what class of implementation
> > a particular instance requires.

You don't comment on that sentence, so does it mean you agree?

> > My current take on this is that
> >
> > standalone="yes" is how you declare that a minimal well-formed parser
> > is sufficient; that
>
> Sounds good.

But, it seems, that standalone="no" does not mean that a minimal well-formed parser has to reject the document with a well-formedness error. But some people seem to think that would be desirable behaviour. Or perhaps another value for "standalone" would be needed.

> > > you indicate that a validating parser is required

There are two related but separate assertions that can be made:

1) this document is valid
2) this document needs a validating parser

I didn't adequately distinguish these before, which was remiss of me.

> I don't like this (though evidently a number of people are assuming or
> advocating it).

I didn't like it much either, but it seemed to be, on inspection, what the XML spec said. James Clark seemed to agree, which was a good sign.
But I recently heard Tim Bray and CMSQ say that no, it doesn't mean that at all and in fact the presence of element declarations should not be construed to mean that the document is valid or that there is a self-consistent dtd in there which could be validated against.

> If validation is optional, it's optional -- even if there's a
> stray in the DTD.

I am tending to agree that this is what the spec says. So, there is in fact no way to indicate the assertion "this document is valid".

> Maybe the author is building his DTD and
> doesn't want to validate it until he's good and ready. Maybe it's an older DTD,
> he doesn't care about validity any more, and all he wants are default attributes
> for styling purposes. Must he remove all a web browser?

These are good arguments. I observe that many parsers which can validate stop performing well-formedness checks and start trying to do validation checks instead once they see element declarations, but it seems that this is not warranted (note to UI designers: provide two separate icons for "validate" and "check wf").

> If there is to be a way to *force* validity by specifying it in the document
> instance, the only way I can see is by amending the spec with something like (as
> I believe you yourself suggested in passing)
>
> valid="yes"
>
> in the declaration.

Right. With a default of "no", of course. So, this would make the assertion that the document was valid, and that assertion could be tested, and perhaps refuted, by a validating parser.

In the case of valid="no" or perhaps valid="wf", a validating parser would do what - declare the document invalid? Agree, yes, it's invalid (so why check it)? Automatically use a non-validating mode, even if it was normally validating?

> > and that all other cases are saying that the full-well-formed parser is
> > required.
>
> That sounds good.
> > But IMO "all other cases" should currently include documents having DTDs with > If documents should in fact be able to demand "Hey, if you're a validating > parser, validate me NOW!" (and there do seem to be some compelling occasions for > it), a Message-ID: <370C9642.48EEA0B8@mitre.org> Dan Brickley wrote: > > Having an rdf:type property pointing to rdf:Bag is just RDF's way of > telling us that the anonymous resource "is A" rdf:Bag. > Is it *always* the case that if an RDF model has a resource with a property rdf:type whose value is foo then rather than: [other properties] the syntax is: [other properties] Example: Rather than: [other properties] the syntax is: [other properties] /Roger xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Thu Apr 8 13:51:54 1999 From: jes at kuantech.com (Jeffrey E. Sussna) Date: Mon Jun 7 17:11:08 2004 Subject: RDF Question: about syntax of rdf container objects (Bag, Alt, Seq) In-Reply-To: <370C9642.48EEA0B8@mitre.org> Message-ID: <000201be81b6$166807a0$0200a8c0@kuantech1> No. RDF defines alternative syntax for particular abstract models. You may use whichever syntax you like. The examples you included are equivalent. This is part of both the flexibility and seeming difficulty at first glance of RDF. You can express a particular abstract model multiple ways in XML. It is important and helpful to remember that RDF is not XML. The RDF spec just happens to specify how to concretely represent RDF in XML. Jeff > -----Original Message----- > From: owner-xml-dev@ic.ac.uk > [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > Roger L. 
Costello > Sent: Thursday, April 08, 1999 4:43 AM > To: xml-dev@ic.ac.uk > Cc: www-rdf-comments@w3.org; costello@mitre.org > Subject: Re: RDF Question: about syntax of rdf container objects (Bag, > Alt, Seq) > > > Dan Brickley wrote: > > > > Having an rdf:type property pointing to rdf:Bag is just RDF's way of > > telling us that the anonymous resource "is A" rdf:Bag. > > > > Is it *always* the case that if an RDF model has a resource with a > property rdf:type whose value is foo then rather than: > > > > [other properties] > > > the syntax is: > > > [other properties] > > > Example: > > Rather than: > > > > [other properties] > > > the syntax is: > > > [other properties] > > > /Roger > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Thu Apr 8 13:58:14 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:11:09 2004 Subject: RDF Question: about syntax of rdf container objects (Bag, Alt, Seq) In-Reply-To: <370C9642.48EEA0B8@mitre.org> Message-ID: yOn Thu, 8 Apr 1999, Roger L. 
Costello wrote: > Dan Brickley wrote: > > > > Having an rdf:type property pointing to rdf:Bag is just RDF's way of > > telling us that the anonymous resource "is A" rdf:Bag. > > > > Is it *always* the case that if an RDF model has a resource with a > property rdf:type whose value is foo then rather than: > > > > [other properties] > > > the syntax is: > > > [other properties] > > > Example: > > Rather than: > > > > [other properties] > > > the syntax is: > > > [other properties] > > Yep. From: http://www.w3.org/TR/REC-rdf-syntax/ 2.2.2. Basic Abbreviated Syntax While the serialization syntax shows the structure of an RDF model most clearly, often it is desirable to use a more compact XML form. The RDF abbreviated syntax accomplishes this. As a further benefit, the abbreviated syntax allows documents obeying certain well-structured XML DTDs to be directly interpreted as RDF models. Three forms of abbreviation are defined for the basic serialization syntax. [...] The third basic abbreviation applies to the common case of a Description element containing a type property (see Section 4.1 for the meaning of type). In this case, the resource type defined in the schema corresponding to the value of the type property can be used directly as an element name. [...] So, yes, a general XML/RDF parser would need to handle such cases. The spec gives EBNF for these constructs in section 2.2.2 Dan -- Daniel.Brickley@bristol.ac.uk Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/ University of Bristol, Bristol BS8 1TN, UK. phone:+44(0)117-9287096 xml-dev: A list for W3C XML Developers. 
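The third abbreviation Dan quotes from section 2.2.2 (a typed node standing in for a Description that carries a type property) can be illustrated with a small sketch. This is only an illustration, assuming Python's standard xml.etree.ElementTree module; the function name expand_typed_node is made up for the example and does not come from the RDF spec or from any RDF parser. It rewrites a typed node such as rdf:Bag back into the basic serialization form, which is roughly what a general XML/RDF parser would have to do internally.

```python
import xml.etree.ElementTree as ET

# Namespace name used by the RDF Model and Syntax Recommendation.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand_typed_node(node: ET.Element) -> ET.Element:
    """Rewrite a typed node as rdf:Description plus an explicit rdf:type.

    expand_typed_node is a hypothetical helper name for this sketch.
    """
    description_tag = f"{{{RDF_NS}}}Description"
    if node.tag == description_tag:
        return node  # already in the basic serialization form
    if not node.tag.startswith("{"):
        raise ValueError("a typed node must be namespace-qualified")

    # ElementTree stores qualified names as "{namespace-uri}local-name".
    namespace, _, local_name = node.tag[1:].partition("}")

    description = ET.Element(description_tag, dict(node.attrib))
    type_property = ET.SubElement(description, f"{{{RDF_NS}}}type")
    # The type is named by the namespace name followed by the local name.
    type_property.set(f"{{{RDF_NS}}}resource", namespace + local_name)
    description.extend(list(node))  # carry the other properties over unchanged
    return description
```

Because both forms map to the same model, a tool could accept either one and normalize to whichever it prefers.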
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Thu Apr 8 14:02:50 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:11:09 2004 Subject: RDF Question: about syntax of rdf container objects (Bag, Alt, Seq) In-Reply-To: <000201be81b6$166807a0$0200a8c0@kuantech1> Message-ID: Ah, in that case I slightly misunderstood the question in my posted-seconds-ago reply. The 'typedNode' abbreviated syntax is not, as Jeffrey points out, mandatory, so while the second syntax below is always legal, it's also always optional. IMHO it makes for more readable and in-spirit-of-XML markup... Dan On Thu, 8 Apr 1999, Jeffrey E. Sussna wrote: > No. RDF defines alternative syntax for particular abstract models. You may > use whichever syntax you like. The examples you included are equivalent. > This is part of both the flexibility and seeming difficulty at first glance > of RDF. You can express a particular abstract model multiple ways in XML. It > is important and helpful to remember that RDF is not XML. The RDF spec just > happens to specify how to concretely represent RDF in XML. > > Jeff > > > -----Original Message----- > > From: owner-xml-dev@ic.ac.uk > > [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > > Roger L. Costello > > Sent: Thursday, April 08, 1999 4:43 AM > > To: xml-dev@ic.ac.uk > > Cc: www-rdf-comments@w3.org; costello@mitre.org > > Subject: Re: RDF Question: about syntax of rdf container objects (Bag, > > Alt, Seq) > > > > > > Dan Brickley wrote: > > > > > > Having an rdf:type property pointing to rdf:Bag is just RDF's way of > > > telling us that the anonymous resource "is A" rdf:Bag. 
> > > > > > > Is it *always* the case that if an RDF model has a resource with a > > property rdf:type whose value is foo then rather than: > > > > > > > > [other properties] > > > > > > the syntax is: > > > > > > [other properties] > > > > > > Example: > > > > Rather than: > > > > > > > > [other properties] > > > > > > the syntax is: > > > > > > [other properties] > > > > > > /Roger > > > > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN > 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From nikita.ogievetsky at csfb.com Thu Apr 8 17:06:01 1999 From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita) Date: Mon Jun 7 17:11:09 2004 Subject: Megginson and XMLNews Message-ID: <9C998CDFE027D211B61300A0C9CF9AB4424742@SNYC11309> Is it done on purpose? If you are trying to see DTD in IE5 browser, it reports: The 'version' attribute is required at this location. Line 1, Position 7 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Apr 8 17:22:32 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:11:09 2004 Subject: SUMMARY: XML Validation Issues (was: several threads) Message-ID: <3.0.32.19990408082200.00cdacd0@pop.intergate.bc.ca> At 01:38 AM 4/7/99 +0200, Chris Lilley wrote: I mostly agree with Chris, but a couple of notes. >True, but the assertion was made that validation should never be >required because of the performance load (compared to parsing the dtd >including external subsets, but not validating). Which implies, if there >were negligible performance load, them whould validation be a desirable >thing? It still depends. In lots of scenarios, validation isn't worth even an extra 1% investment in time. In lots of scenarios, there's no DTD! Having said that, a guaranteed-lightweight validation protocol would be nice. >I didn't like it much either, but it seemed to be, on inspection, what >the XML spec said. >James Clark seemed to agree, which was a good sign. But I recently heard >Tim Bray and CMSQ say that no, it doesn't mean that at all What the spec itself says is more important than what any of us individuals say. I really don't think you can find justification in the spec for the notion that the existence of a (not to UI designers, provide two >separate icons for "validate" and "check wf" ) Yes! IE5 has a nice validation capability, but no way (that I've found) for the user to invoke it. Is there one? >Next question, should there be (in other words, is this something that >should be in the document instance). Good question; I can see both sides. 
But in fact, Chris, I think what's motivating you here is less a concern for forcing validation than a concern for forcing the use of the external DTD for entity declarations & attribute defaults and so on. Which is fine; but I think there are 2 separate questions here: - should a document be able to ask for validation - should a document be able to ask for guaranteed reading of all external entities Related but distinct. -Tim From simonstl at simonstl.com Thu Apr 8 17:54:13 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:11:09 2004 Subject: SUMMARY: XML Validation Issues (was: several threads) In-Reply-To: <3.0.32.19990408082200.00cdacd0@pop.intergate.bc.ca> Message-ID: <199904081554.LAA22691@hesketh.net> At 08:22 AM 4/8/99 -0700, Tim Bray wrote: >>(note to UI designers, provide two >>separate icons for "validate" and "check wf" ) > >Yes! IE5 has a nice validation capability, but no way (that I've >found) for the user to invoke it. Is there one? See http://msdn.microsoft.com/downloads/samples/internet/xml/xml_validator/default.asp. I don't know that it counts as 'user invocation' the way you meant, though. >Good question; I can see both sides. But in fact, Chris, I think what's >motivating you here is less a concern for forcing validation than a >concern for forcing the use of the external DTD for entity declarations >& attribute defaults and so on.
Which is fine; but I think there are >2 separate questions here: > > - should a document be able to ask for validation > - should a document be able to ask for guaranteed reading of all > external entities > >Related but distinct. -Tim And that's precisely why XML Processing Description Language (XPDL) separates them. See http://purl.oclc.org/NET/xpdl for details. It also provides a mechanism for making the readability of these features optional, when appropriate, though the default requires the resources to be read. Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Thu Apr 8 17:57:32 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:11:09 2004 Subject: Comments Appreciated on Magazine Based on XML/XSL Message-ID: Hello everyone, We've just done a 'soft' launch of an on-line magazine called World Link. It's 'soft' because not all of the data is in yet and we're not publicising it, but enough is done to start receiving comments - if you have time ;-) The site is at: http://www.worldlink.co.uk/ Our client is very keen to make proper use of this technology, so if anyone does have comments - don't hold back! (I may regret this :-) The structure is basically: 1. All data is stored in a hierarchical database. Here we have articles, issues, countries, people, companies, events and whatever. 2. Data is extracted from the database as required, as XML. 
There is no such thing as a 'document' in the normal sense; we create each XML document on the fly by simply extracting a node and all its children. Digging data out just requires the start node to be specified. Our new version implements this better with fragments and a small part of XQL. 3. The XML document is combined with XSL on the server. This was too slow in the previous IE incarnation so we evolved to taking 'snapshots' of the HTML. However, the release version seems faster, and some other changes we have made have increased the performance of the data extraction, so our new version of the site - still in development - actually lets the user combine the data on their browser or performs the combine on the fly. 4. World Link staff can make cross-references between data in the database - articles on the same theme, data entries on countries, and so on - using either our database tools or just typing an XML link syntax around the object. These links are extracted later to make ordinary HTML links. By allowing connections to be just XML we allow for other tools to come along that they could use on their data. 5. Linking to external sites goes through the database, so users create links using a keyword. It's a first attempt at out-of-line links, although we have a much neater version imminent, which uses transclusion. Any link under the heading 'External' on the right side is an OOLL and uses this technique. (Any comments Guy?) 6. Searches using the search field in the top right are traditional searches, where we simply search for the word you enter. e.g., Turkey will find both the country *and* the bird. 7. Searches using the 'Fact Finder' areas will search within the actual category. e.g., a search for Turkey within the 'Countries' area will only search for the country. (It will still find people, companies and articles - it just finds those that refer to the country not the bird.) 
At the moment this depends on connections being made in the database by World Link via our admin tool, so they are still being entered. If you want to try it out, the Jan/Feb 1999 issue has the most - select this issue from the drop-box and then select 'companies' from the fact finder. 8. The obvious advantage of this technique is to search for only what you want - for example the *country* Turkey, not the bird - but another advantage is that any article that connects to a database object inherits any AKA data from the object. For example, if in the entry for USA there are AKA entries of 'North America' and 'the States', any search for either of these values will yield the relevant article, even if those words don't actually appear in that article. 9. Whilst most of the site is currently a 'snapshot' of previously combined XML and XSL, the search results are actually exported as XML documents and then a stylesheet is applied to create the HTML on the fly. The next release will allow the user or other servers to receive this XML directly - to do with what they will! It's a bit messy at the moment but if you want a look at the crude stuff so far, stick 'debug=true' on the end of any query you've run. For example: http://www.worldlink.co.uk/Find/ObjectRef/67.htm?q1=&duration=THISISSUE& c1=@all&SH=0&objectType=Company&debug=true Sorry there's not a massive amount of direct XML to see, but the next release will expose everything. An interim addition will be an extra icon on each page to allow the XML version of the page to be viewed - probably next week. Still - all comments gratefully received. (If anyone thinks it might be useful I can make available a preview of the new site which has been completely written from scratch based on what we learned from the current site. It uses schemas, fragments and ... XLink!) Regards, Mark Birbeck Managing Director Intra Extra Digital Ltd. 
39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wunder at infoseek.com Thu Apr 8 18:04:23 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:11:09 2004 Subject: RDF Question: about syntax of rdf container objects (Bag, Alt, Seq) In-Reply-To: <000201be81b6$166807a0$0200a8c0@kuantech1> References: <370C9642.48EEA0B8@mitre.org> Message-ID: <3.0.5.32.19990408085233.00bc6da0@corp> At 04:51 AM 4/8/99 -0700, Jeffrey E. Sussna wrote: >No. RDF defines alternative syntax for particular abstract models. You may >use whichever syntax you like. The examples you included are equivalent. >This is part of both the flexibility and seeming difficulty at first glance >of RDF. It isn't a "seeming" difficulty, it is a real problem. Two syntaxes are much, much less useful than one. Having two or more ways to say the same thing (zip, jar, and cab for Java) is almost always a bad idea. The reason given for the compressed RDF syntax, "it's smaller", is never a good enough reason. Either use the small one, use the clear one, or make one that is small enough and clear enough. Specs are the wrong place to prevaricate. wunder -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Thu Apr 8 18:05:36 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:11:09 2004 Subject: SUMMARY: XML Validation Issues (was: several threads) References: <370A9AE7.BCE60202@w3.org> Message-ID: <370CD346.C7EC147D@jclark.com> Chris Lilley wrote: > > > > > you indicate that a validating parser is required > > There are two related but separate assertions that can be made > > 1) this document is valid > 2) this document needs a validating parser > > I didn't adequately distinguish these before, which was remis of me. > > > I don't like this (though evidently a number of people are assuming or > > advocating it). > > I didn't like it much either, but it seemed to be, on inspection, what > the XML spec said. > James Clark seemed to agree Where did you get that idea from? On XSL-List, I said in response to you: > Chris Lilley wrote: > > > If that understanding ( > then please point it out. > > That's probably about the best heuristic there is and would work most > (although not all) the time, but there's nothing in the spec to warrant > the claim that this is the intent of the spec, and there's certainly no > requirement in the spec that a validating parser implement this. > > In any case I'm not sure it would be a good idea for a *browser* > automatically to use such a heuristic. Given: > > > > ... > > I don't want to wait for the browser to read "doc.dtd" before it > displays the document to me. James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Thu Apr 8 18:34:18 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:11:09 2004 Subject: RDF Question: about syntax of rdf container objects (Bag, Alt, Seq) In-Reply-To: <3.0.5.32.19990408085233.00bc6da0@corp> Message-ID: [ Can we drop www-rdf-comments@w3.org from the cc: list now? ] On Thu, 8 Apr 1999, Walter Underwood wrote: > At 04:51 AM 4/8/99 -0700, Jeffrey E. Sussna wrote: > >No. RDF defines alternative syntax for particular abstract models. You may > >use whichever syntax you like. The examples you included are equivalent. > >This is part of both the flexibility and seeming difficulty at first glance > >of RDF. > > It isn't a "seeming" difficulty, it is a real problem. Two syntaxes > are much, much less useful than one. Having two or more ways to say > the same thing (zip, jar, and cab for Java) is almost always a bad > idea. The reason given for the compressed RDF syntax, "it's smaller", > is never a good enough reason. Either use the small one, use the clear > one, or make one that is small enough and clear enough. Specs are > the wrong place to prevaricate. It's not really two syntaxes, since all syntactic variants are part of the RDF syntax specification. (as a side thought: if someone proposed a way of mapping arbitrary XML content into the RDF directed-labelled-graph data model (eg. by interpreting schemas or through XSL) would people complain that this made RDF even more syntactically flexible?) 
RDF is explicitly in the business of providing a common data model across multiple applications, and this includes PICS labels, embedded metadata in images and other obscure ways of shipping around statements about the properties of various Web objects. So multiple ways of discovering facts to store using the RDF data model is pretty much inevitable. A web indexing application for example might want to store RDF summaries of PDF files, PICS labels, HTML and XML docs, JPEG, GIF and PNG images. For there to be a single 1:1 mapping of RDF into a concrete syntax would pretty much squish the point of having it in first place. All that said, there is a better justification for RDF's syntactic flexibility than that of size: we need to be able to shove this stuff into the heads of normal Web documents without having it leak out in older browsers, hence the string-properties-as-attributes variant of the syntax. Dan -- Daniel.Brickley@bristol.ac.uk Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/ University of Bristol, Bristol BS8 1TN, UK. phone:+44(0)117-9287096 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Apr 8 19:14:03 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:11:09 2004 Subject: Megginson and XMLNews In-Reply-To: <9C998CDFE027D211B61300A0C9CF9AB4424742@SNYC11309> References: <9C998CDFE027D211B61300A0C9CF9AB4424742@SNYC11309> Message-ID: <14092.58216.512741.280193@localhost.localdomain> Ogievetsky, Nikita writes: > Is it done on purpose? 
> If you are trying to see DTD in IE5 browser, it reports: > > The 'version' attribute is required at this location. Line 1, Position 7 > > This is an interesting problem -- MSIE must consider the DTD files to be of type text/xml, but they are not XML documents. If they had an XML declaration at the top (rather than just an encoding declaration), they would be invalid as part of a complete document. I recommend downloading the DTDs and viewing them in a text editor. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Thu Apr 8 19:22:17 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:11:09 2004 Subject: SUMMARY: XML Validation Issues (was: several threads) Message-ID: <003c01be81dc$4c1e6080$26f96d8c@NT.JELLIFFE.COM.AU> From: Tim Bray > ..but I think there are 2 separate questions here: > > - should a document be able to ask for validation > - should a document be able to ask for guaranteed reading of all > external entities I think these questions in turn boil down to that the XML spec's section 2.9 "Standalone Document Declarations" says that it is a "validity constraint" if standalone="no" and the markup declarations contain certain kinds of data, but instead it should be a kind of "well-formedness constraint"! 
This in practise would create three classes of processors: 1) XML parsers which cannot accept documents which have standalone="yes" and which have markup declarations in the internal prolog (the URI on the DOCTYPE declaration reliably names the document type); 2) XML parsers which only accept documents which are standalone="yes", whether it then validates content models or not; 3) XML parsers which accept standalone="yes" or "no", whether it then validates or not. In other words, if standalone="no", then a parser must read all the external entities and handle the particular markup declarations; if it cannot, it should spit the Draconian dummy. This does not mean it has to validate against content models, however. If standalone="yes", then a parser should read the internal markup declarations and use them (for the particular cases mentioned: to get default attribute values, to get entity values). Tim's comment in his annotations presages this approach. If people are finding this a big problem, can XML be tightened up ASAP to correct it? Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ldodds at ingenta.com Thu Apr 8 19:25:37 1999 From: ldodds at ingenta.com (Leigh Dodds) Date: Mon Jun 7 17:11:09 2004 Subject: XSL as XML transformation In-Reply-To: <3.0.32.19990408082200.00cdacd0@pop.intergate.bc.ca> Message-ID: <000101be81d5$f1d15e80$ab20268a@pc-lrd.bath.ac.uk> Hi, I've been reading through the XSL spec and I noticed that it mentions that XSL could be used as an XML transformation mechanism as the result tree need not use the formatting vocabulary. 
Does this mean that it's possible to use XSL to do transformation from one XML document type to another? If so, are there any gotchas that I should be concerned with? I've been casting about for a decent mechanism and was all set today to begin designing a custom system to do my job, albeit with many of the rules defined by an(other) XML document type. Then I read the XSL spec again and thought that perhaps I could get the transformation 'for free' as much of the transformation stuff I need to do (element renaming, attribute/element conversions, etc.) seems to be possible. I'm aware that architectural forms offer some of this (actually I'm doing a many DTD, SGML -> single DTD XML conversion, so originally looked at SGML architectures) but I'm unclear as to how contentious or well defined this is for XML (as opposed to SGML). I'd really appreciate any comments people care to make. I could provide some more details if necessary. Ideally I want a relatively low maintenance solution. I'm prepared to do a custom implementation but don't want to miss the boat and find out I've reinvented the wheel (apologies for mixed metaphor ;). L. From simpson at polaris.net Thu Apr 8 19:51:10 1999 From: simpson at polaris.net (John E.
Simpson) Date: Mon Jun 7 17:11:09 2004 Subject: XSL as XML transformation Message-ID: <3.0.32.19990408134509.00807940@polaris.net> At 04:39 PM 4/8/1999 +0100, Leigh Dodds wrote: >I've been reading through the XSL spec and I noticed that it >mentions that XSL could be used as an XML transformation mechanism >as the result tree need not use the formatting vocabulary. > >Does this mean that it's possible to use XSL to do transformation >from one XML document type to another? If so are there any gotchas >that I should be concerned with? Yes, you can do that with XSL. Gotchas (other than understanding everything you can about the structures of both the source and the result tree :), well, just make sure your namespace declarations ensure that the result tree's elements won't be prefixed. (That is, if your result tree's DTD -- if one -- includes a tag, you don't want the transformation to emit something like .) Oh, and use James Clark's xt for the transformation. Also, I think SAXON can output 2 (or more?) separate result trees, so you might want to take a look at its XSL capabilities as well. ============================================================= John E. Simpson | It's no disgrace t'be poor, simpson@polaris.net | but it might as well be. | -- "Kin" Hubbard xml-dev: A list for W3C XML Developers.
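For a feel of what the source-tree to result-tree transformation discussed in this thread amounts to (element renaming and attribute-to-element conversion), here is a rough sketch. It is written in Python with the standard xml.etree.ElementTree module, not in XSL, and it is not how xt or SAXON work; the element names, the RENAMES table and the transform function are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source element names to result element names.
RENAMES = {"story": "article", "hl": "title"}
# Hypothetical attributes to turn into child elements of the same name.
ATTRIBUTES_TO_ELEMENTS = {"author"}


def transform(source: ET.Element) -> ET.Element:
    """Build a new result tree; the source tree is left untouched."""
    result = ET.Element(RENAMES.get(source.tag, source.tag))
    for name, value in source.attrib.items():
        if name in ATTRIBUTES_TO_ELEMENTS:
            # Attribute/element conversion: the value becomes element content.
            ET.SubElement(result, name).text = value
        else:
            result.set(name, value)
    result.text = source.text
    result.tail = source.tail
    for child in source:
        result.append(transform(child))
    return result
```

The result elements here carry no namespace, so they serialize without a prefix, which is the property John's gotcha is about. A stylesheet expresses the same mapping declaratively, which is presumably where the lower maintenance cost would come from.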

From db at eng.sun.com Thu Apr 8 20:50:31 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:10 2004
Subject: Megginson and XMLNews
References: <9C998CDFE027D211B61300A0C9CF9AB4424742@SNYC11309> <14092.58216.512741.280193@localhost.localdomain>
Message-ID: <370CF949.DF471BD0@eng.sun.com>

> > If you are trying to see DTD in IE5 browser, it reports:
> >
> > The 'version' attribute is required at this location. Line 1, Position 7
> >
> > [markup lost in archive]
>
> This is an interesting problem -- MSIE must consider the DTD files to
> be of type text/xml, but they are not XML documents. If they had an
> XML declaration at the top (rather than just an encoding declaration),
> they would be invalid as part of a complete document.

Actually, this looks like another IE5 XML bug ...

 - 4.3.2 of the XML spec says an external PE (like a DTD fragment!)
   may have an optional "Text Declaration".
 - 4.3.1 says that the text declaration may have (or omit) a version
 - 2.8 explains the rule for the XML declaration

It's always safe to have an external entity with this at the front:

    [declaration lost in archive]

The "version" may be dropped if it _is NOT_ the document entity.
The "encoding" may be dropped if it _is_ the document entity.
Only the document entity may have a "standalone" attribute.

- Dave
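
The version/encoding rules Dave lists can be checked with a small sketch like the one below. It assumes the SAX parser that ships with the JDK; the entity name, the placeholder system identifier (never fetched, because the resolver supplies the content) and the helper name isWellFormed are invented for illustration. The same encoding-only declaration is tried once at the front of an external entity and once at the front of the document entity.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class TextDeclarationCheck {
    // A declaration with an encoding but no version: allowed as a text
    // declaration (external entity), not as an XML declaration.
    static final String ENCODING_ONLY = "<?xml encoding=\"UTF-8\"?>";

    // Document entity that references one external parsed entity.
    // The system identifier is a placeholder; the resolver answers for it.
    static final String HOST_DOCUMENT =
        "<?xml version=\"1.0\"?>"
      + "<!DOCTYPE doc [<!ENTITY part SYSTEM \"http://example.invalid/part.ent\">]>"
      + "<doc>&part;</doc>";

    /**
     * Returns true if documentEntity parses without a fatal error,
     * serving externalEntity for every external entity it references.
     */
    static boolean isWellFormed(String documentEntity, final String externalEntity) {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                return new InputSource(new StringReader(externalEntity));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(documentEntity)), handler);
            return true;
        } catch (SAXException e) {
            return false; // DefaultHandler rethrows fatal errors
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Encoding-only declaration at the front of an external entity.
        System.out.println(isWellFormed(HOST_DOCUMENT, ENCODING_ONLY + "<child/>"));
        // The same declaration at the front of the document entity.
        System.out.println(isWellFormed(ENCODING_ONLY + "<doc/>", ""));
    }
}
```

The second case corresponds to what a browser does when it is asked to display a DTD file as if it were a document.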

From Michael.Kay at icl.com Thu Apr 8 21:04:48 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:11:10 2004
Subject: XSL as XML transformation
Message-ID: <93CB64052F94D211BC5D0010A80013310EB3F7@WWMESS3.172.19.125.2>

> Does this mean that its possible to use XSL to do transformation
> from one XML document type to another?

Yes.

> If so are there any gotchas that I should be concerned with?

Yes.

1. There must be a one-to-one mapping of input documents to output documents.
2. There are no facilities for algorithmic transformations of attributes or element content.
3. There are no facilities for adding "grouping" nodes (e.g. changing [remainder lost in archive]

Mike Kay
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990408/d63b7e25/attachment.htm

From ricko at allette.com.au Thu Apr 8 21:17:33 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:11:10 2004
Subject: XSL as XML transformation
Message-ID: <004d01be81ec$64332150$26f96d8c@NT.JELLIFFE.COM.AU>

From: John E. Simpson
>At 04:39 PM 4/8/1999 +0100, Leigh Dodds wrote:
>>Does this mean that its possible to use XSL to do transformation
>>from one XML document type to another? If so are there any gotchas
>>that I should be concerned with?

Another gotcha is (as far as I can figure out) that you cannot do transformations like

    [markup lost in archive]

[given a source content model for x of (y, z, w)] to

    [markup lost in archive]

In other words, XSL works best if you are using data in which all grouping of element types are explicitly done using tags. This is a major pain: I have had to alter the QAML (FAQ) DTD for it. I would be happy to be wrong: this is probably the wrong list to ask.

Rick Jelliffe

From db at eng.sun.com Thu Apr 8 22:18:47 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:10 2004
Subject: Megginson and XMLNews
References: <9C998CDFE027D211B61300A0C9CF9AB4424742@SNYC11309> <14092.58216.512741.280193@localhost.localdomain> <370CF949.DF471BD0@eng.sun.com>
Message-ID: <370D0B87.D737A706@eng.sun.com>

Dawns on me that the real issue here is probably that IE5 doesn't let you browse DTD files directly. Reasonable, and if so, not at all a bug.
As David noted, (different words) they may be XML, but they're not documents and when you try to view an XML "file" the browser must assume it's a document. Ergo the diagnostic.

If the text declaration looked like an XML declaration (with version) you'd get an error because the first declaration was for an element or somesuch, rather than a document!

- Dave

p.s. XMLNews ... cool stuff! I'll look at it in more detail soon.

> > > If you are trying to see DTD in IE5 browser, it reports:
> > >
> > > The 'version' attribute is required at this location. Line 1, Position 7
> > >
> > > [markup lost in archive]
> >
> > This is an interesting problem -- MSIE must consider the DTD files to
> > be of type text/xml, but they are not XML documents. If they had an
> > XML declaration at the top (rather than just an encoding declaration),
> > they would be invalid as part of a complete document.
>
> Actually, this looks like another IE5 XML bug ...
>
> - 4.3.2 of the XML spec says an external PE (like a DTD fragment!)
>   may have an optional "Text Declaration".
> - 4.3.1 says that the text declaration may have (or omit) a version
> - 2.8 explains the rule for the XML declaration
>
> It's always safe to have an external entity with this at the front:
>
> [declaration lost in archive]
>
> The "version" may be dropped if it _is NOT_ the document entity.
> The "encoding" may be dropped if it _is_ the document entity.
> Only the document entity may have a "standalone" attribute.

From clovett at microsoft.com Thu Apr 8 23:04:58 1999
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun 7 17:11:10 2004
Subject: Megginson and XMLNews
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C0F3626F8@RED-MSG-56>

That is exactly right. The real problem here is that the www.xmlnews.org server is serving up this DTD with the mime type "text/xml" even though a DTD is strictly NOT an XML document. If it served it up as plain text then IE5 would not trigger the XML Mime Viewer and this client side error would be avoided.

I have tested that a DTD (or any external entity) containing [declaration lost in archive] loads fine when it is loaded as a DTD and not directly as an XML document.

Chris Lovett.

From Marc.McDonald at Design-Intelligence.com Thu Apr 8 23:54:39 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun 7 17:11:10 2004
Subject: SUMMARY: XML Validation Issues (was: several threads)
Message-ID:

I would ask what is the reason for a document needing validation parsing. I see 3 reasons:

1. To include entities and attribute defaults that are external to the document
2. To indicate the document should match a given structure (the DTD)
3. To describe the structure the document matches

On the second point, it can be argued that any document that declares its DTD met that structure when it was created. In other words, you don't expect such a document to fail validation. So why go through validation at all?

On the third point, consider the cases where a DTD can't completely express the constraints on the document. Or where the application that produced the document and the one that consumes it both implement the document structure in code rather than a DTD. Such an application may use a well-formed parser to read the document and then apply the constraints via explicit code.

For these reasons I would:

1. Allow a document to indicate the structure that it meets, which can be a totally abstract URI and/or a DTD.
2. Require well-formed parsers to handle attribute defaults, entities, and external files but not element declarations.
3. Have a validating parser add processing of element declarations and full attribute processing.
4. Let the application (the user of the parser) select whether the parser will validate; it can in fact substitute its own selected DTD.

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com

----------
From: Simon St.Laurent [SMTP:simonstl@simonstl.com]
Sent: Thursday, April 08, 1999 8:57 AM
To: XML-Dev Mailing list
Subject: Re: SUMMARY: XML Validation Issues (was: several threads)

At 08:22 AM 4/8/99 -0700, Tim Bray wrote:
>>(note to UI designers, provide two
>>separate icons for "validate" and "check wf" )
>
>Yes! IE5 has a nice validation capability, but no way (that I've
>found) for the user to invoke it. Is there one?

See http://msdn.microsoft.com/downloads/samples/internet/xml/xml_validator/default.asp. I don't know that it counts as 'user invocation' the way you meant, though.

>Good question; I can see both sides.
>But in fact, Chris, I think what's
>motivating you here is less a concern for forcing validation than a
>concern for forcing the use of the external DTD for entity declarations
>& attribute defaults and so on. Which is fine; but I think there are
>2 separate questions here:
>
> - should a document be able to ask for validation
> - should a document be able to ask for guaranteed reading of all
>   external entities
>
>Related but distinct. -Tim

And that's precisely why XML Processing Description Language (XPDL) separates them. See http://purl.oclc.org/NET/xpdl for details. It also provides a mechanism for making the readability of these features optional, when appropriate, though the default requires the resources to be read.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

From david at megginson.com Fri Apr 9 01:55:26 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:10 2004
Subject: Megginson and XMLNews
In-Reply-To: <2F2DC5CE035DD1118C8E00805FFE354C0F3626F8@RED-MSG-56>
References: <2F2DC5CE035DD1118C8E00805FFE354C0F3626F8@RED-MSG-56>
Message-ID: <14093.16787.371710.589062@localhost.localdomain>

Chris Lovett writes:

> That is exactly right. The real problem here is that the
> www.xmlnews.org server is serving up this DTD with the mime type
> "text/xml" even though a DTD is strictly NOT an XML document. If
> it served it up as plain text then IE5 would not trigger the XML
> Mime Viewer and this client side error would be avoided.

Actually, www.xmlnews.org is serving it up as text/plain -- MSIE must be acting on the .dtd extension.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From andyclar at us.ibm.com Fri Apr 9 02:18:49 1999
From: andyclar at us.ibm.com (andyclar@us.ibm.com)
Date: Mon Jun 7 17:11:10 2004
Subject: Refactoring SAX 1.0
Message-ID: <8725674E.000198F0.00@d53mta06h.boulder.ibm.com>

David,

David Brownell wrote:
> I'm still unclear on what you're suggesting ... suspect I'm
> not alone in it!
> Do you have an example, maybe?
>
> Remember that compatibility is a ground rule for everyone.

Yes, I've been thinking a lot about this recently and have come up with an example of how SAX 1.0 could be refactored.

First, I looked at the SAX interfaces and classes that are of general use to all parsers. My list includes the following:

    org.xml.sax.EntityResolver
    org.xml.sax.ErrorHandler
    org.xml.sax.InputSource
    org.xml.sax.Locator
    org.xml.sax.Parser
    org.xml.sax.SAXException
    org.xml.sax.SAXParseException

The EntityResolver, InputSource, and Locator objects can be used without modification. Because general error handling is useful to all parsers, I would rename SAXException to XMLException and do the same for SAXParseException. These name changes would affect the ErrorHandler callback methods.

I would keep the Parser interface the same except for removing the callback registration methods (setDTDHandler and setDocumentHandler) that are specific to a stream-based parser. To enable the stream-based parsing functionality, I would add a SAXParser interface that extends the new Parser interface to add the callback registration methods.

(I'm not including code examples of what these new interfaces and classes look like in order to keep this posting short. But I could post further if there was interest.)

I realize that compatibility with SAX 1.0 is very important to retain. I would suggest keeping the org.xml.sax and org.xml.sax.helpers packages the same -- don't change them at all. Nothing would be moved or renamed. In this way, users of SAX 1.0 would not have to change their code.

The refactored SAX could be moved to new packages. For example, org.xml and org.xml.sax2. The org.xml package would contain all of the general purpose interfaces and objects and org.xml.sax2 would contain the specific SAX objects. The latter would include all of the SAX2 extensions being discussed now on the mailing list, as well.
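
Since the posting leaves the code out, here is one way the proposed split might look. This is a guess at the shape only, not Andy's actual design: the nested types stand in for the suggested org.xml and org.xml.sax2 packages so that the sketch fits in one file, XMLParseException is the assumed new name for SAXParseException, and the declares helper exists only to show where the callback registration methods end up. EntityResolver, InputSource, DTDHandler and DocumentHandler are the unmodified SAX 1.0 types.

```java
import java.io.IOException;
import java.lang.reflect.Method;
import java.util.Locale;
import org.xml.sax.DTDHandler;
import org.xml.sax.DocumentHandler;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

@SuppressWarnings({"deprecation", "serial"}) // DocumentHandler is the SAX 1.0 callback interface
class RefactoredSaxSketch {

    // --- would live in the proposed general-purpose package org.xml ---

    /** Renamed SAXException (hypothetical). */
    static class XMLException extends Exception {
        XMLException(String message) { super(message); }
    }

    /** Renamed SAXParseException (hypothetical name), carrying a position. */
    static class XMLParseException extends XMLException {
        final int line;
        final int column;
        XMLParseException(String message, int line, int column) {
            super(message);
            this.line = line;
            this.column = column;
        }
    }

    /** ErrorHandler with its callbacks retargeted at the renamed exceptions. */
    interface ErrorHandler {
        void warning(XMLParseException e) throws XMLException;
        void error(XMLParseException e) throws XMLException;
        void fatalError(XMLParseException e) throws XMLException;
    }

    /** SAX 1.0 Parser minus the stream-specific callback registration. */
    interface Parser {
        void setLocale(Locale locale) throws XMLException;
        void setEntityResolver(EntityResolver resolver);
        void setErrorHandler(ErrorHandler handler);
        void parse(InputSource source) throws XMLException, IOException;
        void parse(String systemId) throws XMLException, IOException;
    }

    // --- would live in the proposed package org.xml.sax2 ---

    /** Adds the stream-based callbacks back on top of the general Parser. */
    interface SAXParser extends Parser {
        void setDTDHandler(DTDHandler handler);
        void setDocumentHandler(DocumentHandler handler);
    }

    /** True if the type exposes a public method with the given name. */
    static boolean declares(Class<?> type, String methodName) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Parser.setDocumentHandler: "
            + declares(Parser.class, "setDocumentHandler"));
        System.out.println("SAXParser.setDocumentHandler: "
            + declares(SAXParser.class, "setDocumentHandler"));
    }
}
```

A tree-building parser could then implement only Parser, while existing SAX 1.0 code keeps compiling against the untouched org.xml.sax package.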

--
Andy Clark * IBM, JTC - Silicon Valley * andyclar@us.ibm.com

From harvey at eccnet.eccnet.com Fri Apr 9 02:49:01 1999
From: harvey at eccnet.eccnet.com (Betty Harvey)
Date: Mon Jun 7 17:11:10 2004
Subject: Graphics, Namespaces and DTDs & IE5
Message-ID: <370D4D24.1324310C@eccnet.com>

I have several questions so I am going to 'glump' them together in one:

1. Graphic entities do not work in IE 5.0. Does anyone know of a workaround besides XSL?

2. The following works in IE 5.0:

    [markup lost in archive] This is a test

When I add a DTD in the document declaration subset, I get an error:

    Reference to undeclared namespace prefix: 'html'. Line 17, Position 1
    ^

This is the document:

    [markup lost in archive] ]> This is a test

If I take all references to Namespaces out of the document there are no parsing errors, but of course no graphics. Am I missing something? I can't find any references to the DTD in the Namespace spec.

TIA!!

Betty

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Betty Harvey                         | Phone: 301-540-8251 FAX: 4268
Electronic Commerce Connection, Inc. |
13017 Wisteria Drive, P.O. Box 333   |
Germantown, Md. 20874                |
harvey@eccnet.com                    | Washington,DC SGML/XML Users Grp
URL: http://www.eccnet.com           | http://www.eccnet.com/sgmlug/
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/

From jtauber at jtauber.com Fri Apr 9 03:21:07 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:11:10 2004
Subject: State Machines; IBM's XMI?
References: <7847B57C7C96D2119DBE00A0C96F64B6206E19@cen1.cen.com>
Message-ID: <014c01be8226$bb60bd40$0300000a@cygnus.uwa.edu.au>

> Does anyone know of efforts to represent state machines (FSA) in XML? From
> their DTD, it would seem that IBM's XMI (XML Metadata Interchange) does, but
> a quick search of their 2 PDF docs didn't reveal "state machine", just
> "state".

I was playing around with FSAs in XML a little while ago (actually, all sorts of things: recognizers, generators, transducers; finite-state, recursive).

My initial work was using the DTD:

    [DTD lost in archive]

I considered using an NMTOKEN attribute for the label of the arc, ie:

    [markup lost in archive]

Another possibility would have been to "declare" labels:

    [markup lost in archive]

If people have comments on this, I'm more than willing to put something more formal together.

James

From clovett at microsoft.com Fri Apr 9 03:41:22 1999
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun 7 17:11:10 2004
Subject: Megginson and XMLNews
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C0F362704@RED-MSG-56>

Oh, I bet the [remainder of sentence lost in archive]

> That is exactly right. The real problem here is that the
> www.xmlnews.org server is serving up this DTD with the mime type
> "text/xml" even though a DTD is strictly NOT an XML document. If
> it served it up as plain text then IE5 would not trigger the XML
> Mime Viewer and this client side error would be avoided.

Actually, www.xmlnews.org is serving it up as text/plain -- MSIE must be acting on the .dtd extension.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From jtauber at jtauber.com Fri Apr 9 03:52:00 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:11:10 2004
Subject: PITarget uniqueness
References: <001701be8151$451f3600$7ac2c6c3@uplanet.com>
Message-ID: <01df01be822b$0fd3dca0$0300000a@cygnus.uwa.edu.au>

> How do I guarantee that the target name of my Processing Instructions (PI)
> are unique? It's probably out of the scope of the XML spec. to specify, but
> are there any general guidelines for naming PIs?

PI targets can be associated with a URI using the NOTATION mechanism. This is so much like the namespace mechanism that I'd really like it if the two were merged and the PI target made arbitrary.

My FOP application (0.6.0 out RSN) reads PIs with target "FOP" but that could of course clash with "Fred's Outstanding Processor" or something. I would prefer to have FOP read PIs with a target that's a notation declared with the URI "http://www.jtauber.com/fop" but that would require the input to FOP having a DTD, which isn't something most (any?) XSL engines will do.

James

--
James Tauber / jtauber@jtauber.com / www.jtauber.com
XML Standards and Product Coordinator
HarvestRoad Communications / www.harvestroad.com.au
Full-day XML Tutorial @ WWW8 : http://www8.org/
Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net
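
The NOTATION association James describes can be sketched with the JDK's SAX parser, which reports notation declarations through DTDHandler. This is an illustration only: the PI data ("page-break"), the second PI, and the helper name resolveTargets are invented, and nothing here reflects how FOP itself reads its input. The point is that a processor could key on the notation's URI instead of the bare target name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class NotationTargets {
    // Internal subset declares the PI target FOP as a notation with a URI.
    // The PI contents are hypothetical.
    static final String DOCUMENT =
        "<!DOCTYPE doc ["
      + "<!NOTATION FOP SYSTEM \"http://www.jtauber.com/fop\">"
      + "]>"
      + "<doc><?FOP page-break?><?other ignored?></doc>";

    /**
     * Maps each PI target that names a declared notation to that
     * notation's system identifier; undeclared targets are skipped.
     */
    static Map<String, String> resolveTargets(String xml) {
        final Map<String, String> notations = new LinkedHashMap<>();
        final Map<String, String> resolved = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                notations.put(name, systemId);
            }
            @Override
            public void processingInstruction(String target, String data) {
                String uri = notations.get(target);
                if (uri != null) {
                    resolved.put(target, uri);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return resolved;
    }

    public static void main(String[] args) {
        System.out.println(resolveTargets(DOCUMENT));
    }
}
```

As the message notes, this only works when the input carries a DTD (here an internal subset) for the parser to read the notation from.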

From jtauber at jtauber.com Fri Apr 9 03:59:33 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:11:11 2004
Subject: Megginson and XMLNews
References: <3.0.32.19990407143650.00cdd450@pop.intergate.bc.ca>
Message-ID: <027701be822c$202c5220$0300000a@cygnus.uwa.edu.au>

David Megginson:
> >You know, this isn't such a bad idea -- I wonder if OASIS would allow
> >Robin's work to be sent out as a free newsfeed, like PR Newswire? If
> >they would, I will personally (here and in public view) volunteer to
> >write, document, support, and maintain production-grade Perl scripts
> >to add the appropriate XMLNews markup.

Tim Bray:
> And I will volunteer staging & infrastructure to be provided by XML.com,
> should such a feed become available.

And I volunteer likewise for XMLINFO.com.

Having multiple sites showing different views of the same information is a nice application of XML to demo.

James

From martind at netfolder.com Fri Apr 9 04:50:22 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:11 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
In-Reply-To:
Message-ID: <000801be822d$ca6c7fc0$62978bcf@total.net>

Hi Mark,

1. All data is stored in a hierarchical database. Here we have articles, issues, countries, people, companies, events and whatever.

2. Data is extracted from the database as required, as XML. There is no such thing as a 'document' in the normal sense; we create each XML document on the fly by simply extracting a node and all its children. Digging data out just requires the start node to be specified. Our new version implements this better with fragments and a small part of XQL.

To better understand the process: do I understand you correctly? The process is:

a) Convert the hierarchical database data into an XML document (i.e. text)
b) Process this document (i.e. text) with MSXML and an XSL style sheet
c) Send the produced HTML document to the client.

Why do you convert the hierarchical database into XML? Is it because the hierarchical database does not have a DOM interface? Just curious to learn. Thank you for sharing your experience.

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

From mark_tracey at xtra.co.nz Fri Apr 9 06:45:27 1999
From: mark_tracey at xtra.co.nz (Mark Wilson)
Date: Mon Jun 7 17:11:11 2004
Subject: XMLTREE - real XML websites
Message-ID: <370D8401.4118DBA7@xtra.co.nz>

What you are looking for is http://www.xmltree.com

Cheers, Mark.
VBXML working group - http://209.143.139.201/

in reply to:

From: AnHai Doan
Date: Wed, 7 Apr 1999 19:43:16 -0700
Subject: [HELP] Finding XML data on the Web

Hi, I'm doing a research project on data integration, and need a lot of XML data, the kind we would have seen in a typical real-world application. I have searched the Web but found very few XML documents, which are either for very small applications, or are interesting, but irrelevant (such as Shakespeare work being wrapped in XML). It seems like right now there is no "real" data in XML format that are available on the Web.

From james at xmltree.com Fri Apr 9 10:28:17 1999
From: james at xmltree.com (james@xmlTree.com)
Date: Mon Jun 7 17:11:11 2004
Subject: 'real' XML data
Message-ID: <008601be8262$d4b80c30$0400a8c0@fourleaf.com>

AnHai

> It seems like
> right now there is no "real" data in XML format that are
> available on the Web.
>
> If you know of any source of significant XML data on the
> Web, could you please give me some pointers?

I am putting together a directory of XML content providers - so far only 60 or so listed but they include some interesting examples - astronomy, oceanography, a newswire service etc. All are providing 'real' data.

If anyone knows of other XML content providers, I would love to know. Also, the site is in Beta at the moment - all feedback is gratefully received : )

Best regards

James Carlyle
james@xmltree.com
Wavefront Limited
70 Acton Street
London WC1X 9NB UK
(44) 171 813 0665
www.xmltree.com - directory of XML content on the web

From kent at trl.ibm.co.jp Fri Apr 9 11:23:27 1999
From: kent at trl.ibm.co.jp (TAMURA Kent)
Date: Mon Jun 7 17:11:11 2004
Subject: XML Torture Test: Parsers Fail
In-Reply-To: Elliotte Rusty Harold's message of "Wed, 7 Apr 1999 15:45:19 -0400"
References:
Message-ID: <199904090921.SAA23926@ns.trl.ibm.com>

In message "Re: XML Torture Test: Parsers Fail" on 99/04/07, Elliotte Rusty Harold writes:
> No, I just checked and that still fails in exactly the same way using
> XJParse 1.1.14. For what it's worth this is on Windows NT. It would not be
> out of the question that there are some implicit Unix assumptions in the
> ...

I have found a bug about external entities in XML4J 1.1.16 and fixed it.

--
TAMURA, Kent @ Tokyo Research Laboratory, IBM Japan

From richard at cogsci.ed.ac.uk Fri Apr 9 14:19:47 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:11:11 2004
Subject: Check your namespaced documents
Message-ID: <199904091219.NAA22506@stevenson.cogsci.ed.ac.uk>

I am adding namespace support to RXP, and have added an option to my XML-checker page to include namespace processing. This causes the constraints listed in the namespaces recommendation to be checked.
Please try it out and let me know of any problems. The URL is:

    http://www.cogsci.ed.ac.uk/~richard/xml-check.html

-- Richard

From simpson at polaris.net Fri Apr 9 14:57:53 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun 7 17:11:11 2004
Subject: Check your namespaced documents
Message-ID: <3.0.32.19990409085134.007f0830@polaris.net>

At 01:19 PM 4/9/1999 +0100, Richard Tobin wrote:
>I am adding namespace support to RXP, and have added an option to
>my XML-checker page to include namespace processing. This causes the
>constraints listed in the namespaces recommendation to be checked.

Tried it out on a couple of docs (including an XSL stylesheet) and it appears to behave satisfactorily. (Which as usual may speak more to the simplicity of my test cases than to the rigor of the product. :)

One trivial suggestion: You might want to include a usage note on the page, to the effect that validating and namespace-checking are mutually exclusive options. Or change them from checkboxes to radio buttons.

=============================================================
John E. Simpson          | It's no disgrace t'be poor,
simpson@polaris.net      | but it might as well be.
                         |            -- "Kin" Hubbard
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From boblyons at unidex.com Fri Apr 9 15:14:14 1999 From: boblyons at unidex.com (Robert C. Lyons) Date: Mon Jun 7 17:11:11 2004 Subject: ANNOUNCE: XML Convert 1.0 Message-ID: <01BE8268.8469ABC0@cc398234-a.etntwn1.nj.home.com> XML Convert 1.0 is a Java application that uses XFlat schemas to convert flat files into XML. XFlat is an XML language for defining flat file schemas. XML Convert uses an XFlat schema to parse and validate the flat file, and to produce the XML output. XML Convert supports a wide variety of flat file formats, including CSV, fixed length records and fields, multiple record types, groups of records, nested groups, etc. You may download XML Convert 1.0 for free at: http://www.unidex.com/download.htm Please send any comments or questions to mailto:boblyons@unidex.com. Thanks. Bob ------ Bob Lyons EC Consultant Unidex Inc. 1-732-975-9877 boblyons@unidex.com http://www.unidex.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Fri Apr 9 15:59:07 1999 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:11:11 2004 Subject: problem with IE5 Message-ID: <199904091358.OAA25479@stevenson.cogsci.ed.ac.uk> Betty Harvey sent me mail about a document which was accepted by RXP but rejected by IE5. Here is a small example which shows the problem: ]> It produces this error in IE5: Reference to undeclared namespace prefix: 'foo'. Line 6, Position 1 It doesn't make any difference if I put a namespace declaration for foo on the test element. It looks as if IE5 somehow expects namespace prefixes in the DTD to be declared. Can anyone explain this? -- Richard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Apr 9 17:41:23 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:11 2004 Subject: IE5.0 does not conform to RFC2376 References: <199904050220.AA00169@archlute.apsdc.ksp.fujixerox.co.jp> <370A0194.C175B421@w3.org> Message-ID: <370E1F91.3CEC8C95@locke.ccil.org> Chris Lilley wrote: > However, I will point out that it is the consensus of the XML 1.0 > Recommendation that I am respecting - and that the RFC does not, by > altering the meaning of the default encoding. 
It could have been
> harmonised with the XML REC; it was not.

I do not understand this. Appendix F plainly (though non-normatively) says:

# If an XML entity is delivered with a MIME type of text/xml, then
# the charset parameter on the MIME type determines the character
# encoding method; all other heuristics and sources of information
# [including, by implication, the encoding declaration]
# are solely for error recovery.

So saith the RFC as well, and indeed the Rec proclaims that its rules are only recommendations, and that the RFC controls. But in fact they do not conflict.

> Redundancy can be good; a
> charset parameter and an XML encoding declaration that say the same
> thing and work the same way, which is what I was suggesting, is good.

Yes, indeed. Nevertheless, the charset parameter has one advantage over the encoding declaration: it is guaranteed by MIME to be in ASCII, and thus always readable. A document with a Content-Type of "text/xml;charset=cp-ebcdic-us" can be affirmatively rejected by a client that does not understand EBCDIC, whereas a client which has only the encoding declaration may *suppose* that the document is EBCDIC, based on the Appendix F heuristics, but cannot *know*.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From cowan at locke.ccil.org Fri Apr 9 18:05:21 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:11 2004
Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376)
References: <002901be802d$24f65680$27f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <370E2517.6E59D3AF@locke.ccil.org>

Rick Jelliffe wrote:
> Given that an XML processor may transcode the document without knowing
> the meanings of the elements (i.e., that the meta tag means something),
> the XML encoding has to have priority over the HTML meta tag value. And
> given that proxies can transcode text/* files without knowing what
> kind of text it is (i.e., that it is XML, and so has a label), the MIME
> header has to have priority over the XML header PI. I think that is the
> logical order: generic operations must be allowed.

All extremely sound.

> However, it is all spoiled if there are systems which corrupt the
> labels: for example by rewriting the charset parameter incorrectly. It
> is far better to send the XML file without a charset parameter than to
> send it with a wrong one.

But there's the snag: in text/xml documents, a missing charset parameter does not mean "Charset unspecified"; it means "Charset specified as US-ASCII". There is no way to fail to specify a charset in text/* documents, and rightly so, because text without a charset is uninterpretable. In SGML terms, omitting the charset in text/* documents is a mere minimization, whereas in application/* documents it is a true #IMPLIED.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Apr 9 18:22:11 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:11 2004 Subject: IE5.0 does not conform to RFC2376 References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org> <3701411C.784A1D4A@eng.sun.com> <37076818.54025D7B@w3.org> <3707C75D.C303FC59@eng.sun.com> <370807CC.CC9F0E1D@w3.org> <3708314B.B98240BE@eng.sun.com> <370A2606.EB895110@w3.org> Message-ID: <370E2901.548FD26B@locke.ccil.org> Chris Lilley wrote: > MIME actually need not have those constraints; *email* has those > constraints (although increasingly it does not, in practice). HTTP is > always 8-bit clean. I agree that the MIME RFCs have steadfastly tried to > pretend that MIME is an email-only thing. The requirement for a charset with text/* documents (whether minimized or not) has nothing to do with email, although the fact that a minimized charset parameter means "US-ASCII" surely does. A charset must be specified for interoperable text (as opposed to application-specific formats) because text means nothing if you do not know the charset, and no text-using application can fail to either decode, guess, or simply presume some charset. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
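The charset rules argued over in the three messages above can be put into a few lines. This is only a sketch of the precedence as John Cowan states it for text/xml (the charset parameter decides, and an omitted parameter means US-ASCII rather than "unspecified"). The function names `charset_for_text_xml` and `can_decode` are made up for illustration; the header parsing relies on `email.message.Message` from the Python standard library.

```python
from email.message import Message


def charset_for_text_xml(content_type):
    """Return the charset that governs a text/xml entity (sketch only)."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/xml":
        raise ValueError("this sketch only covers text/xml")
    # get_param() returns None when the charset parameter is absent.
    charset = msg.get_param("charset")
    if charset is None:
        # Omitting the parameter is a minimization, not "unspecified".
        return "us-ascii"
    return charset.lower()


def can_decode(content_type, supported):
    """A client can reject a document outright from the MIME header alone."""
    return charset_for_text_xml(content_type) in supported
```

A client that only knows US-ASCII and UTF-8 could call `can_decode("text/xml;charset=cp-ebcdic-us", {"us-ascii", "utf-8"})` and refuse the document without reading a single byte of it, which is the advantage claimed for the charset parameter.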
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Nitin_Patel at adc.com Fri Apr 9 18:47:35 1999 From: Nitin_Patel at adc.com (Nitin Patel) Date: Mon Jun 7 17:11:11 2004 Subject: unsubscriber Nitin_Patel@adc.com Message-ID: <370E2CCE.4514669B@adc.com> unsubscriber Nitin_Patel@adc.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Apr 9 19:48:23 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:11 2004 Subject: Validating Entities (was Re: XML Torture Test: Parsers Fail) References: <092801be8099$131cf8d0$dc59fcc6@salsa.walldata.com> <370B5FA2.CCDE2159@goon.stg.brown.edu> <14091.27586.410933.846017@localhost.localdomain> <370B8826.5F507A45@goon.stg.brown.edu> <14091.36402.664143.504008@localhost.localdomain> <370BAF3C.D54E5C3C@goon.stg.brown.edu> Message-ID: <370E3D53.8F385B25@locke.ccil.org> Richard L. Goerwitz wrote: > (Incidentally, does it bother anyone else that you can have valid docu- > ments that aren't well-formed? Not by definition: a non-WF document simply isn't an XML document at all, and calling it "valid" is no more meaningful than calling the text of this email "valid". In any case, I don't understand your example. > Imagine an external entity used inside > an attribute value? 
If declared in such a way that a non-validating > parser doesn't realize it's external, then the validating parser will > reject it as an error (can't have external entities in this context). References to external entities in attribute values are not WF. All this means is that a non-validating parser that doesn't read all the external parameter entities may fail to detect certain WF violations. A validating parser, or for that matter a non-validating but all-entity-reading parser, will correctly reject such a document as non-WF. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From xml at 0000000.com Fri Apr 9 19:49:27 1999 From: xml at 0000000.com (xml) Date: Mon Jun 7 17:11:11 2004 Subject: embedded XML project Message-ID: <199904091918.MAA23164@0000000.com> Hi, I have been working on an XML project for embedded systems (8/16/32 bit microcontrollers). I'd be interested in talking to anybody who is working in this space or has a desire to work together on a project. My project is well defined, with an intent to market it maybe sometime next year. If someone has a project of their own and there are common components, we could possibly share code and resources. Please contact me if you're interested. Thanks, Thomas 0@0000000.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Apr 9 20:13:09 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:11 2004 Subject: Megginson and XMLNews References: <3.0.5.32.19990407163917.00bd2a60@corp> Message-ID: <370E4307.92892367@locke.ccil.org> Walter Underwood wrote: > Why is misspelled? , too? These spellings are traditional in the news business, along with "graf" for "paragraph", "sked" for "schedule", "lede" for "lead [paragraph or sentence]", and some others. The original purpose of the misspellings was to clearly distinguish data from metadata: the annotation "hed" marks something as a headline, whereas the annotation "head" might be read as an instruction to insert the word "head". "Hed" is shorter than "", after all. :-) > is an unusual term for "author" or "creator", even for > a profession that routinely uses "slug". "Slug" does not mean "bytag"; it means "identifier". -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Apr 9 20:16:48 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:11 2004 Subject: PITarget uniqueness References: <001701be8151$451f3600$7ac2c6c3@uplanet.com> Message-ID: <370E43F7.3C8747E5@locke.ccil.org> Peter Stark wrote: > How do I guarantee that the target name of my Processing Instructions (PI) > are unique? It's probably out of the scope of the XML spec. to specify, but > are there any general guidelines for naming PIs? Yes. The XML Rec specifically allows you to create NOTATION declarations that map the targets of PIs to URIs or FPIs, thus: ]> ... Then your application can match not on the name, which can be different in different documents, but on the consistent and unique URI. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gkholman at CraneSoftwrights.com Fri Apr 9 20:23:43 1999 From: gkholman at CraneSoftwrights.com (G. 
Ken Holman)
Date: Mon Jun 7 17:11:11 2004
Subject: Illustration of Different Node Trees for XML (W3C(XT)/IE5)
Message-ID:

I was asked to produce an IE5 version of my SHOWTREE free resource and I found the results were interesting. These reveal that alternate strategies are required with different products when accessing components of an XML document because the node structure interpreted by engines differ.

Versions of SHOWTREE for both W3C(XT) and IE5 are available through the Resource Library link on our home page in my trailer below.

The stylesheets produce a text file from XT and an HTML file from IE5 ... one can use the IE5 engine either directly viewing a file using an embedded stylesheet processing instruction, or use the DOS command line if you want to examine a file without changing it to include the PI (the command line invocation is another free resource in the library).

I noted the following right away:

(1) XT instantiates text nodes for new lines between elements where IE5 doesn't
(2) XT regards the XML declaration as not being part of the content while IE5 does
(3) IE5 exposes the namespace declaration attributes as attributes while XT doesn't
(4) XT composes node names using the namespace URI while IE5 doesn't

I haven't looked closely for other differences given some personal time constraints.

I hope this is considered useful.

........... Ken

T:\FTEMP>type short.xml
To convey a greeting
A testof depthto three levels
Information about Greeting
Hello world!

T:\FTEMP>call xsl short.xml showtree.xsl short.w3c
T:\FTEMP>call msxsl short.xml showtree.msxsl short.ms
T:\FTEMP>type short.w3c
SHOWTREE Stylesheet - http://www.CraneSoftwrights.com/resources/
Root:
1 Proc. Inst. 'xml-stylesheet' (): {type="text/xsl" href="showtree.msxsl"}
2 Proc. Inst. 'pi1' (): {value of pi1}
3 Comment (): {comment number 1}
4 Element 'greeting' ():
4.1 Attribute 'pub-id' (greeting): {+//ISBN 1-894049::CSL::Samples::SHOWTREE//Document SHOWTREE Test//EN}
4.2 Attribute 'attr2' (greeting): {Value of second attribute}
4.3 Text (greeting): { }
4.4 Element 'purpose' (greeting):
4.4.1 Text (greeting,purpose): {To convey a greeting}
4.5 Text (greeting): { }
4.6 Element 'test1' (greeting):
4.6.1 Text (greeting,test1): {A test}
4.6.2 Element 'test2' (greeting,test1):
4.6.2.1 Text (greeting,test1,test2): {of depth}
4.6.2.2 Element 'test3' (greeting,test1,test2):
4.6.2.2.1 Text (greeting,test1,test2,test3): {to three levels}
4.7 Text (greeting): { }
4.8 Comment (greeting): {comment number 2}
4.9 Text (greeting): { }
4.10 Element 'prelude' (greeting):
4.10.1 Attribute 'id' (greeting,prelude): {start}
4.10.2 Text (greeting,prelude): { }
4.10.3 Proc. Inst. 'pi2' (greeting,prelude): {value of pi2}
4.10.4 Text (greeting,prelude): { }
4.10.5 Element 'http://www.CraneSoftwrights.com/resources/#info:detail' (greeting,prelude):
4.10.5.1 Text (greeting,prelude,http://www.CraneSoftwrights.com/resources/#info:detail): {Information about Greeting}
4.10.6 Text (greeting,prelude): { }
4.11 Text (greeting): { }
4.12 Element 'value' (greeting):
4.12.1 Text (greeting,value): {Hello world!}
4.13 Text (greeting): { }

T:\FTEMP>rem copy/paste from IE5 canvas to short.ms.txt
T:\FTEMP>type short.ms.txt
SHOWTREE Stylesheet file:///T:/FTEMP/short.xml http://www.CraneSoftwrights.com/resources
Root:
1 Proc. Inst. 'xml' (): {version="1.0"}
2 Proc. Inst. 'xml-stylesheet' (): {type="text/xsl" href="showtree.msxsl"}
3 Proc. Inst. 'pi1' (): {value of pi1}
4 Comment (): {comment number 1}
5 Element 'greeting' ():
5.1 Attribute 'pub-id' (greeting): {+//ISBN 1-894049::CSL::Samples::SHOWTREE//Document SHOWTREE Test//EN}
5.2 Attribute 'attr2' (greeting): {Value of second attribute}
5.3 Element 'purpose' (greeting):
5.3.1 Text (greeting,purpose): {To convey a greeting}
5.4 Element 'test1' (greeting):
5.4.1 Text (greeting,test1): {A test}
5.4.2 Element 'test2' (greeting,test1):
5.4.2.1 Text (greeting,test1,test2): {of depth}
5.4.2.2 Element 'test3' (greeting,test1,test2):
5.4.2.2.1 Text (greeting,test1,test2,test3): {to three levels}
5.5 Comment (greeting): {comment number 2}
5.6 Element 'prelude' (greeting):
5.6.1 Attribute 'id' (greeting,prelude): {start}
5.6.2 Attribute 'xmlns:info' (greeting,prelude): {http://www.CraneSoftwrights.com/resources/}
5.6.3 Proc. Inst. 'pi2' (greeting,prelude): {value of pi2}
5.6.4 Element 'info:detail' (greeting,prelude):
5.6.4.1 Text (greeting,prelude,info:detail): {Information about Greeting}
5.7 Element 'value' (greeting):
5.7.1 Text (greeting,value): {Hello world!}

T:\FTEMP>type short.ms
SHOWTREE Stylesheet file:///T:/FTEMP/short.xml http://www.CraneSoftwrights.com/resources/
Root:
1 Proc. Inst. 'xml' (): {version="1.0"}
2 Proc. Inst. 'xml-stylesheet' (): {type="text/xsl" href="showtree.msxsl"}
3 Proc. Inst. 'pi1' (): {value of pi1}
4 Comment (): {comment number 1}
5 Element 'greeting' ():
5.1 Attribute 'pub-id' (greeting): {+//ISBN 1-894049::CSL::Samples::SHOWTREE//Document SHOWTREE Test//EN}
5.2 Attribute 'attr2' (greeting): {Value of second attribute}
5.3 Element 'purpose' (greeting):
5.3.1 Text (greeting,purpose): {To convey a greeting}
5.4 Element 'test1' (greeting):
5.4.1 Text (greeting,test1): {A test}
5.4.2 Element 'test2' (greeting,test1):
5.4.2.1 Text (greeting,test1,test2): {of depth}
5.4.2.2 Element 'test3' (greeting,test1,test2):
5.4.2.2.1 Text (greeting,test1,test2,test3): {to three levels}
5.5 Comment (greeting): {comment number 2}
5.6 Element 'prelude' (greeting):
5.6.1 Attribute 'id' (greeting,prelude): {start}
5.6.2 Attribute 'xmlns:info' (greeting,prelude): {http://www.CraneSoftwrights.com/resources/}
5.6.3 Proc. Inst. 'pi2' (greeting,prelude): {value of pi2}
5.6.4 Element 'info:detail' (greeting,prelude):
5.6.4.1 Text (greeting,prelude,info:detail): {Information about Greeting}
5.7 Element 'value' (greeting):
5.7.1 Text (greeting,value): {Hello world!}

T:\FTEMP>

--
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/x/
Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (Fax:-0995)
Website: XSL/XML/DSSSL/SGML services outline, XSL/DSSSL shareware, stylesheet resource library, conference training schedule, commercial stylesheet training materials, on-line XSL CBT.
Next instructor-led XSL Training: WWW8:1999-05-11

xml-dev: A list for W3C XML Developers.
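The same kind of numbered node listing can be tried against any DOM implementation. Below is a rough sketch using Python's standard `xml.dom.minidom`, not the SHOWTREE stylesheets themselves. The `SAMPLE` document is a cut-down stand-in written for this illustration (it is not the original short.xml, whose markup is not shown above), and `show_tree` is a hypothetical helper name. As far as I know, minidom keeps the newline text nodes between elements (as XT does), exposes `xmlns:info` as an ordinary attribute (as IE5 does), and does not report the XML declaration as a processing instruction.

```python
from xml.dom import Node, minidom

# Hypothetical stand-in document, loosely modelled on the listings above.
SAMPLE = """<?xml version="1.0"?>
<?pi1 value of pi1?>
<!--comment number 1-->
<greeting attr2="Value of second attribute">
<purpose>To convey a greeting</purpose>
<prelude id="start" xmlns:info="http://www.CraneSoftwrights.com/resources/">
<info:detail>Information about Greeting</info:detail>
</prelude>
</greeting>
"""


def show_tree(node, prefix="", lines=None):
    """Collect one numbered line per attribute and child node, depth first."""
    if lines is None:
        lines = []
    items = []
    if node.nodeType == Node.ELEMENT_NODE:
        # Number the attributes before the children, as the listings above do.
        attrs = node.attributes
        items.extend(attrs.item(i) for i in range(attrs.length))
    items.extend(node.childNodes)
    for number, item in enumerate(items, start=1):
        label = f"{prefix}{number}"
        if item.nodeType == Node.ATTRIBUTE_NODE:
            lines.append(f"{label} Attribute '{item.name}': {item.value!r}")
        elif item.nodeType == Node.ELEMENT_NODE:
            lines.append(
                f"{label} Element '{item.tagName}' (namespace {item.namespaceURI})"
            )
            show_tree(item, label + ".", lines)
        elif item.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            lines.append(f"{label} Proc. Inst. '{item.target}': {item.data!r}")
        elif item.nodeType == Node.COMMENT_NODE:
            lines.append(f"{label} Comment: {item.data!r}")
        elif item.nodeType == Node.TEXT_NODE:
            # repr() makes whitespace-only text nodes visible.
            lines.append(f"{label} Text: {item.data!r}")
    return lines


if __name__ == "__main__":
    print("\n".join(show_tree(minidom.parseString(SAMPLE))))
```

Running the same document through two different parsers and diffing the listings is a quick way to find out which of the four differences noted above apply to a given pair of tools.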
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gkholman at CraneSoftwrights.com Fri Apr 9 20:33:04 1999 From: gkholman at CraneSoftwrights.com (G. Ken Holman) Date: Mon Jun 7 17:11:12 2004 Subject: Illustration of Different Node Trees for XML (W3C(XT)/IE5) In-Reply-To: Message-ID: At 99/04/09 11:10 -0700, I wrote: >Versions of SHOWTREE for both W3C(XT) and IE5 are available through >the Resource Library link on our home page in my trailer below. Does anyone know how to access through IE5 script the URL used to invoke an HTML file? I was thinking I could do something like: file://t|/ftemp/runmsxsl.htm?thisfile.xml^thatfile.xsl ... to run a given XML file with a given XSL file and display the results. I know how to do everything except communicate the parameters through the URL used to invoke the HTML file with the script. Can anyone help? I'll post the resulting HTML file with the stylesheet resources. This will allow a user to look at the SHOWTREE results on the IE5 canvas without having to embed the stylesheet reference or use a temporary HTML file for the command line invoked results. Thanks! ............. Ken -- G. Ken Holman mailto:gkholman@CraneSoftwrights.com Crane Softwrights Ltd. http://www.CraneSoftwrights.com/x/ Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (Fax:-0995) Website: XSL/XML/DSSSL/SGML services outline, XSL/DSSSL shareware, stylesheet resource library, conference training schedule, commercial stylesheet training materials, on-line XSL CBT. Next instructor-led XSL Training: WWW8:1999-05-11 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From clovett at microsoft.com Fri Apr 9 21:12:16 1999
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun 7 17:11:12 2004
Subject: Illustration of Different Node Trees for XML (W3C(XT)/IE5)
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C0F36271A@RED-MSG-56>

(1) IE5 gives you the option on text nodes between elements. These are called "ignorableWhiteSpace" and we have a switch on the IXMLDOMDocument for this.

(2) IE5 gives you all PI's. I don't see why the [...] valid PI for representation in your DOM. What if you want to change the encoding and save the document back out?

[...]

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Apr 9 21:32:39 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:12 2004 Subject: PITarget uniqueness References: <001701be8151$451f3600$7ac2c6c3@uplanet.com> <01df01be822b$0fd3dca0$0300000a@cygnus.uwa.edu.au> Message-ID: <370E55C2.A38B4889@locke.ccil.org> James Tauber wrote: > PI targets can be associated with a URI using the NOTATION mechanism. This > is so much like the namespace mechanism that I'd really like it if the two > were merged and the PI target made arbitrary. I have a new SAX parser filter in the planning stages called PIEngine. It will take three actions w.r.t. PIs, selectable by mode switches: notation resolution: replace any PI target with the URI declared in the corresponding notation declaration, provided there is one. character references: convert numeric character references in PI data to the corresponding characters pseudo-elements: decode PIs as if they contained attributes (a bare name is interpreted as name="name", as in SGML) and pass them as empty elements with "?" prefixed to the name. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
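The three PIEngine modes described above are easy to prototype. The following is only a guess at the behaviour from John Cowan's description, written in Python rather than as a SAX filter; it is not his code. The names `pi_as_pseudo_element` and `expand_char_refs`, the `notations` mapping, and the example.org URI are all hypothetical.

```python
import re

# name, optionally followed by ="value" or ='value'
_PSEUDO_ATTR = re.compile(r"""([^\s=]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'))?""")
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def expand_char_refs(text):
    """Replace numeric character references with the characters they name."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        return chr(int(hex_digits, 16) if hex_digits else int(dec_digits))
    return _CHAR_REF.sub(replace, text)


def pi_as_pseudo_element(target, data, notations=None):
    """Turn a PI into ("?" + name, attributes), following the three modes."""
    # Notation resolution: swap the target for its declared URI, if any.
    if notations and target in notations:
        target = notations[target]
    attrs = {}
    for match in _PSEUDO_ATTR.finditer(data):
        name, double_quoted, single_quoted = match.groups()
        if double_quoted is None and single_quoted is None:
            value = name  # a bare name means name="name"
        else:
            value = double_quoted if double_quoted is not None else single_quoted
        attrs[name] = expand_char_refs(value)
    return "?" + target, attrs
```

Given the stylesheet PI from the SHOWTREE listings, `pi_as_pseudo_element("xml-stylesheet", 'type="text/xsl" href="showtree.msxsl"')` yields the name `?xml-stylesheet` with `type` and `href` as attributes, so an application can match on attributes instead of re-parsing the PI data itself.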
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From srn at techno.com Fri Apr 9 21:58:28 1999 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:11:12 2004 Subject: Illustration of Different Node Trees for XML (W3C(XT)/IE5) In-Reply-To: (gkholman@CraneSoftwrights.com) References: Message-ID: <199904091958.OAA02051@bruno.techno.com> [Ken Holman:] > ...alternate strategies are required with > different products when accessing components of > an XML document because the node structure > interpreted by engines differ. Thanks, Ken, for sharing this evidence of the importance of the Megginson committee's work. If these different products had had a standard model of parsed XML to work from, we could now say that XT and/or IE5 fail to conform to the standard, and in exactly what way(s) they fail to conform. Their respective owners would have to take responsibility for their incompatibility with one another. When the Megginson committee's work is done, I hope every XML product and system will conform to the consequent Recommendation as quickly as possible. I hope that developers will simply refuse to work with systems that don't conform. It's certainly in the best interests of every developer (and of his/her customers) to demand nothing less than strict conformance to the standard Recommendation regarding how XML resources appear as node trees after they have been parsed. (Shhh. Did someone say "Grove"?) -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. 
srn@techno.com http://www.techno.com ftp.techno.com voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) 3615 Tanner Lane Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Apr 9 23:03:55 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:12 2004 Subject: Illustration of Different Node Trees for XML (W3C(XT)/IE5) References: <2F2DC5CE035DD1118C8E00805FFE354C0F36271A@RED-MSG-56> Message-ID: <370E6B1A.1A738516@locke.ccil.org> Chris Lovett wrote: > (2) IE5 gives you all PI's. I don't see why the valid PI for representation in your DOM. What if you want to change the > encoding and save the document back out ? The XML declaration and text declaration look like PIs to SGML engines, but they aren't, per the XML rec. They don't have the same syntax, occurrence constraints, etc. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Sat Apr 10 01:11:39 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:11:12 2004 Subject: Comments Appreciated on Magazine Based on XML/XSL Message-ID: Hi Didier, > > Why do you convert the hierarchical database into XML? Is it > because the > hierarchical database do not have a DOM interface? > Not quite sure what you mean. Do you mean, why did we go via XML before we transformed with XSL? Or do you mean, why is there a separation between the DOM and the database? I'll tell you what we're currently doing and see if it helps. The hierarchical database is just good old SQL Server at root, with a few layers on top, so there is no tight integration between a DOM and the database. In the version you're seeing we actually make the tags by string concatenation! It was all written 8 months ago, so go easy on us. In the newer (unreleased) version we read the data from the database and use it to populate a DOM tree, and then either transform it with XSL and export the result, or just pass it through. So if your question is the first - why go out to XML before bringing it back in again to transform with XSL - then it's just because it's old code and the new code does go straight into a DOM. On the other hand, if you're implying that we just treat the database as one big DOM and transform the nodes we want out, then I have to ask, has anyone done that? Are there actually any databases out there that hide behind a DOM interface and present themselves as one big tree of nodes? 
I've done just that at the level of treating a web server as a great big node store and using XQL to dig out 'documents' - that's how the next release of the magazine will work - but it would be really interesting if the DOM/DB integration has been developed further. Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Sat Apr 10 03:24:00 1999 From: cbullard at hiwaay.net (Len Bullard) Date: Mon Jun 7 17:11:12 2004 Subject: Illustration of Different Node Trees for XML (W3C(XT)/IE5) References: <199904091958.OAA02051@bruno.techno.com> Message-ID: <370EA772.6273@hiwaay.net> Steven R. Newcomb wrote: > > (Shhh. Did someone say "Grove"?) Someone did. From the XML-Data schema design http://www.w3.org/TR/1998/NOTE-XML-data-0105/ under the topic "What Datatype's URI Means" "Input to the parser is the element object exposing all its attributes and content tree (that is, the subtree of the grove beginning with the element containing the dt attribute). The objectType attribute in particular is assumed to be available to the parser so that single parser can support several objecttypes." Amazing. len xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Sat Apr 10 03:55:30 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:11:12 2004 Subject: Comments Appreciated on Magazine Based on XML/XSL In-Reply-To: Message-ID: <005801be82f4$37aeb460$1b19da18@ne.mediaone.net> Mark Birbeck wrote: > > On the other hand, if you're implying that we just treat the database as > one big DOM and transform the nodes we want out, then I have to ask, has > anyone done that? Are there actually any databases out there that hide > behind a DOM interface and present themselves as one big tree of nodes? > I've done just that at the level of treating a web server as a great big > node store and using XQL to dig out 'documents' - that's how the next > release of the magazine will work - but it would be really interesting > if the DOM/DB integration has been developed further. > Well, for example ODI's eXcelon is basically a DOM interface on an object db. When you say you are using XQL on top of a web server, do you mean that the XQL engine is interfaced with the file system? Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From clovett at microsoft.com Sat Apr 10 04:50:07 1999
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun 7 17:11:12 2004
Subject: problem with IE5
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C0F36271B@RED-MSG-56>

You need to declare the namespace

]> 123

-----Original Message-----
From: Richard Tobin [mailto:richard@cogsci.ed.ac.uk]
Sent: Friday, April 09, 1999 6:59 AM
To: xml-dev@ic.ac.uk
Subject: problem with IE5

Betty Harvey sent me mail about a document which was accepted by RXP but rejected by IE5. Here is a small example which shows the problem:

]>

It produces this error in IE5:

Reference to undeclared namespace prefix: 'foo'. Line 6, Position 1

It doesn't make any difference if I put a namespace declaration for foo on the test element. It looks as if IE5 somehow expects namespace prefixes in the DTD to be declared. Can anyone explain this?

-- Richard
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From gkholman at CraneSoftwrights.com Sat Apr 10 05:08:27 1999
From: gkholman at CraneSoftwrights.com (G. Ken Holman)
Date: Mon Jun 7 17:11:12 2004
Subject: Illustration of Different Node Trees for XML (W3C(XT)/IE5)
In-Reply-To: <2F2DC5CE035DD1118C8E00805FFE354C0F36271A@RED-MSG-56>
Message-ID:

At 99/04/09 12:11 -0700, Chris Lovett wrote:
>(1) IE5 gives you the option on text nodes between elements. These are
>called "ignorableWhiteSpace" and we have a switch on the IXMLDOMDocument for
>this.

From the perspective of a stylesheet writer (me!), WD-XSL-19991218 states that the default for the source tree is that all whitespace in element content is significant, and that the default for the stylesheet tree is that XSL-type-ignorable whitespace in element content is ignored (neither of which appears to be true for IE5-XSL) ... if I counsel someone regarding portable stylesheets, the WD describes the behaviour I expect, so the fact that I may be able to manipulate the DOM (which I can't with Working Draft 2) doesn't help me here.

>(2) IE5 gives you all PI's. I don't see why the valid PI for representation in your DOM. What if you want to change the
>encoding and save the document back out ?

According to REC-XML production [23], the XML declaration is just that, and *not* a processing instruction. The stylesheet can expose processing instructions, not the XML declaration. Again you mention the DOM ... I'm talking about portable stylesheets and the document structure presented to me and my customers with our stylesheets.

>(3) Again, namespace attribute are attributes. What if you want to promote
>a namespace declaration to a higher level in the tree ?

Again you are speaking of someone manipulating a DOM ... according to WD-XSL-19991218 section 2.4.4, attributes whose name starts with "xmlns:" create a namespace node, not an attribute node, and my stylesheet was written to expose attribute nodes, not namespace nodes ... hence I feel the behaviour witnessed is not correct.

>(4) Well this all points out the fact that the DOM group refused to consider
>namespaces in level 1. IE5 gives you the namespace info as separate
>properties called "namespaceURI" and "prefix" rather than inventing a new
>node name format. This way someone can add a namespace to a document
>without breaking an app that is already written to the nodeName as a simple
>GI. This give a better migration story.

Indeed for one manipulating a DOM for later emission as a new document ... but that wasn't my perspective. I was analyzing the state of a node tree as presented to an XSL stylesheet writer ... not analyzing what I could and could not do with a node tree in general.

Thank you for taking the time to share your observations ... I hope I've clarified my own goals in my analysis ... I felt it important to share with others who may be writing stylesheets and who should be considering the portability of stylesheets and their behaviours (some particularly strongly held feelings on my own part).

........... Ken

--
G. Ken Holman mailto:gkholman@CraneSoftwrights.com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/x/
Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (Fax:-0995)
Website: XSL/XML/DSSSL/SGML services outline, XSL/DSSSL shareware, stylesheet resource library, conference training schedule, commercial stylesheet training materials, on-line XSL CBT.
Next instructor-led XSL Training: WWW8:1999-05-11
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Sat Apr 10 23:10:12 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:12 2004 Subject: Namespaces, prefixes, and URIs In-Reply-To: <370EA5BB.29B586C8@eng.sun.com> from "David Brownell" at Apr 9, 99 06:13:31 pm Message-ID: <199904102111.RAA03224@locke.ccil.org> David Brownell scripsit: > I didn't see the words "deserve to lose" in the spec. What I > saw was a statement that if you use namespaces (TBS) your docs > might need to change. The XML 1.0 spec has not been revised > (incompatibly) to require conformance to the namespace spec. It says: # [A]uthors should not use the colon in XML names except as part of # name-space experiments[.] So there is a SHOULD NOT prohibition against bogus colons of the first kind (non-namespace). Bogus colons of the second kind (namespace experiments) may indeed have to be changed, but why must the DOM support non-W3 namespace experiments? The uses of colons in namespaces are now prescribed, and other uses are proscribed. > In general, my background makes me believe that incompatible > changes to foundational specs (e.g. XML 1.0) are, as a rule, > pure unadulterated evil. > > One never quite knows what could get broken by such changes ... > though it's usually "someone else's software" not your own. > Now there's a dynamic to think about. You're right in general. But the XML Rec goes to some pains to say that uses of : are either deprecated or may have to be changed, putting people on notice pending the namespace Rec. > Didn't namespaces break architectural forms compatibility? > Minimally the PI doesn't conform. 
The alternative form works fine; I'll get the XAF maintainers to add support for it. (Bill?)

> And aren't there SGML applications using colons as name
> characters? Perhaps permitting such SGML systems to
> migrate to XML isn't "really important" any more.

Are colons NAMESTRT characters in the reference concrete syntax?

--
John Cowan cowan@ccil.org
e'osai ko sarji la lojban.

From chris at w3.org Sat Apr 10 23:58:33 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:13 2004
Subject: Competition: fast XML parser for charset labelling
Message-ID: <370F7DCF.31DC121D@w3.org>

Competition, for all those XML coders with copious free time:

Write an incredibly small and fast special-purpose XML parser whose purpose is to locate and read the xml declaration; don't bother reading any more of the file once this is found. It should however cope with any well-formed XML instance.

Deduce, from the encoding declaration, what the charset label should be (the same value, if the encoding declaration is present). If there is no encoding declaration, apply the rules in the XML 1.0 Recommendation to determine whether UTF-8 or UTF-16 was used. Thus, there will *always* be a charset label at this point.

Cache the result in a persistent way; for example, in a file called .charset-'basename'.
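The parsing and deduction steps described so far could be sketched roughly as follows. This is only an illustration in Python, not an entry in the competition: the function name `sniff_xml_charset` is made up here, the sketch looks only at the first bytes of the entity, and it deliberately ignores EBCDIC and UCS-4, which the XML 1.0 Recommendation also discusses.

```python
import re

# Encoding pseudo-attribute inside an XML declaration. The "\s" after "xml"
# keeps ordinary PIs such as <?xml-stylesheet ...?> from matching.
_ENCODING_RE = re.compile(
    r'''<\?xml\s[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']'''
)


def sniff_xml_charset(head: bytes) -> str:
    """Return a charset label given the first bytes of an XML entity.

    Assumption: `head` holds enough bytes to cover the whole XML
    declaration (a few hundred bytes is normally plenty).
    """
    if head.startswith(b"\xef\xbb\xbf"):
        # UTF-8 byte order mark.
        text, default = head[3:].decode("ascii", errors="replace"), "UTF-8"
    elif head.startswith((b"\xfe\xff", b"\xff\xfe")):
        # UTF-16 byte order mark; the codec picks the byte order from it.
        text, default = head.decode("utf-16", errors="replace"), "UTF-16"
    elif head.startswith(b"<\x00?\x00"):
        # "<?" in little-endian UTF-16 without a byte order mark.
        text, default = head.decode("utf-16-le", errors="replace"), "UTF-16"
    elif head.startswith(b"\x00<\x00?"):
        # "<?" in big-endian UTF-16 without a byte order mark.
        text, default = head.decode("utf-16-be", errors="replace"), "UTF-16"
    else:
        # ASCII-compatible encoding: latin-1 maps every byte to one character,
        # so the declaration can be read without knowing the real encoding.
        text, default = head.decode("latin-1"), "UTF-8"

    match = _ENCODING_RE.match(text)
    return match.group(1) if match else default
```

Because the function always falls back to UTF-8 or UTF-16, a caller always gets a label, which is the property the post asks for.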
Integrate this into the Apache mod_mime (or other suitable place) so that, for all resources which mod_mime declares to be of type text/xml, the cached result is automatically used to output a MIME type header Content-type: text/xml; charset="xxx" Where xxx is the cached result, provided the cached result exists and is not older than the datestamp of the xml file; otherwise, refresh the cached result. A special prize bundle of great rarity and desirability will be sent to the lucky winner. MURATA Makoto wrote: > Chris Lilley wrote: > > The best way to ensure this is to > > treat the XML encoding declaration as the prmary metadata resource and > > to programatically derive the charset parameter from this; greater > > If it is done when the document is stored in the WWW server, that is > superb. Once this exists, and is well tested, the next step will be to get it into the base distribution and installed by default. -- Chris xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Sat Apr 10 23:58:59 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:13 2004 Subject: IE5.0 does not conform to RFC2376 References: <199904070539.AA00211@archlute.apsdc.ksp.fujixerox.co.jp> Message-ID: <370F7DD6.90D3A560@w3.org> MURATA Makoto wrote: > Chris Lilley wrote: > > An alternative method for achieving the same result is to use a filter > > (this can be done in Apache and in Jigsaw) which automatically emits the > > correct charset parameter based on reading the encoding declaration in > > the XML instance. 
This can easily cache its results, and need not
> > result in processing overhead on each request.
>
> I strongly agree. This is the best approach. I sincerely hope that such
> an attempt will happen at W3C.

I have spoken to the Jigsaw team about this, explained the urgency, and hope to see an implementation in a forthcoming Jigsaw release. They said it was about an hour's work or so.

> > > At *IETF*, the default of the charset parameter for text/HTML *is* 8859-1.
> >
> > Yes, which is different to the default for text/* - this demonstrates
> > that it is possible to give a more specific rule for a particular
> > registration.
>
> Actually, in the case of HTTP MIME, the default of the charset parameter of
> text/* is always ISO-8859-1.

Yes, we both agree there. And I said that this shows that the default for a particular registration can be different from the default for

> In the case of real MIME, the default of
> the charset parameter of text/* is always US-ASCII.

I don't think we need to get into "real MIME" versus "HTTP MIME" here; I raised the issue very early on the IMC list and quickly got consensus that the MIME registration applies to all uses of MIME. By drawing this distinction, are you saying that RFC 2376 does not apply to HTTP and only applies to email?

> text/xml is an exception, since the default is always US-ASCII. This was
> recommended by ISEG.

Well, if a US-based group recommends US-ASCII that should not really be a surprise ;-) However, while US-ASCII is compatible with UTF-8 it is not the same; and it is not compatible with UTF-16. So, it is a very odd choice for a default. As I said, a better default would have been the same default as specified in the XML Recommendation.

While priority rules can always be defined to figure out which of two conflicting labels (or label defaults) has precedence, the whole issue is solved if the defaults are the same. Unfortunately, RFC2376 did not do this.

> > > It is going to be very difficult or
> > > impossible, since HTTP and MIME people will disagree.
> >
> > I think you mean, HTTP and Mail(SMTP/IMAP/POP). MIME is used by both
> > email and HTTP.
>
> HTTP MIME is not quite the same as real MIME. There are many differences
> between the two.

Since HTTP is at a different (lower) position in the IETF standards track to MIME, MIME cannot make any reference to HTTP but can only speak of email. This is odd, but there we are. So, HTTP has to refer to MIME, not the other way round; the use of MIME in HTTP is defined in the HTTP specs. This is unfortunate, but does not make it "unreal".

> text/xml has to be consistent with HTTP and MIME. Autodetection
> or the use of META tags as the default of the charset parameter has been
> extensively discussed by HTTP people and MIME people. They strongly dissent.

In another thread, it was convincingly shown that the term "autodetection" to refer to the encoding declaration in the XML Recommendation was incorrect terminology. It is actually using a designating sequence.

> > But, if it is not present,
> > then the XML Rec says exactly what should happen;
>
> Appendix F is non-normative.

Yes, but I was not referring to Appendix F. I was referring to section 4.3.3 which is normative:

http://www.w3.org/TR/REC-xml#charencoding

Parsed entities which are stored in an encoding other than UTF-8 or UTF-16 must begin with a text declaration containing an encoding declaration [...]

> RFC2376 supersedes it, as intended by the XML WG.

Supersedes Appendix F, or supersedes the whole of the XML Recommendation? I assume you mean the former. So, all parsed entities which are not in UTF-8 or UTF-16 must still begin with an encoding declaration; it is an error for them not to do so, and it is an error for an entity including an encoding declaration to be presented to the XML processor in an encoding other than that named in the declaration.
All of which follows from the normative section 4.3.3 which is still, as far as I am aware, the current XML 1.0 Recommendation.

> XML 1.0 clearly says:

In Appendix F, which as you point out is non-normative.

> By the way, now that RFC 2376 is published, XML 1.0 will be revised.

I can't just now conform to a potential future revision of a Recommendation.

> >careful wording which
> > this RFC nullifies. Problems arise if an XML file is saved from the Web
> > to a local filesystem, perhaps for further editing; the MIME charset
> > information is lost. It could perhaps be stored in some way - but, there
> > is already a standard way - the XML encoding declaration.
>
> Since it is a standard way, RFC 2376 recommends recipient programs to
> rewrite encoding declarations.

OK. It would be better if no rewrites were ever necessary, however. That would have been possible, with suitable wording in RFC 2376. If the MIME charset parameter was *always* derived from the encoding declaration, as I have suggested, then

a) it would never disagree
b) it would always be correct, when saved to local file, without rewriting

> > And if the charset parameter is present, then it should say the same
> > thing as the encoding declaration.
>
> This disallows code conversion by proxy servers.

No, it does not, any more than your proposal disallows saving to local file. Your proposal requires rewriting the encoding declaration when saving to file (but not by a proxy); my proposal requires rewriting the encoding declaration when passing through a transcoding proxy (but not when saving locally).

Since I observe that saving to file is a very common operation, since I observe that there is existing XML client code deployed, and since a transcoding proxy is rewriting all the bytes in the file anyway, rewriting the encoding declaration is not a significant burden for the proxy.

> One could argue
> that proxy servers should rewrite encoding declarations.

Yes, I am doing so.
> However,
> documents should not be rewritten for security reasons.

Your security argument is self-defeating here:

a) it implies, don't use transcoding proxies because they rewrite documents
b) it implies, saving to local file (which you want to require rewriting the encoding) should be banned for security reasons; I don't see how that could be enforced
c) a cryptographic hash or digital signature will be broken by any transcoding proxy

So, if security is important and availability of a resource in multiple encodings is important, it follows that the conversions should be done one time on the server, the results signed and cached, and that as part of this process the encoding declaration should be correctly rewritten on the server before the document is signed.

> Moreover,
> if we require different code conversion for different subtypes of text,
> there is not much hope for interoperability,

Uh, if something is capable of reading a MIME parameter to find out the charset, it is equally capable of reading the MIME subtype

> especially because fallback to text/plain is required.

Fallback to text/plain is overrated and rarely useful, as others have noted. You seem to be sacrificing a lot of other things, just to accommodate it.

> > The best way to ensure this is to
> > treat the XML encoding declaration as the primary metadata resource and
> > to programmatically derive the charset parameter from this; greater
>
> If it is done when the document is stored in the WWW server, that is
> superb.

Yes, that seems the best way. I notice that the Apache 1.3 distribution has a mod_mime_magic which can be used, perhaps, to do this sort of thing. However, it doesn't seem to do caching, and involves server CPU load on a per-hit basis. It also seems fragile, since it relies on fixed byte positions in the file.

http://www.apache.org/docs/mod/mod_mime_magic.html

There must be a better solution, which only computes the charset once, and only recomputes if the document has changed.
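One way to get the compute-once, recompute-on-change behaviour is a sidecar file whose modification time is compared with the document's. The following Python sketch is only an illustration of that idea, not an Apache or Jigsaw module: the function names are invented for this example, the sidecar naming follows the .charset-'basename' suggestion from the competition post, and the built-in detector is deliberately simplified (ASCII-compatible encodings only, UTF-8 otherwise).

```python
import os
import re

# Simplified detector pattern: encoding pseudo-attribute in an XML declaration.
_DECL_RE = re.compile(
    rb'''<\?xml\s[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']'''
)


def detect_charset(path):
    """Read only the start of the file; assume UTF-8 when nothing is declared."""
    with open(path, "rb") as f:
        head = f.read(256)
    match = _DECL_RE.match(head)
    return match.group(1).decode("ascii") if match else "UTF-8"


def cache_path(path):
    """Sidecar file next to the document: .charset-<basename>."""
    directory, base = os.path.split(path)
    return os.path.join(directory, ".charset-" + base)


def cached_charset(path, detect=detect_charset):
    """Return the cached label unless it is older than the document."""
    side = cache_path(path)
    try:
        if os.path.getmtime(side) >= os.path.getmtime(path):
            with open(side, "r", encoding="ascii") as f:
                return f.read().strip()
    except OSError:
        pass  # no usable cache yet; fall through and compute it
    charset = detect(path)
    with open(side, "w", encoding="ascii") as f:
        f.write(charset + "\n")
    return charset


def content_type(path):
    """Header value a server could emit for a text/xml resource."""
    return 'text/xml; charset="%s"' % cached_charset(path)
```

The per-request cost is then two stat calls and a tiny read; the document itself is only opened again after it has been modified.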
> > However, I will point out that it is the consensus of the XML 1.0
> > Recommendation that I am respecting - and that the RFC does not, by
> > altering the meaning of the default encoding. It could have been
> > harmonised with the XML REC; it was not.
>
> RFC 2376 IS the consensus (it was not unanimous, though). It is based
> on really extensive discussion at the XML SIG and XML WG. My mail
> folder named text/xml has 687 e-mails ;-( Larry Masinter (the HTTP WG
> chair) and Martin Duerst (the I18N IG chair) was heavily involved. On
> the other hand, appendix in XML 1.0 is merely informative and was meant
> to be replaced by the XML media type RFC.
>
> Cheers,
>
> Makoto
>
> Fuji Xerox Information Systems
>
> Tel: +81-44-812-7230 Fax: +81-44-812-7231
> E-mail: murata@apsdc.ksp.fujixerox.co.jp
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Sat Apr 10 23:59:41 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:13 2004 Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376) References: <001f01be811f$bb74b5f0$f7d45dc7@eps.inso.com> Message-ID: <370F9F35.2A0E63E0@w3.org> Gavin Thomas Nicol wrote: > > > >An alternative method for achieving the same result is to use a filter > > >(this can be done in Apache and in Jigsaw) which automatically emits the > > >correct charset parameter based on reading the encoding declaration in > > >the XML instance. > > > > I think this is the approach that, ultimately, we all are > > hoping will be deployed. > > I'm not keen on this approach, but it would be a step in the right > direction. I have some servlets for doing this, and for also handling > the *.mim type. Can you post some URIs? Are you willing to share them? I would trust your servlets to be doing the right thing. > I still dislike the encoding information in the PI.... (it isn't, in theory, a PI although it looks exactly like one) I am of quite the opposite point of view - I think that it finally gives authors the ability to correctly label their documents. Furthermore, instead of the rather weak HTML equivalent, it is normative. Great. > as you noted > there are 3 levels at which mistakes can be made, The same is true of any label. The encoding declaration in the XML declaration at least always travels with the document, which is always handy for ensuring metadata doesn't get lost. > and the PI means > that you might have to "fix" the document in the face of transcoding. 
But if you are transcoding, you have to fix it anyway - so? Similarly if you were to do any other manipulation of the document that altered some declaration, such as collecting often-used strings and replacing them with external entity references - you would have to change the standalone pseudo-attribute if it said "yes", to "no".

-- Chris

From chris at w3.org Sun Apr 11 00:01:01 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:13 2004
Subject: Megginson and XMLNews
References: <3.0.5.32.19990407163917.00bd2a60@corp>
Message-ID: <370FB644.6F92D093@w3.org>

Walter Underwood wrote:
> Why instead of an xml:lang attribute?

Hopefully the answer is not "because HTML did it that way".

> The ISO 8601 subset for is a different subset than
> the web profile of ISO 8601 recommended by the W3C. Any chance
> of changing to the W3C profile?

Particularly since that profile was submitted by Reuters, who do the odd spot of news reporting and presumably it meets their needs.

> The element does not offer the date in a parseable
> format. #PCDATA is fine for the printed version of the date, but
> it also should be given in an ISO 8601 form (see above), and if
> I get to choose, I'd rather see it as an element than as an
> attribute.

I would rather see it as an attribute, and have the human-readable date as content. And hope that some future schema syntax allows the attribute to be declared of type 8601date, or something.

> Is there some reason why #FIXED wasn't used to make
> follow the Xlink draft?
That is: > > xlink:form CDATA #FIXED "simple" > href CDATA #REQUIRED> I use this approach myself; I define a couple of parameter entities: for the two most common, simple link, uses. > Since doesn't contain the thing it is a pronunciation > of, should it always follow that thing? And should that be noted > in the spec? It would seem that there should be a container relationship; ideally, a container with two children, one of which holds the text and the other of which holds the phonetic (IPA) text. This is similar to Ruby, which is of course hardly surprising. -- Chris xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Sun Apr 11 00:01:06 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:13 2004 Subject: Megginson and XMLNews References: <2F2DC5CE035DD1118C8E00805FFE354C0F362704@RED-MSG-56> Message-ID: <370FBD2B.7A847F27@w3.org> Chris Lovett wrote: > > Oh, I bet the thinking this is an XML document. I will file a bug against IE5. Yes, it does do that. I recently gave a presentation that included links to dtds and xml examples (named foo-xml.txt, and so on) so i could show the source. These worked fine as local file references, but when they were put on a server, and served as text/plain, IE5full gave me an error "unable to display XML document with stylesheet". > From: David Megginson [mailto:david@megginson.com] > Actually, www.xmlnews.org is serving it up as text/plain -- MSIE must > be acting on the .dtd extension. -- Chris xml-dev: A list for W3C XML Developers. 
From martind at netfolder.com Sun Apr 11 00:21:47 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:13 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
In-Reply-To:
Message-ID: <001b01be834d$0e625b80$e502cdd1@total.net>

Hi Mark,

Thanks for the answers; they help me understand your experience.

To the question:

> treat the database as one big DOM and transform the nodes we want out, then I have to ask, has anyone done that? Are there actually any databases out there that hide behind a DOM interface and present themselves as one big tree of nodes?

Yes and no. Yes, there are implementations like Excelon that show the whole hierarchical database as one big DOM, so that there are fewer steps in the publication process:

process 1 (without DOM DB):       RDB ----> XML ----> DOM ----> XSL ----> HTML
process 2 (with DOM DB wrapper):  RDB ----> DOM ----> XSL ----> HTML
process 3 (directly on DOM):      DOM ----> XSL ----> HTML

It seems that tools like Excelon are targeted at process 3 and, later on, at process 2. I personally ran some benchmarks and found a huge increase in performance with process 3. However, there is an impedance mismatch between the RDB model and the DOM model. The former is based on an array (more particularly an associative array) and the latter on a tree (a tree could be defined as an associative array of associative arrays). When a thin wrapper is created on an RDB to present a DOM facade, the speed is about as respectable as that of other dynamic page creation mechanisms like ASP. Other mechanisms lag behind ASP (or anything like it) in performance.
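A minimal sketch of process 2 (RDB to DOM with no intermediate XML text), assuming Python's standard sqlite3 and xml.dom.minidom modules stand in for the relational database and the DOM. The table, the column names and the rows_to_dom helper are hypothetical, and the tree is built eagerly rather than as a lazy facade over the rows.

```python
import sqlite3
from xml.dom.minidom import Document


def rows_to_dom(conn, table):
    """Copy every row of `table` into a DOM tree: <table><row><col>...</col></row></table>."""
    doc = Document()
    root = doc.createElement(table)
    doc.appendChild(root)
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        row_el = doc.createElement("row")
        for name, value in zip(columns, row):
            col_el = doc.createElement(name)
            col_el.appendChild(doc.createTextNode(str(value)))
            row_el.appendChild(col_el)
        root.appendChild(row_el)
    return doc


# Hypothetical data: one table with one row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (title TEXT, author TEXT)")
conn.execute("INSERT INTO article VALUES ('XML and databases', 'A. Author')")

dom = rows_to_dom(conn, "article")
titles = [node.firstChild.data for node in dom.getElementsByTagName("title")]
print(titles)
```

Each relational row (an associative array of column to value) becomes one subtree, which is the "associative array of associative arrays" view of a tree described above. An XSL processor would then work on `dom` directly.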
So, for server-side processing, the DOM or any hierarchical DB interface will work. In fact, XSL could be adapted to work on other hierarchical interfaces. For instance, directory services are hierarchical databases, so an XSL processor could be implemented on the ADSI interface. So I found that, on the server side, we should talk more of hierarchical databases than of XML: XML is a serialized format, and the hierarchical database is a model that can be processed by languages (either procedural ones or ones like XSL). Bottom line: after several years of dominance by the associative array model, the hierarchical model is coming back. So on the server, it's more a question of the database model and of the interfaces that languages understand in order to process this hierarchical DB. DOM is one of them.

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

-----Original Message-----
From: Mark Birbeck [mailto:Mark.Birbeck@iedigital.net]
Sent: Friday, April 09, 1999 7:13 PM
To: 'XML Dev'; 'Didier PH Martin'
Subject: RE: Comments Appreciated on Magazine Based on XML/XSL

Hi Didier,

> Why do you convert the hierarchical database into XML? Is it because the
> hierarchical database does not have a DOM interface?

Not quite sure what you mean. Do you mean, why did we go via XML before we transformed with XSL? Or do you mean, why is there a separation between the DOM and the database? I'll tell you what we're currently doing and see if it helps.

The hierarchical database is just good old SQL Server at root, with a few layers on top, so there is no tight integration between a DOM and the database. In the version you're seeing we actually make the tags by string concatenation! It was all written 8 months ago, so go easy on us. In the newer (unreleased) version we read the data from the database and use it to populate a DOM tree, and then either transform it with XSL and export the result, or just pass it through.
So if your question is the first - why go out to XML before bringing it back in again to transform with XSL - then it's just because it's old code and the new code does go straight into a DOM. On the other hand, if you're implying that we just treat the database as one big DOM and transform the nodes we want out, then I have to ask, has anyone done that? Are there actually any databases out there that hide behind a DOM interface and present themselves as one big tree of nodes? I've done just that at the level of treating a web server as a great big node store and using XQL to dig out 'documents' - that's how the next release of the magazine will work - but it would be really interesting if the DOM/DB integration has been developed further. Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Sun Apr 11 00:24:22 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:11:13 2004 Subject: Megginson and XMLNews In-Reply-To: <370FB644.6F92D093@w3.org> References: <3.0.5.32.19990407163917.00bd2a60@corp> <370FB644.6F92D093@w3.org> Message-ID: <14095.52616.339600.100498@localhost.localdomain> Chris Lilley writes: > Walter Underwood wrote: > > > Why instead of an xml:lang attribute? (And many other very good questions.) I will look into the mismatch in the ISO 8601 profile; otherwise, however, XMLNews-Story is designed to be subset-compatible with the XML version of NITF. 
Many people -- mostly leading technical specialists in the news industry -- have put a lot of careful work into NITF (née UTF) for going on a decade now, and we didn't see any good reason to split the market by introducing a competing format; instead, XMLNews provides a document type that is fully subset-compatible with a specific version of NITF, an alternative method for metadata exchange (piggybacking on RDF and Namespaces) that will work with non-XML news objects as well as textual news stories, and a lot of freely-redistributable documentation (the XMLNews site even includes a blanket permission statement for publishers -- I think that all of the free-software writers on this list have occasionally grown tired of signing release forms).

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Sun Apr 11 00:27:13 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:13 2004
Subject: Competition: fast XML parser for charset labelling
In-Reply-To: <370F7DCF.31DC121D@w3.org>
References: <370F7DCF.31DC121D@w3.org>
Message-ID: <14095.53144.469004.18228@localhost.localdomain>

Chris Lilley writes:
> Competition, for all those XML coders with copious free time:
>
> Write an incredibly small and fast special-purpose XML parser whose
> purpose is to locate and read the xml declaration; don't bother reading
> any more of the file once this is found.
> It should however cope with any well-formed XML instance.
Don't think you'd need much time -- if the document starts with the characters "

Hi,

Do you have any opinions on how fancy an XML-based file system should be in terms of performing validation on datatypes in the DTDs? So far I've been leaving the validation routines out, though the file system itself could handle a lot of data-handling for applications that depend on it, I think. In other words, supposing you were using a machine with an XML-based file system and XML-aware OS... is there any reason you can think of why it would be wise to employ datatype validation or any other types of data integrity checks?

Thomas
0@0000000.com

From chris at w3.org Sun Apr 11 03:27:46 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:13 2004
Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376)
References: <002901be802d$24f65680$27f96d8c@NT.JELLIFFE.COM.AU> <370E2517.6E59D3AF@locke.ccil.org>
Message-ID: <370FF7C0.764F3798@w3.org>

John Cowan wrote:
>
> Rick Jelliffe wrote:
> > However, it is all spoiled if there are systems which corrupt the
> > labels: for example by rewriting the charset parameter incorrectly. It
> > is far better to send the XML file without a charset parameter than to
> > send it with a wrong one.

Yes. But even better to send it with a correct one. This is easily done; just ensure that the server always sends the same charset that the XML encoding declaration specifies.

> But there's the snag: in text/xml documents, a missing charset parameter
> does not mean "Charset unspecified"; it means "Charset specified
> as US-ASCII".
This is correct; the RFC does say that. Note that this thread is primarily about whether the RFC *should* say that or *should* say something different, something which does not needlessly contradict the XML 1.0 Recommendation.

> There is no way to fail to specify a charset in
> text/* documents, and rightly so, because text without a charset
> is uninterpretable.

This is disingenuous; both clauses are true, but the second one implies that there is no other method of conveying the information, which, clearly, there is. So:

a) There is no way to fail to specify a charset in text/* documents.

But it does not have to be explicit. It can be implied. A good way of formalising that implication would be to refer to the rules in the XML 1.0 Recommendation.

b) Text without a charset is uninterpretable.

Also true, but that labelling is already defined in XML and handily travels with the document instance, so that it is not lost as soon as the document is saved to disk.

> In SGML terms, omitting the charset in text/* documents is a mere
> minimization, whereas in application/* documents it is a true #IMPLIED.

Actually, if you read the XML Recommendation, then unless the charset is UTF-8 or UTF-16, the charset (encoding declaration) is #REQUIRED.

-- Chris
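The filter idea discussed in this thread (have the server emit the charset that the XML encoding declaration specifies) can be sketched as follows. This is a rough illustration, not the Apache or Jigsaw filter: the content_type_for helper is hypothetical, it only inspects the start of the entity, and it assumes the declaration itself is readable as ASCII (so EBCDIC and UTF-16 without a byte order mark are out of scope).

```python
import re

# XML declaration with an optional encoding pseudo-attribute, as raw bytes.
_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']+["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
)


def content_type_for(head: bytes) -> str:
    """Build a Content-Type value from the first bytes of an XML entity."""
    if head.startswith(b"\xef\xbb\xbf"):           # UTF-8 byte order mark
        head = head[3:]
    if head.startswith((b"\xfe\xff", b"\xff\xfe")):  # UTF-16 byte order mark
        return "text/xml; charset=utf-16"
    match = _DECL.match(head)
    if match and match.group(1):
        encoding = match.group(1).decode("ascii")
    else:
        encoding = "utf-8"  # XML 1.0 default when nothing is declared
    return "text/xml; charset=" + encoding.lower()


print(content_type_for(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
print(content_type_for(b"<a/>"))
```

Because only the declaration is read, the helper never needs more than the first few dozen bytes of the file, which is the point of the "competition" mentioned earlier in the thread.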
From chris at w3.org Sun Apr 11 03:27:50 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:13 2004
Subject: IE5.0 does not conform to RFC2376
References: <199904050220.AA00169@archlute.apsdc.ksp.fujixerox.co.jp> <370A0194.C175B421@w3.org> <370E1F91.3CEC8C95@locke.ccil.org>
Message-ID: <370FF556.9B0563F7@w3.org>

John Cowan wrote:
>
> Chris Lilley wrote:
> > Redundancy can be good; a
> > charset parameter and an XML encoding declaration that say the same
> > thing and work the same way, which is what I was suggesting, is good.
>
> Yes, indeed. Nevertheless, the charset parameter has one
> advantage over the encoding declaration: it is guaranteed by MIME
> to be in ASCII, and thus always readable.

Okay, true.

> A document with a
> Content-Type of "text/xml;charset=cp-ebcdic-us" can be affirmatively
> rejected by a client that does not understand EBCDIC, whereas a
> client which has only the encoding declaration may *suppose* that
> the document is EBCDIC, based on the Appendix F heuristics, but cannot
> *know*.

On the contrary; it can suppose that, but having made that supposition it can check it. It can parse the document, using the cp-ebcdic-us conversion from bytes to characters, and having done so it can look in the XML declaration for an encoding declaration, and one of two things happens:

1) right there, it says encoding="cp-ebcdic-us"
2) it doesn't, so it halts with a fatal error.

Please note that I am describing normative behaviour, not non-normative behaviour.

-- Chris
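The suppose-then-check procedure described above can be sketched like this. It is an illustration under assumptions, not a conforming processor: Python's cp037 codec is assumed to stand in for the thread's "cp-ebcdic-us", the set of accepted EBCDIC labels is made up for the example, only the EBCDIC and ASCII-compatible families are handled, and FatalXMLError and verified_encoding are hypothetical names.

```python
import re

EBCDIC_SIGNATURE = b"\x4c\x6f\xa7\x94"   # "<?xm" in EBCDIC, per Appendix F
EBCDIC_NAMES = {"cp-ebcdic-us", "ebcdic-cp-us", "ibm037", "cp037"}  # assumed labels
ENCODING_DECL = re.compile(
    r'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


class FatalXMLError(Exception):
    """Raised when the supposed encoding family is not confirmed by the declaration."""


def verified_encoding(data: bytes) -> str:
    # Step 1: suppose an encoding family from the first four bytes.
    if data.startswith(EBCDIC_SIGNATURE):
        family, codec = "ebcdic", "cp037"
    else:
        family, codec = "ascii", "latin-1"
    # Step 2: decode just enough to read the XML declaration.
    head = data[:200].decode(codec)
    match = ENCODING_DECL.search(head)
    declared = match.group(1).lower() if match else None
    # Step 3: check the supposition against the encoding declaration.
    if family == "ebcdic":
        if declared not in EBCDIC_NAMES:
            raise FatalXMLError("EBCDIC supposed, but the declaration says %r" % declared)
        return declared
    return declared or "utf-8"


ebcdic_doc = '<?xml version="1.0" encoding="cp-ebcdic-us"?><a/>'.encode("cp037")
print(verified_encoding(ebcdic_doc))
```

A document that looks like EBCDIC but carries no matching encoding declaration ends in FatalXMLError, which corresponds to case 2 above.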
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Sun Apr 11 03:27:53 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:13 2004 Subject: IE5.0 does not conform to RFC2376 References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org> <3701411C.784A1D4A@eng.sun.com> <37076818.54025D7B@w3.org> <3707C75D.C303FC59@eng.sun.com> <370807CC.CC9F0E1D@w3.org> <3708314B.B98240BE@eng.sun.com> <370A2606.EB895110@w3.org> <370E2901.548FD26B@locke.ccil.org> Message-ID: <370FF931.356750F4@w3.org> John Cowan wrote: > > Chris Lilley wrote: > > > MIME actually need not have those constraints; *email* has those > > constraints (although increasingly it does not, in practice). HTTP is > > always 8-bit clean. I agree that the MIME RFCs have steadfastly tried to > > pretend that MIME is an email-only thing. > > The requirement for a charset with text/* documents (whether minimized > or not) has nothing to do with email, although the fact that a > minimized charset parameter means "US-ASCII" surely does. We agree here > A charset must be specified for interoperable text (as opposed to > application-specific formats) because text means nothing if you > do not know the charset, We also agree here > and no text-using application can fail > to either decode, guess, or simply presume some charset. And here. Wonderful. So, given some random text page, with no other labelling, then yes, you would need to guess. And there are ways to do so. No way to tell if you guessed right. With XML, having made a determination of what the encoding is, it can instantly be verified, because the encoding declaration is required. 
So, the label is there all along, is absolutely required to be there otherwise it is a fatal error, and all that is needed is a little bootstrapping. This is clearly not the same thing as the previous case, at all. -- Chris xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Sun Apr 11 04:51:27 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:13 2004 Subject: IE5.0 does not conform to RFC2376 In-Reply-To: <370FF556.9B0563F7@w3.org> from "Chris Lilley" at Apr 11, 99 03:05:26 am Message-ID: <199904110252.WAA19554@locke.ccil.org> Chris Lilley scripsit: > On the contrary; it can suppose that, but having made that supposition > it can check it. It can parse the document, using the cp-ebcidic-us > conversion from bytes to characters, and having done so it can look in > the XML declaration for an encoding declaration, and one of two things > happen: Remember that I spoke of a client that *cannot* understand EBCDIC (because it has no conversion table, say), but perhaps has the 4-byte heuristic from Appendix F. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Sun Apr 11 04:57:30 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:11:14 2004 Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376) In-Reply-To: <370FF7C0.764F3798@w3.org> from "Chris Lilley" at Apr 11, 99 03:15:44 am Message-ID: <199904110258.WAA20889@locke.ccil.org> Chris Lilley scripsit: > > There is no way to fail to specify a charset in > > text/* documents, and rightly so, because text without a charset > > is uninterpretable. > > This is disingeneous; both clauses are true, but the second one implies > that there is no other method of conveying the information, which, > clearly, there is. If you know the text is XML, then you can determine the charset by Appendix F methods; but if "text/xml" is just "text/*" to you, then you must be able to rely on the (possibly minimized) charset parameter in the media type, because if there is no charset, the text/* is, as I said, uninterpretable. > So > a) There is no way to fail to specify a charset in text/* documents > > But it does not have to be explicit. It can be implied. good way of > formalising that implication would be to refer to the rules in the XML > 1.0 Recommendation. I meant that if you are processing a MIME document, as long as you know its major type is "text", you can always determine the charset. There is either an explicit charset parameter, or the implicit charset of either "US-ASCII" or "ISO-8859-1" depending on the underlying transport protocol. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Sun Apr 11 05:38:00 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:14 2004 Subject: IE5.0 does not conform to RFC2376 References: <199904110252.WAA19554@locke.ccil.org> Message-ID: <37101881.3D296773@w3.org> John Cowan wrote: > > Chris Lilley scripsit: > > > On the contrary; it can suppose that, but having made that supposition > > it can check it. It can parse the document, using the cp-ebcidic-us > > conversion from bytes to characters, and having done so it can look in > > the XML declaration for an encoding declaration, and one of two things > > happen: > > Remember that I spoke of a client that *cannot* understand EBCDIC > (because it has no conversion table, say), but perhaps has the 4-byte > heuristic from Appendix F. Well in that case, it has to throw an error. Which situation will not be improved by adding a MIME charset parameter. If it doesn't understand, it doesn't understand no matter how many times you tell it. -- Chris xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Sun Apr 11 05:43:37 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:14 2004 Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376) References: <199904110258.WAA20889@locke.ccil.org> Message-ID: <371019D2.5E940D04@w3.org> John Cowan wrote: > > But it does not have to be explicit. It can be implied. good way of > > formalising that implication would be to refer to the rules in the XML > > 1.0 Recommendation. > > I meant that if you are processing a MIME document, as long as you > know its major type is "text", you can always determine the charset. > There is either an explicit charset parameter, or the implicit > charset of either "US-ASCII" or "ISO-8859-1" depending on the > underlying transport protocol. Aha. So, if you know that what the HTTP server sent is text/xml, then no charset parameter means it is US-ASCII, but if you think it is just text/*, then that means it is ISO-8859-1 ? And if you save it to disk and then read it back, no encoding declaration means it is either UTF-8 or UTF-16. -- Chris xml-dev: A list for W3C XML Developers. 
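The three cases in this exchange can be lined up in a small table-like function. This is only a summary of the defaults as stated in the thread (RFC 2376 for text/xml, the transport-dependent default for other text/* types, and XML 1.0 for a file with no encoding declaration); assumed_charset and its parameter names are hypothetical, and the document is assumed to carry no encoding declaration.

```python
def assumed_charset(source, media_type=None, charset=None):
    """Charset a recipient falls back on for a document with no encoding declaration."""
    if charset:
        return charset.lower()        # an explicit charset parameter wins
    if source == "file" or media_type == "application/xml":
        return "utf-8 or utf-16"      # XML 1.0 rule; the byte order mark tells them apart
    if media_type == "text/xml":
        return "us-ascii"             # RFC 2376 default for text/xml
    if media_type and media_type.startswith("text/"):
        # Generic text/*: the default depends on the transport protocol.
        return "iso-8859-1" if source == "http" else "us-ascii"
    raise ValueError("no default known for %r" % (media_type,))


cases = [("http", "text/xml"), ("http", "text/plain"), ("mail", "text/plain"), ("file", None)]
for case in cases:
    print(case, "->", assumed_charset(*case))
```

The same bytes therefore get three different labels depending on how they are received, which is the inconsistency being pointed out.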
From murata at apsdc.ksp.fujixerox.co.jp Sun Apr 11 09:11:23 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:11:14 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <370F7DD6.90D3A560@w3.org>
Message-ID: <199904110710.AA00258@archlute.apsdc.ksp.fujixerox.co.jp>

Chris Lilley wrote:
>
> MURATA Makoto wrote:
> > I strongly agree. This is the best approach. I sincerely hope that such
> > an attempt will happen at W3C.
>
> I have spoken to the Jigsaw team about this, explained the urgency, and
> hope to see an implementation in a forthcoming Jigsaw release. They said
> it was about an hour's work or so.

That is great!

I think that further discussion in this mailing list about the justification of the default for the charset parameter is not very useful. The discussion should be moved to the ietf-xml-mime mailing list.

The current specification is a result of loooooong discussion. Nobody is completely happy with it, but nobody is completely unhappy with it (remember that application/xml is also available). In their review report of XML, the W3C I18N WG asked the XML CG not to change the precedence rule of the charset parameter. If I create an I-D ignoring this request, I would be ignoring the I18N WG as well as strong opposition from HTTP people.

Since I intend to move the discussion to the ietf-xml-mime mailing list, I merely state some facts here.

> By drawing this
> distinction, are you saying that RFC 2376 does not apply to HTTP and
> only applies to email?

RFC 2376 quite carefully mentions both HTTP and real MIME.
> Well, if a US-based group recommends US-ASCII that should not really be
> a surprise ;-) However, while US-ASCII is compatible with UTF-8 it is
> not the same; and it is not compatible with UTF-16. So, it is a very odd
> choice for a default.

IETF I18N guideline documents (RFC 2277 and RFC 2130) recommend UTF-8 as the default. When the WWW was invented, 8859-1 was the default. US-ASCII is the intersection of the two.

Chris Lilley wrote:
>
> Yes, but I was not referring to Appendix F. I was referring to section
> 4.3.3 which is normative:
> http://www.w3.org/TR/REC-xml#charencoding
>
> Parsed entities which are stored in an encoding other than UTF-8
> or UTF-16 must begin with a text declaration containing an encoding
> declaration [...]

I agree that this is misleading. It only talks about the case where MIME headers are not available (I will send out a request for clarification).

Chris Lilley wrote:
> > RFC2376 supersedes it, as intended by the XML WG.
>
> Supersedes Appendix F, or supersedes the whole of the XML
> Recommendation? I assume you mean the former.

Yes.

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp
From paul at prescod.net Sun Apr 11 09:36:41 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:14 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
References: <005801be82f4$37aeb460$1b19da18@ne.mediaone.net>
Message-ID: <371022E0.123A9590@prescod.net>

Jonathan Borden wrote:
>
> Well, for example ODI's eXcelon is basically a DOM interface on an object
> db.

That is not my impression. My impression is that Excelon is a persistent cache for DOM trees built on an object database. If you have non-XML data objects it is completely your responsibility to figure out how to build XML or a DOM for those objects. The Excelon architectural diagram shows Excelon acting as a front end to a variety of data stores but my understanding is that the mapping from arbitrary objects and tables to DOM objects is done by integrators and programmers not automatically by Excelon.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

By lumping computers and televisions together, as if they exerted a single malign influence, pessimists have tried to argue that the electronic revolution spells the end of the sort of literate culture that began with Gutenberg's press. On several counts, that now seems the reverse of the truth.
http://www.economist.com/editorial/freeforall/19-12-98/index_xm0015.html
From ricko at allette.com.au Sun Apr 11 12:40:15 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:11:14 2004
Subject: IE5.0 does not conform to RFC2376
Message-ID: <005101be83ff$a5115bf0$18f96d8c@NT.JELLIFFE.COM.AU>

From: John Cowan
>Remember that I spoke of a client that *cannot* understand EBCDIC
>(because it has no conversion table, say), but perhaps has the 4-byte
>heuristic from Appendix F.

Just a side comment: In Appendix F of the XML spec, it is not encoding declarations which are non-normative, it is the particular (details of the) algorithm for interpreting them. It is non-normative because (apart from anything else) it does not attempt to completely cover the full range of possible IANA character sets which could be used with XML: it demonstrates how a processor uses the encoding declaration for some common character encodings.

In just the same way as the term "auto-detection" fails to indicate that it is not guesswork, "heuristic" may also suggest to some that the Appendix F algorithm involves guesswork. So I certainly prefer "algorithm" to "heuristic".

No matter what kind of system is used for representing differing character encodings (internal, external, defaulted, BOMed, etc.), there is the possibility that encodings will go astray.
I think this is a fault of operating systems: without a way to reliably indicate charset and so on for documents, webservers do not have anything to query in order to source the information. I guess that UNIX is the worst player here -- Macs have the resource fork and the business operating systems have registries or databases (AS400 can register text as 8-, 16-, or 32-bit; however, I don't think it has a way to specify character set beyond that).

Rick Jelliffe

From James.Anderson at mecomnet.de Sun Apr 11 17:39:46 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:11:14 2004
Subject: PITarget uniqueness
References: <001701be8151$451f3600$7ac2c6c3@uplanet.com> <01df01be822b$0fd3dca0$0300000a@cygnus.uwa.edu.au> <370E55C2.A38B4889@locke.ccil.org>
Message-ID: <3710C0D4.2288B477@mecomnet.de>

i've always had a nagging feeling of uncertainty in the past when i've read the statement which is cited below. when i asked, way back, for example, why one was not allowed to qualify notation names and pi targets, the answer was that it was expected that they would be bound to URI's.

this may well be true, but it doesn't matter. the two mechanisms behave differently. namespaces wrt element and attribute names are part of a mechanism of the form

  prefix X local-part -> prefix X URI -> URI X local-part = universal-name

the PI / notation mechanism, on the other hand, is of the form

  local-part -> local-part X URI -> URI

these are different things.
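The difference between the two mappings can be seen with a small sketch, assuming Python's standard xml.dom.minidom as the parser. The two documents, the "fmt" PI target and the example.org URIs are invented for illustration. Both documents use the same prefix and the same PI target; the element names still resolve to distinct universal names, while the PI targets remain the same bare string.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Two hypothetical documents that reuse the prefix "x" and the PI target "fmt".
DOC_A = '<?fmt page="a4"?><x:item xmlns:x="http://example.org/catalog"/>'
DOC_B = '<?fmt mode="terse"?><x:item xmlns:x="http://example.org/invoice"/>'


def names(xml_text):
    """Return the root's universal name (URI, local part) and the list of PI targets."""
    doc = parseString(xml_text)
    root = doc.documentElement
    targets = [
        node.target
        for node in doc.childNodes
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
    ]
    return (root.namespaceURI, root.localName), targets


elem_a, pis_a = names(DOC_A)
elem_b, pis_b = names(DOC_B)

print(elem_a, elem_b)   # same prefix, different universal names
print(pis_a, pis_b)     # same PI target in both, nothing to rebind
```

Combining the two documents is safe for the elements, because a prefix can be rebound late, but the parser offers no comparable indirection for the "fmt" target.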
they both concern URI's but behave differently wrt name collisions and the extent to which late binding can be used to avoid such. if, for example, two documents are to be combined within which encoded element identifiers and encoded PI targets conflict, late bound prefixes can "fix" the former, but you can't do anything about the latter.

one may object that the notations are "registered", and as such ambiguous, but that's not the problem. it's the pi targets themselves. if you can't map them late, you're effectively registering them, and the notation - in particular the uri in the notation - is redundant, and thus superfluous.

one may object that it's actually not a pi-target == local-part, but a pi-target == prefix equivalence which is at work here, and that dynamically generated pi-targets solve the problem, but that contradicts the claims made wrt element/attribute identifiers and the need for dynamic prefix/URI bindings.

John Cowan wrote:
>
> James Tauber wrote:
> > PI targets can be associated with a URI using the NOTATION mechanism. This
> > is so much like the namespace mechanism that I'd really like it if the two
> > were merged and the PI target made arbitrary.

From martind at netfolder.com Sun Apr 11 17:52:04 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:14 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
In-Reply-To: <371022E0.123A9590@prescod.net>
Message-ID: <000401be8431$4f42e2c0$9d4d8bcf@total.net>

Hi Paul,

Jonathan is right.
We have to go beyond the marketing discourse and play with the product to see that. When I downloaded the product and ran some tests, it became obvious that what is stored is objects (with their own internal structure) in a PSE pro database (judging from the dll included). The hierarchical object database has an XQL interface and a Java DOM interface. I tried to find the C++ DOM, but I am still searching for where it is documented and how to connect to it. However, the Java DOM is well documented. Thus, these internal objects provide a Java DOM interface. You also have an XML parser that translates from the serialized format to this internal hierarchical object database.

The whole structure looks like a directory service because there are several object schemas; however, on a second look, it seems that there are no provisions to add your own schemas (this is different from a directory service, which is extensible). Thus, it seems that only certain categories of objects could be stored.

About the other data stores, it is still vaporware. My guess is that, independently of any marketing positioning, there will be either
a) drivers that convert external data into DOM objects (without importing or storing the external data in the hierarchical object database), or
b) a facility to import and store the data in the hierarchical database.
The latter takes more resources but is faster, especially in the case of relational databases located on remote servers. My guess is that they will probably implement this solution.

So, yes, Jonathan is right. ODI marketing people positioned the tool as a cache. But objectively it is a hierarchical object database with a Java DOM interface (maybe C++ if I can find it), XQL query, and storage or folder objects used to transfer a serialized document into the hierarchical object database. However, the ODB seems limited compared to directory services, where a new schema could easily be added.
But, because it seems to use PSE pro, this facility could be added to Excelon if they decide to. But yes, it is a hierarchical object database with a DOM interface.

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Paul Prescod
Sent: Sunday, April 11, 1999 12:20 AM
To: 'XML Dev'
Subject: Re: Comments Appreciated on Magazine Based on XML/XSL

Jonathan Borden wrote:
>
> Well, for example ODI's eXcelon is basically a DOM interface on an object
> db.

That is not my impression. My impression is that Excelon is a persistent cache for DOM trees built on an object database. If you have non-XML data objects it is completely your responsibility to figure out how to build XML or a DOM for those objects. The Excelon architectural diagram shows Excelon acting as a front end to a variety of data stores but my understanding is that the mapping from arbitrary objects and tables to DOM objects is done by integrators and programmers not automatically by Excelon.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

By lumping computers and televisions together, as if they exerted a single malign influence, pessimists have tried to argue that the electronic revolution spells the end of the sort of literate culture that began with Gutenberg's press. On several counts, that now seems the reverse of the truth.
http://www.economist.com/editorial/freeforall/19-12-98/index_xm0015.html

From db at eng.sun.com Sun Apr 11 21:27:22 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:14 2004
Subject: IE5.0 does not conform to RFC2376
References: <199904110710.AA00258@archlute.apsdc.ksp.fujixerox.co.jp>
Message-ID: <3710F740.28577A5@eng.sun.com>

MURATA Makoto wrote:
>
> Chris Lilley wrote:
> >
> > Well, if a US-based group recommends US-ASCII that should not really be
> > a surprise ;-) However, while US-ASCII is compatible with UTF-8 it is
> > not the same; and it is not compatible with UTF-16. So, it is a very odd
> > choice for a default.
>
> IETF I18N guideline documents (RFC 2277 and RFC2130) recommend UTF-8 as the
> default. When the WWW was invented, 8859-1 was the default. US-ASCII is
> the intersection of the two.

US-ASCII is also what MIME has always specified, even before HTTP was specified to use its variant of MIME. Being an intersection is convenient; but it's also been the standard.

The IETF recommended no more (or less) than compatibility with the existing standards. I'd expect any standards body to do that, even one with the substantial international participation of the IETF!

- Dave

From db at eng.sun.com Sun Apr 11 22:05:08 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:14 2004
Subject: problem with IE5
References: <2F2DC5CE035DD1118C8E00805FFE354C0F36271B@RED-MSG-56>
Message-ID: <37110012.426FBAD8@eng.sun.com>

Looks to me like:

(a) IE5 uses a nonvalidating XML 1.0 parser (modulo bugs) for documents it tries to display;

(b) IE5 however REQUIRES conformance to the namespace spec, and thus rejects some well formed XML 1.0 documents, such as Richard's original;

(c) It also REQUIRES any "xmlns*" attributes found in a DTD to be #FIXED (which is good style) and so rejects documents which don't have #FIXED, yet conform to the namespace spec;

(d) It also REQUIRES a redundant declaration of such xmlns attributes on elements, even in cases where the XML 1.0 specification requires the #FIXED default to be provided from the processor (and the namespace spec requires it to be used, effectively 'inherited');

(e) It has some other conformance issue, where the namespace declaration on just the "test" element doesn't work. This might be related to the issue (d) above.

Chris -- is this basically accurate?

- Dave

Chris Lovett wrote:
>
> You need to declare the namespace
>
> ]>
> 123
>
> -----Original Message-----
> From: Richard Tobin [mailto:richard@cogsci.ed.ac.uk]
> Sent: Friday, April 09, 1999 6:59 AM
> To: xml-dev@ic.ac.uk
> Subject: problem with IE5
>
> Betty Harvey sent me mail about a document which was accepted by RXP
> but rejected by IE5.
> Here is a small example which shows the problem:
>
> ]>
>
> It produces this error in IE5:
>
> Reference to undeclared namespace prefix: 'foo'. Line 6, Position 1
>
> It doesn't make any difference if I put a namespace declaration for
> foo on the test element.
>
> It looks as if IE5 somehow expects namespace prefixes in the DTD to be
> declared. Can anyone explain this?
>
> -- Richard

From db at eng.sun.com Sun Apr 11 23:36:13 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:15 2004
Subject: Namespaces, prefixes, and URIs
References: <199904102111.RAA03224@locke.ccil.org>
Message-ID: <37111571.E57E9BB5@eng.sun.com>

John,

I hope you recognize that you've just moved a discussion from a private mailing list to a public XML-DEV one? I'd not mind if such transitions were done with permission, and carried the appropriate context.

[ The real issue discussed here has nothing to do with anything in the title any more -- it's just whether to treat the namespace spec as a revision to the XML spec. I'm saying the answer is "no"; namespace rules don't supersede XML 1.0 rules. The spec wasn't written or sold that way. Re an "XML 1.1" or "2.0", no comment. ]

John Cowan wrote:
>
> David Brownell scripsit:
>
> > I didn't see the words "deserve to lose" in the spec. What I
> > saw was a statement that if you use namespaces (TBS) your docs
> > might need to change. The XML 1.0 spec has not been revised
> > (incompatibly) to require conformance to the namespace spec.
>
> It says:

... in part ...

> # [A]uthors should not use the colon in XML names except as part of
> # name-space experiments[.]
>
> So there is a SHOULD NOT prohibition against bogus colons of the
> first kind (non-namespace).

Nope -- two reasons. Firstly, the fact that the "first kind" of document I described _conforms fully_ to the namespace spec!!
If you are truly disagreeing with the point I made there, you are then arguing that somehow a processor may conform to the XML 1.0 and DOM specs, and yet refuse to process namespace-conformant docs. (Perhaps because it didn't add special support for namespaces?) I hope you aren't saying that.

Second point: since the sentence you excerpted begins "in practice", it's clearly not normative. This is emphasized by the fact that it doesn't use normative language, like "MUST NOT".

In fact, the second half of the sentence you excerpted (the "[.]") repeats what the XML spec says elsewhere, normatively. Colons are just name characters. An XML 1.0 processor must accept names with colons in them (even ":this:is:an:XML_1.0:name"). No prohibition; to the contrary, if you don't accept names like that, you don't conform to the XML spec.

> Bogus colons of the second kind (namespace experiments) may
> indeed have to be changed, but why must the DOM support non-W3
> namespace experiments? The uses of colons in namespaces are
> now prescribed, and other uses are proscribed.

Re that last comment, where would they be proscribed? Clearly not in the XML 1.0 spec. Likewise, not in the namespace spec. And not in the DOM spec.

Neither DOM nor XML have been revised (incompatibly) to mandate support only of documents that conform to the namespace spec. I've not heard anyone propose that it be done, either.

> > In general, my background makes me believe that incompatible
> > changes to foundational specs (e.g. XML 1.0) are, as a rule,
> > pure unadulterated evil.
> >
> > One never quite knows what could get broken by such changes ...
> > though it's usually "someone else's software" not your own.
> > Now there's a dynamic to think about.
>
> You're right in general. But the XML Rec goes to some pains to say
> that uses of : are either deprecated or may have to be changed,
> putting people on notice pending the namespace Rec.
It says that if you're using namespaces, documents may need to change -- no more. The language used in the XML and namespaces specs does NOT indicate that namespaces are a requirement.

> > Didn't namespaces break architectural forms compatibility?
> > Minimally the PI doesn't conform.
>
> The alternative form works fine;
> I'll get the XAF maintainers to add support for it.
> (Bill?)

And modify the docs, etc. Sounds good.

> > And aren't there SGML applications using colons as name
> > characters? Perhaps permitting such SGML systems to
> > migrate to XML isn't "really important" any more.
>
> Are colons NAMESTRT characters in the reference concrete syntax?

Hmm, going by James Clark's writeup ref'd by the XML spec, no.

Now I'm curious how an ISO SGML standard ended up using colons in a way that (to this non-SGML person) seems like it violates the SGML standard! Perhaps the "reference" syntax isn't the only "really important" SGML syntax around.

- Dave

From paul at prescod.net Mon Apr 12 00:12:55 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:15 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
References: <000401be8431$4f42e2c0$9d4d8bcf@total.net>
Message-ID: <37111B95.24D92B4E@prescod.net>

Didier PH Martin wrote:
>
> Hi Paul,
>
> Jonathan is right. We have to go beyond the marketing discourse and play
> with the product to see that. When I downloaded the product and made some
> test. It became obvious that what is stored is objects (their own internal
> structure) stored in a PSE pro database (from the dll included).
I think we've lost too much context here and that is probably my fault. We are probably also talking at cross purposes. Mark Birbeck said:

> The hierarchical database is just good old SQL Server at root, with a
> few layers on top, so there is no tight integration between a DOM and
> the database.

and

> On the other hand, if you're implying that we just treat the database as
> one big DOM and transform the nodes we want out, then I have to ask, has
> anyone done that? Are there actually any databases out there that hide
> behind a DOM interface and present themselves as one big tree of nodes?

My impression of Excelon is that it does not "hide behind a DOM interface" in the sense that I cannot pump in arbitrary objects conforming to arbitrary IDL or DDL schemas and expect them to present a "DOM interface" to the world. Excelon implements a DOM interface to XML documents, not to arbitrary data objects. That doesn't make it a bad product, but I don't think it is the product Mark is describing. I imagine that the (imaginary) product that Mark is describing would allow you to specify your objects in IDL, manipulate them as ordinary object/method/property Java or C++ objects and get a DOM interface to them "for free" when you want it. I don't think that that product exists.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

By lumping computers and televisions together, as if they exerted a single malign influence, pessimists have tried to argue that the electronic revolution spells the end of the sort of literate culture that began with Gutenberg's press. On several counts, that now seems the reverse of the truth.
http://www.economist.com/editorial/freeforall/19-12-98/index_xm0015.html

From harvey at eccnet.eccnet.com Mon Apr 12 00:44:03 1999
From: harvey at eccnet.eccnet.com (Betty L. Harvey)
Date: Mon Jun 7 17:11:15 2004
Subject: problem with IE5
In-Reply-To: <37110012.426FBAD8@eng.sun.com>
Message-ID:

This was my original test file:

]>
This is a test

It complains that the HTML namespace has not been declared. The namespace specification doesn't say anything about how the namespace should be declared within the DTD.

It seems to me that requiring namespaces is going to cause havoc in implementation and conformance.

Betty

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Betty Harvey | Phone: 301-540-8251 FAX: 4268
Electronic Commerce Connection, Inc. |
13017 Wisteria Drive, P.O. Box 333 |
Germantown, Md.
20874 | harvey@eccnet.com | Washington,DC SGML/XML Users Grp
URL: http://www.eccnet.com | http://www.eccnet.com/sgmlug/
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/

On Sun, 11 Apr 1999, David Brownell wrote:

> Looks to me like:
>
> (a) IE5 uses a nonvalidating XML 1.0 parser (modulo bugs)
>     for documents it tries to display;
>
> (b) IE5 however REQUIRES conformance to the namespace spec,
>     and thus rejects some well formed XML 1.0 documents,
>     such as Richard's original;
>
> (c) It also REQUIRES any "xmlns*" attributes found in a DTD
>     to be #FIXED (which is good style) and so rejects documents
>     which don't have #FIXED, yet conform to the namespace spec;
>
> (d) It also REQUIRES a redundant declaration of such xmlns
>     attributes on elements, even in cases where the XML 1.0
>     specification requires the #FIXED default to be provided
>     from the processor (and the namespace spec requires it
>     to be used, effectively 'inherited');
>
> (e) It has some other conformance issue, where the namespace
>     declaration on just the "test" element doesn't work. This
>     might be related to the issue (d) above.
>
> Chris -- is this basically accurate?
>
> - Dave
>
>
> Chris Lovett wrote:
> >
> > You need to declare the namespace
> >
> > ]>
> > 123
> >
> > -----Original Message-----
> > From: Richard Tobin [mailto:richard@cogsci.ed.ac.uk]
> > Sent: Friday, April 09, 1999 6:59 AM
> > To: xml-dev@ic.ac.uk
> > Subject: problem with IE5
> >
> > Betty Harvey sent me mail about a document which was accepted by RXP
> > but rejected by IE5. Here is a small example which shows the problem:
> >
> > ]>
> >
> > It produces this error in IE5:
> >
> > Reference to undeclared namespace prefix: 'foo'. Line 6, Position 1
> >
> > It doesn't make any difference if I put a namespace declaration for
> > foo on the test element.
> >
> > It looks as if IE5 somehow expects namespace prefixes in the DTD to be
> > declared. Can anyone explain this?
> >
> > -- Richard

From jborden at mediaone.net Mon Apr 12 04:42:27 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:11:15 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
In-Reply-To: <37111B95.24D92B4E@prescod.net>
Message-ID: <000001be848d$19371cd0$1b19da18@ne.mediaone.net>

Paul Prescod wrote:
>
> My impression of Excelon is that it does not "hide behind a DOM interface"
> in the sense that I cannot pump in arbitrary objects conforming to
> arbitrary IDL or DDL schemas and expect them to present a "DOM interface"
> to the world. Excelon implements a DOM interface to XML documents, not to
> arbitrary data objects. That doesn't make it a bad product, but I don't
> think it is the product Mark is describing. I imagine that the (imaginary)
> product that Mark is describing would allow you to specify your objects in
> IDL, manipulate them as ordinary object/method/property Java or C++
> objects and get a DOM interface to them "for free" when you want it. I
> don't think that that product exists.
>

Actually, if you take my XMOP project which serializes COM/IDL described and Java objects into a DOM interface, and bolt it onto eXcelon, this is pretty much exactly what this would do. XMOP uses either Java reflection or COM typelibraries (which are compiled IDL and are close but not quite full fidelity to MIDL itself), and serializes the object into either a DOM or an XML stream.

Jonathan Borden
http://jabr.ne.mediaone.net

From murata at apsdc.ksp.fujixerox.co.jp Mon Apr 12 10:41:18 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:11:15 2004
Subject: The ietf-xml-mime mailing list
Message-ID: <199904120838.AA00274@archlute.apsdc.ksp.fujixerox.co.jp>

The ietf-xml-mime mailing list is for discussing MIME types for XML. Revisions of RFC 2376 are discussed in this ML.

For subscription details, see http://www.imc.org/ietf-xml-mime/

Cheers,

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

From ldodds at ingenta.com Mon Apr 12 11:50:42 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun 7 17:11:15 2004
Subject: DOM & Entities
Message-ID: <001401be84c9$f18e99a0$ab20268a@pc-lrd.bath.ac.uk>

Hi,

I'd like some clarification for how DOM handles entities. The spec suggests that entities are resolved into their text equivalent ("...are replaced by the single character that makes up the entities equivalent...").

I'm curious as to how this is handled with entities such as those used in mathematical equations, or accented characters, or other special characters that aren't strictly 'plain text'?
I'm writing an XML processing application which reads in an XML document, performs some processing (based on another XML 'rules' document) and then produces a final XML document.

Ideally I'd like the entities retained from start to finish, so that I can be sure that they survive the transformation unchanged. But I'm unclear how I can ensure this? Will I have to wrap all entity references in CDATA sections before parsing?

Incidentally I plan on using the IBM xml4j parser and its DOM implementation.

Tips, comments, clarifications welcomed.

L.

==================================================================
"Never Do With More, What Can Be Achieved With Less" ---William of Occam
==================================================================
Leigh Dodds Eml: ldodds@ingenta.com
ingenta ltd Tel: +44 1225 826619
BUCS Building, University of Bath Fax: +44 1225 826283
==================================================================

From rbourret at ito.tu-darmstadt.de Mon Apr 12 13:01:43 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:11:15 2004
Subject: W3C Web servers
Message-ID: <01BE84E4.50E44660@grappa.ito.tu-darmstadt.de>

Is it just me, or are other people having trouble getting through to the W3C Web servers?

-- Ron Bourret

From claudio.vernacotola at crpht.lu Mon Apr 12 16:12:35 1999
From: claudio.vernacotola at crpht.lu (claudio.vernacotola@crpht.lu)
Date: Mon Jun 7 17:11:15 2004
Subject: W3C Web servers
Message-ID:

> Is it just me, or are other people having trouble getting through to the W3C Web servers?

I'm also experiencing problems with W3C Web servers. Access is extremely slow.

Regards,
Claudio.

From richard at cogsci.ed.ac.uk Mon Apr 12 16:19:10 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:11:15 2004
Subject: problem with IE5
In-Reply-To: David Brownell's message of Sun, 11 Apr 1999 13:03:30 -0700
Message-ID: <21172.199904121418@doyle.cogsci.ed.ac.uk>

> Looks to me like:

> (b) IE5 however REQUIRES conformance to the namespace spec,
>     and thus rejects some well formed XML 1.0 documents,
>     such as Richard's original;

In what way does my document (below) not conform to the namespace recommendation? It contains no qualified names in the body, and prefixes in the DTD are not required (and not able) to be declared.

> > ]>
> >

-- Richard

From rbourret at ito.tu-darmstadt.de Mon Apr 12 16:24:20 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:11:15 2004
Subject: W3C Web servers
Message-ID: <01BE8500.C1E66A70@grappa.ito.tu-darmstadt.de>

claudio.vernacotola wrote:

> I'm also experiencing problems with W3C Web servers. Access is extremely
> slow.

Just to let everybody know, a number of people have replied privately that they have had slow or no access since Friday. Presumably, the W3C is working on it...

-- Ron Bourret

From alank at iol.ie Mon Apr 12 16:28:02 1999
From: alank at iol.ie (Alan Kennedy)
Date: Mon Jun 7 17:11:15 2004
Subject: Standalone documents as external parsed entities.
Message-ID: <37120556.597B67FC@iol.ie>

Hi,

From my reading of the XML 1.0 spec, it is not possible to have a standalone document, with its own DTD, "included" in another document as an external parsed entity.

Am I wrong?

My situation is that I have a group of standalone documents, each one representing a member of a scientific organization.
As well as producing a web page for each of those members, from the standalone documents, I want also to produce a "directory" on a separate page, which lists each of the members, in alphabetical order, by country, etc.

The most obvious way to do this is to group the standalone documents in a "list" document, which contains external entity references which refer to each and every one of the standalone documents, and use XSL to order/transform that "list" document.

Not being able to have a DTD on external parsed entities makes it difficult to deal with character entities in those parsed entities, for one thing, not to mention attribute handling.

Am I missing something?

Alan Kennedy.

From jtauber at jtauber.com Mon Apr 12 17:25:12 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:11:15 2004
Subject: DOM & Entities
References: <001401be84c9$f18e99a0$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <004201be84f8$456d4f20$0300000a@cygnus.uwa.edu.au>

> I'm curious as to how this is handled with entities such as those
> used in mathematical equations, or accented characters, or
> other special characters that aren't strictly 'plain text'?

Well strictly they *are* plain text. That's the whole point of XML characters being Unicode characters. Accented Latin characters, Japanese, maths symbols are just as much plain text as a capital A.

> I'm writing an XML processing application which reads in an
> XML document, performs some processing (based on another
> XML 'rules' document) and then produces a final XML document.
> Ideally I'd like the entities retained from start to finish, so
> that I can be sure that they survive the transformation unchanged.
> But I'm unclear how I can ensure this? Will I have to wrap all
> entity references in CDATA sections before parsing?

A CDATA wrapper wouldn't work because *after* your processing they'd still be in a CDATA section or would be things like é

If you absolutely want to have entity references at the end of the day, your safest bet would be to post process the character data and replace any characters you don't want literally with an equivalent.

Character references might be an even better solution and certainly this would make the post processing easier. Just run over the text replacing (say) any character > 128 with &#...;

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
XML Standards and Product Coordinator
HarvestRoad Communications / www.harvestroad.com.au
Full-day XML Tutorial @ WWW8 : http://www8.org/
Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net

From cowan at locke.ccil.org Mon Apr 12 18:47:24 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:15 2004
Subject: Standalone documents as external parsed entities.
References: <37120556.597B67FC@iol.ie>
Message-ID: <3712237C.18BE9009@locke.ccil.org>

Alan Kennedy wrote:

> From my reading of the XML 1.0 spec, it is not possible to have a
> standalone document, with its own DTD, "included" in another document
> as an external parsed entity.
>
> Am I wrong?

You are right.
The nearest you can come in a formal sense is to declare your standalone document as an *unparsed* entity. Note that "unparsed" != "unparsable". Then you need an application framework capable of recursively parsing unparsed entities using XML notation, which AFAIK does not yet exist.

As for your specific problem, if your documents conform to random DTDs, you have much worse problems than the rules about external entities. What is a member name? What is a country? Etc.

What is the format of your original documents? Can you make them conform to a single DTD, like say XHTML 1.0?

XAF (http://www.megginson.com/XAF) may be helpful.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From jtauber at jtauber.com Mon Apr 12 18:52:36 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:11:15 2004
Subject: Standalone documents as external parsed entities.
References: <37120556.597B67FC@iol.ie>
Message-ID: <00ce01be8504$3cbdfd00$0300000a@cygnus.uwa.edu.au>

> From my reading of the XML 1.0 spec, it is not possible to have a standalone document, with its own DTD, "included" in another document as an external parsed entity.
>
> Am I wrong?

There are ways around it. But for starters, be careful with the term "standalone" as it means something quite specific in XML (and something different from what I'm guessing you mean by it).
If you don't care about the DTD, then you can have:

list.xml:
]>
&doc1;
&doc2;

doc1.xml:
...

doc2.xml:
...

In this case doc1.xml and doc2.xml can be treated both as external parsed entities and as document entities.

If you *do* want the individual external entities to have a DTD, you'll have to use a wrapper document.

list.xml:
]>
&doc1.content;
&doc2.content;

doc1.xml:
]>
&doc.content;

doc2.xml:
]>
&doc.content;

doc1content.xml:

doc2content.xml:

Note that because you can't put the document element in an external parsed entity, it is included in the doc wrappers and in the list.

Hope this helps.

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
XML Standards and Product Coordinator
HarvestRoad Communications / www.harvestroad.com.au
Full-day XML Tutorial @ WWW8 : http://www8.org/
Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net

From gepeto at fenix.puccamp.br Mon Apr 12 19:04:06 1999
From: gepeto at fenix.puccamp.br (Fernando A. Teixeira)
Date: Mon Jun 7 17:11:15 2004
Subject: xml linking
Message-ID:

I'm trying to understand how the extended links work. Could anyone help me? How can I make an N x N link?
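
James Tauber's first arrangement above (a list document that pulls in doc1.xml and doc2.xml as external parsed entities, with no DTD of their own) can be sketched roughly as follows. The element names (list, member, name) and the file contents are invented for illustration, since the markup in the archived examples did not survive; the entities are resolved from an in-memory dictionary rather than from disk, using Python's expat binding.

```python
import xml.parsers.expat

# Hypothetical entity contents; element names are assumptions, not from the thread.
FILES = {
    "doc1.xml": "<member><name>Alice</name></member>",
    "doc2.xml": "<member><name>Bob</name></member>",
}

# Hypothetical list document: declares each member document as an
# external parsed entity and references it inside the document element.
LIST_XML = """<?xml version="1.0"?>
<!DOCTYPE list [
  <!ENTITY doc1 SYSTEM "doc1.xml">
  <!ENTITY doc2 SYSTEM "doc2.xml">
]>
<list>&doc1;&doc2;</list>"""


def element_names(text, files):
    """Parse `text`, expanding external entities from `files`.

    Returns the element names in document order.
    """
    seen = []
    parser = xml.parsers.expat.ParserCreate()

    def start_element(name, attrs):
        seen.append(name)

    def external_entity(context, base, system_id, public_id):
        # A sub-parser created from the parent inherits its handlers,
        # so elements inside the entity are reported the same way.
        sub = parser.ExternalEntityParserCreate(context)
        sub.Parse(files[system_id], True)
        return 1  # non-zero: the entity was handled

    parser.StartElementHandler = start_element
    parser.ExternalEntityRefHandler = external_entity
    parser.Parse(text, True)
    return seen


if __name__ == "__main__":
    print(element_names(LIST_XML, FILES))
```

Note that neither doc1.xml nor doc2.xml carries a DOCTYPE here; that is the restriction the thread is about, and it is why the second arrangement needs wrapper documents.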
From wunder at infoseek.com Mon Apr 12 19:16:25 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:11:16 2004
Subject: Megginson and XMLNews
In-Reply-To: <14095.52616.339600.100498@localhost.localdomain>
References: <370FB644.6F92D093@w3.org> <3.0.5.32.19990407163917.00bd2a60@corp> <370FB644.6F92D093@w3.org>
Message-ID: <3.0.5.32.19990412100407.00bfd4a0@corp>

At 06:23 PM 4/10/99 -0400, David Megginson wrote:
>Chris Lilley writes:
>
> > Walter Underwood wrote:
> >
> > > Why instead of an xml:lang attribute?
>
>(And many other very good questions.) I will look into the mismatch in the ISO 8601 profile; otherwise, however, XMLNews-Story is designed to be subset-compatible with the XML version of NITF.

The W3C Note is here: http://www.w3.org/TR/NOTE-datetime

>Many people -- mostly leading technical specialists in the news industry -- have put a lot of careful work into NITF (née UTF) over coming on a decade now, and we didn't see any good reason to split the market by introducing a competing format; ...

Understandable. As I said about RDF a couple of days back, two formats are almost never better than one.

As a search engine writer, my interest is in common conventions across different formats. So far, I've only found one thing that is common to almost every DTD -- the first tag is the title (the major exception is the NAA "adex" DTD for classifieds, where <ad-slug> is the closest thing to a title).

HTML has some conventions for search-engine metadata (title, description, keywords, robots).
With XML, the administrator needs to map each DTD to these elements -- more work, more chance for error. And if the data is not there, not in a parsable format, or in a separate metadata file, the search engine is handicapped.

I expect to ship our next release pre-configured for NITF, but I sure would like to see some common practice beyond <title>. Mostly, our customers would appreciate it, and the people doing searches would get better results.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946

From tms at ansa.co.uk Mon Apr 12 19:19:49 1999
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun 7 17:11:16 2004
Subject: Parsing unparsed external entities (was: Standalone documents ...)
In-Reply-To: John Cowan's message of "Mon, 12 Apr 1999 12:46:52 -0400"
References: <37120556.597B67FC@iol.ie> <3712237C.18BE9009@locke.ccil.org>
Message-ID: <usoa5u5ab.fsf_-_@lanber.ansa.co.uk>

John> John Cowan <URL:mailto:cowan@locke.ccil.org>

0> In article <3712237C.18BE9009@locke.ccil.org>,
0> John wrote:

John> Then you need an application framework capable of recursively
John> parsing unparsed entities using XML notation, which AFAIK does
John> not yet exist.

This is what (sgml-parse) is in DSSSL for. Beware, though, that if the XML concrete syntax is not your default[1], you need to make sure the system identifier includes it.
In Jade, this is done with something like (string-append "<osfile>xml.decl" filename), where filename is the system identifier of the external entity - I don't trust myself to get the syntax right for finding that! (but it can be done)

[1] i.e. if you write your command lines as
    [tool] xml.decl mydoc.xml

If you're not tied to XML, you might want to use SGML and SUBDOC instead (but I'm not sure how that's supported in the tools).

From jelks at jelks.nu Mon Apr 12 20:01:58 1999
From: jelks at jelks.nu (Jelks Cabaniss)
Date: Mon Jun 7 17:11:16 2004
Subject: SUMMARY: XML Validation Issues (was: several threads)
In-Reply-To: <370A9AE7.BCE60202@w3.org>
Message-ID: <NBBBICMNIPCICMKJECCBKEMMCKAA.jelks@jelks.nu>

Chris Lilley wrote:

> > > My feeling is that there are three classes of implementation, that should all have names:
> > >
> > > minimal well-formed - never tries to follow external entities
> > > full well-formed - always tries to follow external entities
> > > full validating - always tries to follow external entities and validates

> > Agreed. ...

> > > and it should be possible to always derive what class of implementation a particular instance requires.

> You don't comment on that sentence, so does it mean you agree?

Yes. But see below.

> > If there is to be a way to *force* validity by specifying it in the document instance, the only way I can see is by amending the spec with something like (as I believe you yourself suggested in passing) valid="yes" in the declaration.
>
> Right. With a default of "no", of course.
> So, this would make the assertion that the document was valid and that assertions could be tested and perhaps refuted, by a validating parser. In the case of valid="no" or perhaps, valid="wf", a validating parser would do what - declare the document invalid?

Agree, yes, it's invalid (so why check it)?

> Automatically use a non-validating mode, even if it was normally validating?

> Next question, should there be (in other words, is this something that should be in the document instance).

Yes. But how to do it? If XML 1.1 has a "valid='yes'|'no'" in the declaration, XML 1.1 documents may break when running under an XML 1.0 parser, since the XML 1.0 BNF clearly states what can and can't be in the declaration.

Maybe a PI could be formalized similar to the way the stylesheet linking is being done:

    <?xml-assert implementation="valid"?>

(could also be "minimal" or "full" for the well-formed only options you mentioned).

/Jelks

From Mark.Birbeck at iedigital.net Mon Apr 12 22:13:59 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:11:16 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054B25@SOHOS002>

Paul wrote:
> Excelon implements a DOM interface to XML documents, not to arbitrary data objects. That doesn't make it a bad product, but I don't think it is the product Mark is describing.
> I imagine that the (imaginary) product that Mark is describing would allow you to specify your objects in IDL, manipulate them as ordinary object/method/property Java or C++ objects and get a DOM interface to them "for free" when you want it. I don't think that that product exists.

That's true, Paul, thanks.

I was also imagining that:

- when I want the last node from a tree that contains 100,000 nodes that the whole 'document' would not be read into memory.
- that I could access the tree as if it was a complete DOM with all the caching and so on being done for me.
- that if I perform an XSL-type query I will get the nodes I want, regardless of whether they are in memory or not.

I have implemented a very crude version of this. I use the IE5 DOM and with this I retrieve documents from our database using URLs that are a scaled down version of XQL (I can't say I like XPointer). For example:

    http://[server]/documents/article[@author='Mark']/article.xml

would retrieve all 'article' objects with an author attribute of 'Mark', that are children of a node of type 'documents'. This would then be returned to the caller as an XML document, but with a stylesheet PI pointing to 'stylesheets/article.xsl'. (Replacing .xml with .htm would yield the same results but the XML and XSL would be combined for you on the server.)

The problem with this is that I have to convert this request to a query on the objects in the hierarchical database in order to populate my DOM. Of course, once in the DOM I can export it as XML or transform it if necessary, so the database does look from the outside like it is one great big XML document.

But although I am quite happy with this so far, I can see that you would have to code this up for every type of database, and really it should be a job for the DOM. It really needs a layer like the layer above the database-specific layers in ODBC; it would sit just below the DOM.
This layer would obviously need to understand schemas, so it wouldn't be a trivial task to implement.

Anyway, my original question was 'is anyone doing anything like this?' and I think the answer is 'nowhere near yet!'

Regards,

Mark

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From martind at netfolder.com Mon Apr 12 22:49:11 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:16 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
In-Reply-To: <A26F84C9D8EDD111A102006097C4CD0D054B25@SOHOS002>
Message-ID: <000501be8525$a9605720$311fcdcd@total.net>

Hi Mark,

<Comment>
I have implemented a very crude version of this. I use the IE5 DOM and with this I retrieve documents from our database using URLs that are a scaled down version of XQL (I can't say I like XPointer). For example:

    http://[server]/documents/article[@author='Mark']/article.xml

would retrieve all 'article' objects with an author attribute of 'Mark', that are children of a node of type 'documents'. This would then be returned to the caller as an XML document, but with a stylesheet PI pointing to 'stylesheets/article.xsl'. (Replacing .xml with .htm would yield the same results but the XML and XSL would be combined for you on the server.)

The problem with this is that I have to convert this request to a query on the objects in the hierarchical database in order to populate my DOM.
Of course, once in the DOM I can export it as XML or transform it if necessary, so the database does look from the outside like it is one great big XML document.

But although I am quite happy with this so far, I can see that you would have to code this up for every type of database, and really it should be a job for the DOM. It really needs a layer like the layer above the database-specific layers in ODBC; it would sit just below the DOM. This layer would obviously need to understand schemas, so it wouldn't be a trivial task to implement.

Anyway, my original question was 'is anyone doing anything like this?' and I think the answer is 'nowhere near yet!'
</Comment>

<reply>
This is an interesting request. Do you want us to explore a bit further your need?

a) if you got a DOM interface on a RDB, would this be useful?
b) if you would have a ODB with a DOM interface and that the ODB just maintain some virtual memory pages in memory. (i.e. the whole DOM is not in memory at once, only some pages are) Would this be useful?

Thanks Mark for your collaboration

Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com
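
The URL convention Mark describes above (a path such as /documents/article[@author='Mark']/article.xml standing in for a query) can be approximated with a small sketch. This is not Mark's implementation: the in-memory store, the `select` helper, and the title elements are assumptions made for illustration, and the query is evaluated with the limited XPath subset in Python's ElementTree rather than with XQL.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the database: one 'documents' node with articles.
STORE = ET.fromstring(
    "<documents>"
    "<article author='Mark'><title>First</title></article>"
    "<article author='Paul'><title>Second</title></article>"
    "<article author='Mark'><title>Third</title></article>"
    "</documents>"
)


def select(store, url_path):
    """Treat the URL path as <root type>/<query>/<output name>.

    Naive on purpose: it splits on '/', so a predicate that itself
    contains '/' would not survive.
    """
    parts = url_path.strip("/").split("/")
    root_type = parts[0]
    query = "/".join(parts[1:-1])  # drop the root type and the trailing file name
    if store.tag != root_type:
        return []
    if not query:
        return [store]
    return store.findall(query)


if __name__ == "__main__":
    hits = select(STORE, "/documents/article[@author='Mark']/article.xml")
    print([a.findtext("title") for a in hits])
```

The sketch loads the whole tree first, so it sidesteps exactly the part Mark calls hard: turning the path into a query against the underlying store so that only the matching nodes are ever materialised.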
From paul at prescod.net Tue Apr 13 00:13:24 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:16 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
References: <000001be848d$19371cd0$1b19da18@ne.mediaone.net>
Message-ID: <37117285.689AFEF1@prescod.net>

Jonathan Borden wrote:
>
> Actually, if you take my XMOP project which serializes COM/IDL described and Java objects into a DOM interface, and bolt it onto eXcelon, this is pretty much exactly what this would do. XMOP uses either Java reflection or COM typelibraries (which are compiled IDL and are close but not quite full fidelity to MIDL itself), and serializes the object into either a DOM or an XML stream.

I think that all such approaches fall apart quickly when you want the DOM to be writable. And if the DOM is *not* writable then I see it as only an optimization for generating XML and then building a DOM for that XML. And even so there are big efficiency issues. If the system doesn't do an XQL->SQL conversion then searching for anything will be hideously slow because you won't be using the native query optimizer.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

By lumping computers and televisions together, as if they exerted a single malign influence, pessimists have tried to argue that the electronic revolution spells the end of the sort of literate culture that began with Gutenberg's press. On several counts, that now seems the reverse of the truth.
http://www.economist.com/editorial/freeforall/19-12-98/index_xm0015.html

From gtn at eps.inso.com Tue Apr 13 00:44:53 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun 7 17:11:16 2004
Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376)
In-Reply-To: <370F9F35.2A0E63E0@w3.org>
Message-ID: <001b01be8536$66df9710$0100007f@eps.inso.com>

> Can you post some URIs? Are you willing to share them? I would trust your servlets to be doing the right thing.

I can probably release these. I'll check. I also have a few other bits of code that I'm trying to release.

> > I still dislike the encoding information in the PI....
>
> (it isn't, in theory, a PI although it looks exactly like one) I am of quite the opposite point of view - I think that it finally gives authors the ability to correctly label their documents.

Right. My opinion though is that it does the right thing in the wrong place.

> The same is true of any label. The encoding declaration in the XML declaration at least always travels with the document, which is always handy for ensuring metadata doesn't get lost.

Right. The problem is really one of *metadata* not *data*, that is precisely my point. The *.mim proposal provided an *explicit* separation of the two. In retrospect, I must say that *.mim is also woefully insufficient... but that we still need, in some form, a way of encoding, and transporting, in an interoperable manner, the information (metadata) that is needed by *processors* of the data.

> But if you are transcoding, you have to fix it anyway - so?
Right, but

a) You have to fix it by parsing a piece of arbitrary syntax, which proxies etc. will most likely not do, for performance reasons.

b) The XML declaration is part of the *document* as specified by the XML 1.0 recommendation, changing the XML declaration changes the *document*, which is a Bad Thing(tm).

From pgrosso at arbortext.com Tue Apr 13 01:01:21 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun 7 17:11:16 2004
Subject: Last Call for the XML Fragment Interchange Rec
Message-ID: <3.0.32.19990412180037.00f308e4@pophost.arbortext.com>

* Document to review: http://www.w3.org/TR/WD-xml-fragment-19990412
* Last call ends: 1999 April 23
* Send comments to: mailto:www-xml-fragment-comments@w3.org

The XML Fragment WG [1] has just published its Final Working Draft of the XML Fragment Interchange Recommendation [2]. A Last Call period starts now and runs until April 23. Its abstract reads:

The XML standard supports logical documents composed of possibly several entities. It may be desirable to view or edit one or more of the entities or parts of entities while having no interest, need, or ability to view or edit the entire document. The problem, then, is how to provide to a recipient of such a fragment the appropriate information about the context that fragment had in the larger document that is not available to the recipient.
The XML Fragment WG is chartered with defining a way to send fragments of an XML document--regardless of whether the fragments are predetermined entities or not--without having to send all of the containing document up to the part in question. This document defines Version 1.0 of the [eventual] W3C Recommendation that addresses this issue.

Comments are solicited from all W3C WGs and the public at this time. As indicated in the document, comments should be sent to [3], (a publicly archived list). Comments received by 1999 April 23 will be considered for the Proposed Recommendation version.

All comments from W3C working groups and from recognized liaison groups will be considered in light of the XML Fragment Requirements Document [4]. In particular, basic scope issues and design decisions will be reconsidered only when grave and previously unrecognized flaws are uncovered. Requests for enhancement will typically be deferred for later versions of the specification under development unless the enhancement is uncontroversial and its incorporation would not materially delay production of the specification.

Paul Grosso XML Fragment WG Chair
Daniel Veillard W3C Staff Contact

[1] http://www.w3.org/XML/Activity.html#fragment-wg
[2] http://www.w3.org/TR/WD-xml-fragment-19990412
[3] mailto:www-xml-fragment-comments@w3.org
[4] http://www.w3.org/TR/NOTE-XML-FRAG-REQ
From db at eng.sun.com Tue Apr 13 02:57:09 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun 7 17:11:16 2004
Subject: problem with IE5
References: <21172.199904121418@doyle.cogsci.ed.ac.uk>
Message-ID: <371295E9.B36ABC78@Eng.Sun.COM>

Richard Tobin wrote:
>
> > Looks to me like:
>
> > (b) IE5 however REQUIRES conformance to the namespace spec, and thus rejects some well formed XML 1.0 documents, such as Richard's original;
>
> In what way does my document (below) not conform to the namespace recommendation?

My goof ... some other examples I tried get rejected however, including ":some:long:xml_1.0:names". I'm seeking an accurate description of the syntax that IE5 supports, and "XML 1.0" doesn't seem to be it ... neither does "XML 1.0 but requiring XML namespaces".

> It contains no qualified names in the body, and prefixes in the DTD are not required (and not able) to be declared.

Prefixes in the DTD can be declared, but in this case they weren't ... more to the point, they didn't need to be!!

> > > <?xml version="1.0"?>
> > > <!DOCTYPE test [
> > > <!ELEMENT test ANY>
> > > <!ELEMENT foo:bar ANY>
> > > ]>
> > > <test/>

- Dave
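
The distinction Dave is probing (names that are legal in XML 1.0 but are not legal qualified names under "Namespaces in XML") can be sketched as below. This is an ASCII-only approximation: the real Name and NCName productions also admit many non-ASCII letters, and the function names are made up for this example. It says nothing about what IE5 itself accepts.

```python
import re

# ASCII-only approximation of the XML 1.0 Name production:
# may start with a letter, '_' or ':', and may contain ':' anywhere.
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*")

# ASCII-only approximation of NCName from "Namespaces in XML": a Name with no colon.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_xml_name(s):
    """True if s looks like an XML 1.0 Name (ASCII subset)."""
    return NAME.fullmatch(s) is not None


def is_qname(s):
    """True if s is NCName or NCName ':' NCName (ASCII subset)."""
    prefix, sep, local = s.partition(":")
    if not sep:
        return NCNAME.fullmatch(s) is not None
    return (NCNAME.fullmatch(prefix) is not None
            and NCNAME.fullmatch(local) is not None)


if __name__ == "__main__":
    for candidate in ("test", "foo:bar", ":some:long:xml_1.0:names"):
        print(candidate, is_xml_name(candidate), is_qname(candidate))
```

Under these approximations "foo:bar" passes both checks, while ":some:long:xml_1.0:names" is a legal XML 1.0 Name but not a qualified name, which is consistent with a namespace-aware parser rejecting it.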
From alank at iol.ie Tue Apr 13 03:02:59 1999
From: alank at iol.ie (Alan Kennedy)
Date: Mon Jun 7 17:11:16 2004
Subject: Standalone documents as external parsed entities.
References: <37120556.597B67FC@iol.ie> <00ce01be8504$3cbdfd00$0300000a@cygnus.uwa.edu.au>
Message-ID: <371299F0.E53840F2@iol.ie>

James Tauber wrote:
>
> There are ways around it. But for starters, be careful with the term "standalone" as it means something quite specific in XML (and something different from what I'm guessing you mean by it).

Thanks James.

I actually had already started down the path of option number two that you suggested, i.e. using "wrapper" documents, with DTDs, to refer to external entities, w/o DTDs, that contain the actual document. I need a DTD on these documents because I need to constrain their structure.

I was hoping there was a better way, since this doubles the number of documents I have to manage, but it appears there isn't. I consider this to be a shortcoming of XML, in that it is not "orthogonal", i.e. I have to write my documents in one of two different ways, depending on how they're going to be used.

A better solution, I believe, would be to take a more "object-oriented" approach, i.e. that each document is responsible for its own validity, through the use of its own DTD. This would require a parser that could handle recursively nested documents, each with their own DTD. Although I could adopt such a non-standard solution here in my own environment, and produce HTML for publication, I couldn't publish the XML/XSL, since the documents would be non-standard and unreadable by anyone else.
I keep hearing that XML is a "data" language, as opposed to a "document" language, but I think that this is one case where XML breaks widely accepted data modelling norms, i.e. type encapsulation.

Thanks all,

Alan.

P.S. James, after I sent that mail, I realised that my documents are not actually standalone (in XML terms), since they refer to an external DTD, so I used the term incorrectly. But, then I realised they could actually be standalone, by making all of the necessary declarations in the internal subset, and the problem would still be the same.

From david at megginson.com Tue Apr 13 03:59:00 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:16 2004
Subject: Megginson and XMLNews
In-Reply-To: <3.0.5.32.19990412100407.00bfd4a0@corp>
References: <370FB644.6F92D093@w3.org> <3.0.5.32.19990407163917.00bd2a60@corp> <14095.52616.339600.100498@localhost.localdomain> <3.0.5.32.19990412100407.00bfd4a0@corp>
Message-ID: <14098.40792.709506.370200@localhost.localdomain>

Walter Underwood writes:

> I expect to ship our next release pre-configured for NITF,

That's wonderful.

> but I sure would like to see some common practice beyond <title>. Mostly, our customers would appreciate it, and the people doing searches would get better results.

Actually, I think that you need something a little more robust -- otherwise, we'll end up with a hodge-podge of rules for what element names people can and cannot use. I would not want to forbid someone from using something like this:

<?xml version="1.0"?> <person> <title>Dr.
Charles Goldfarb Originator of SGML.

Universal names (as in "Namespaces in XML") get you part way there, because different document types can share semantics of well-known element types:

This is the book title [...]

What's really useful, though, is to develop some kind of inheritance scheme, so that you can say "this is just like an html:title, except that it's also a little more specialised". Architectural forms provide a very lightweight mechanism for this; XML Schemas will probably provide another.

Personally, I'd love to see NITF take advantage of namespaces, even to a very small extent. To start, a simple default namespace would be nice:

Simple Story Simple Story By David Megginson
This is a simple story that mentions Shakespeare in Love.
This would allow other document types to reuse NITF components in a well-defined way, and search engines to recognise them wherever they're used. Right now, we're not doing this in XMLNews-Story because we want to remain strictly subset-compatible with NITF, but we'll certainly encourage the NITF people to consider updating the spec.

In fact, since NITF borrows heavily from HTML (and also a bit from HyTime, though that part is not included in the XMLNews-Story subset), it would be nice to put the HTML stuff in a separate namespace so that search engines and other processing software can do something useful with it even if they do not know NITF itself:

Simple Story Simple Story By David Megginson This is a simple story that mentions Shakespeare in Love.

This might help a bit with the search engine problem.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From gtn at eps.inso.com Tue Apr 13 06:08:24 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun 7 17:11:16 2004
Subject: Competition: fast XML parser for charset labelling
In-Reply-To: <370F7DCF.31DC121D@w3.org>
Message-ID: <003501be8563$98f8eee0$0100007f@eps.inso.com>

> Integrate this into the Apache mod_mime (or other suitable place) so that, for all resources which mod_mime declares to be of type text/xml, the cached result is automatically used to output a MIME type header

I think you'll find the cost of the cached result to be on the same order (in terms of cost) as dynamically parsing the documents.
I have found that in such cases, basic system call overhead and CPU usage is roughly the same (open and close calls, a couple of reads).

From jborden at mediaone.net Tue Apr 13 07:30:27 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:11:16 2004
Subject: Comments Appreciated on Magazine Based on XML/XSL
In-Reply-To:
Message-ID: <000d01be856d$bcabae90$1b19da18@ne.mediaone.net>

Mark Birbeck wrote:
>
> I was also imagining that:
>
> - when I want the last node from a tree that contains 100,000 nodes that the whole 'document' would not be read into memory.
> - that I could access the tree as if it was a complete DOM with all the caching and so on being done for me.
> - that if I perform an XSL-type query I will get the nodes I want, regardless of whether they are in memory or not.
>
> I have implemented a very crude version of this. I use the IE5 DOM and with this I retrieve documents from our database using URLs that are a scaled down version of XQL (I can't say I like XPointer). For example:
>
> http://[server]/documents/article[@author='Mark']/article.xml

Mark, as Didier mentions, this is an interesting request. In this case, does the document itself need be stored in a database (as opposed to a file 'article.xml') or do you intend to maintain metadata in a directory structure? Is the request, then, for a directory with a XQL interface? or have you implemented this already? (I'm having some difficulty determining what you have already implemented versus what you are ideally requesting).
>
> The problem with this is that I have to convert this request to a query
> on the objects in the hierarchical database in order to populate my DOM.
> Of course, once in the DOM I can export it as XML or transform it if
> necessary, so the database does look from the outside like it is one
> great big XML document.

For example, eXcelon has an XQL interface so no conversion is required (though you need to convert your $$$ into the licenses :-(

>
> But although I am quite happy with this so far, I can see that you would
> have to code this up for every type of database, and really it should be
> a job for the DOM. It really needs a layer like the layer above the
> database-specific layers in ODBC; it would sit just below the DOM. This
> layer would obviously need to understand schemas, so it wouldn't be a
> trivial task to implement.
>
> Anyway, my original question was 'is anyone doing anything like this?'
> and I think the answer is 'nowhere near yet!'
>

I've done a bit of work converting schemas back and forth between ODBC/SQL queries using XSL transformations. My latest medical repository (HL7 object model) uses a relational back-end with XSL transformations to convert to and from XML documents. I haven't (yet) put an XQL layer on this (the current layer is an XML representation of SQL queries). If anyone has done work to convert the XQL string into an XML intermediary, it should be possible to convert that into my XML SQL representation via an XSL transformation as well. Would this help? i.e. what I propose is:

XQL <-> intermediate XML representation <-XSL-> XML SQL representation -XSL-> SQL

Although I have done work using native DOM wrappers on databases, you still need an XQL engine. If the db schema can be represented as relational tables, queries are optimized. The problem is that each level of the hierarchy maps to a relational join (with a unique identifier primary key). At least that's how I do it. Does anyone have a better solution that can be shared?
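
The mapping described above, one relational join per level of the hierarchy, can be sketched roughly as follows. This is only an illustration in Python with SQLite, not the schema or code of the repository mentioned in the message: the `node` table, its columns, and the helpers `path_to_sql`, `build_sample_db` and `query_path` are all assumptions made up for the example.

```python
import sqlite3


def path_to_sql(path):
    """Turn a list of element names into a SELECT with one self-join per level.

    Assumes a hypothetical table node(id, parent_id, name, value) in which
    every row points at its parent through parent_id.
    """
    if not path:
        raise ValueError("path must name at least one element")
    joins = ["FROM node AS n0"]
    # The first step must match a root row (one without a parent).
    where = ["n0.parent_id IS NULL", "n0.name = ?"]
    for depth in range(1, len(path)):
        # Each further level of the hierarchy costs one more join.
        joins.append(
            f"JOIN node AS n{depth} ON n{depth}.parent_id = n{depth - 1}.id"
        )
        where.append(f"n{depth}.name = ?")
    last = len(path) - 1
    sql = (
        f"SELECT n{last}.id, n{last}.value "
        + " ".join(joins)
        + " WHERE "
        + " AND ".join(where)
        + f" ORDER BY n{last}.id"
    )
    return sql, list(path)


def build_sample_db():
    """Create an in-memory database holding a tiny made-up document tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        "id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?, ?)",
        [
            (1, None, "documents", None),
            (2, 1, "article", None),
            (3, 2, "title", "First article"),
            (4, 1, "article", None),
            (5, 4, "title", "Second article"),
        ],
    )
    return conn


def query_path(conn, path):
    """Run the generated join query and return the matching (id, value) rows."""
    sql, params = path_to_sql(path)
    return conn.execute(sql, params).fetchall()
```

A three-step path such as documents/article/title needs two joins in this layout, which is why deep hierarchies become expensive and why a better mapping would be welcome.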
Jonathan Borden
http://jabr.ne.mediaone.net

From chris at w3.org Tue Apr 13 15:09:47 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:17 2004
Subject: Competition: fast XML parser for charset labelling
References: <003501be8563$98f8eee0$0100007f@eps.inso.com>
Message-ID: <371340E0.6D8194E6@w3.org>

Gavin Thomas Nicol wrote:
>
> > Integrate this into the Apache mod_mime (or other suitable place) so
> > that, for all resources which mod_mime declares to be of type
> > text/xml, the cached result is automatically used to output a MIME type
> > header
>
> I think you'll find the cost of the cached result to be on the same order
> (in terms of cost) as dynamically parsing the documents. I have found that
> in such cases, basic system call overhead and CPU usage is roughly the same
> (open and close calls, a couple of reads).

Thanks for sharing that experience; well, it certainly makes the code even easier!

--
Chris

From gtn at eps.inso.com Tue Apr 13 15:16:26 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun 7 17:11:17 2004
Subject: Competition: fast XML parser for charset labelling
In-Reply-To: <371340E0.6D8194E6@w3.org>
Message-ID: <005301be85b0$2548aa10$0100007f@eps.inso.com>

> > I think you'll find the cost of the cached result to be on the same order
> > (in terms of cost) as dynamically parsing the documents. I have found that
> > in such cases, basic system call overhead and CPU usage is roughly the same
> > (open and close calls, a couple of reads).
>
> Thanks for sharing that experience; well, it certainly makes the code
> even easier!

... and more flexible. Cached results are persistent and can get out of date.

From bckman at ix.netcom.com Tue Apr 13 16:42:53 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:11:17 2004
Subject: ANNOUNCE: xhtml modularization draft
Message-ID: <003f01be85bb$98e3d840$29afdccf@ix.netcom.com>

The working draft on the modularization of XHTML is available at:

http://www.w3.org/TR/xhtml-modularization/Overview.html#toc

Frank

Frank Boumphrey
XML and style sheet info at Http://www.hypermedic.com/style/index.htm
Author: - Professional Style Sheets for HTML and XML http://www.wrox.com
CoAuthor: XML applications from Wrox Press, www.wrox.com
Author: Using XML on the Web (Aug)

From pgrosso at arbortext.com Tue Apr 13 17:59:38 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun 7 17:11:17 2004
Subject: Last Call for the XML Fragment Interchange Rec
Message-ID: <3.0.32.19990413105745.006a3324@pophost.arbortext.com>

Many apologies, despite all my attempts at care and testing of URLs, I have made a mistake. The document to review is currently (and will stay) at:

http://www.w3.org/TR/1999/WD-xml-fragment-19990412

during the Last Call period.

At 18:01 1999 04 12 -0500, Paul Grosso wrote:
> * Document to review:
> http://www.w3.org/TR/WD-xml-fragment-19990412 [Wrong!]
> * Last call ends: 1999 April 23
> * Send comments to: mailto:www-xml-fragment-comments@w3.org
>
>The XML Fragment WG [1] has just published its Final Working Draft
>of the XML Fragment Interchange Recommendation [2]. A Last Call
>period starts now and runs until April 23. Its abstract reads:
>[1] http://www.w3.org/XML/Activity.html#fragment-wg
>[2] http://www.w3.org/TR/WD-xml-fragment-19990412 [Wrong!]
>[3] mailto:www-xml-fragment-comments@w3.org
>[4] http://www.w3.org/TR/NOTE-XML-FRAG-REQ

From wunder at infoseek.com Tue Apr 13 19:26:55 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:11:17 2004
Subject: Megginson and XMLNews
In-Reply-To: <14098.40792.709506.370200@localhost.localdomain>
References: <3.0.5.32.19990412100407.00bfd4a0@corp> <370FB644.6F92D093@w3.org> <3.0.5.32.19990407163917.00bd2a60@corp> <14095.52616.339600.100498@localhost.localdomain> <3.0.5.32.19990412100407.00bfd4a0@corp>
Message-ID: <3.0.5.32.19990413101500.00ce5cd0@corp>

At 06:53 PM 4/12/99 -0700, David Megginson wrote:
>Walter Underwood writes:
>
> > I expect to ship our next release pre-configured for NITF,
>
>That's wonderful.
>
> > but I sure would like to see some common practice beyond <title>.
> > Mostly, our customers would appreciate it, and the people doing
> > searches would get better results.
>
>Actually, I think that you need something a little more robust --
>otherwise, we'll end up with a hodge-podge of rules for what element
>names people can and cannot use. I would not want to forbid someone
>from using something like this:
>
> <?xml version="1.0"?>
>
> <person>
> <title>Dr.

Right, though documents (as opposed to datafiles in XML) nearly always have something like:

List of Contributors

before the other uses of title. I see this a lot in bibliographies. This is a statistical bet, but then, half of information retrieval is statistics, so I'm used to playing that game (the other half is human behavior, both in authors and searchers).

>Universal names (as in "Namespaces in XML") get you part way there,
>because different document types can share semantics of well-known
>element types:

It probably gets us all the way there if the elements are used the same way. This doesn't require any formal equivalence between names, just a convention that things named the same work the same. In other words, a SmallTalk object protocol is sufficient here; there is no necessity for Java's Interface type. Other tools may find that useful, but it is not necessary for search engines.

This is very similar to the XLink approach, that is, write the linking parts of your DTD like this. Search engines need XLink, too, of course.

A convention for a Dublin Core namespace, plus a robots tag, would be just peachy, if people actually used it. For example:

Helping Your Child Learn History Wrisley Reed, Elaine 1997-10-27 Activities that adults can do with their children to help them learn history from the every day world around them. Provides resources, local and national resources, and activities for children aged 4-11. Sections include: History Education Begins at Home; The Basics of History; Activities: History as Story; Activities: History as Time; and much more Social Studies History Informal Education index, nofollow

That would be wonderful.

wunder
--
Walter R.
Underwood wunder@infoseek.com wunder@best.com (home)
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946

From xml at 0000000.com Tue Apr 13 19:47:25 1999
From: xml at 0000000.com (xml)
Date: Mon Jun 7 17:11:17 2004
Subject: recursion in XML parser
Message-ID: <199904131917.MAA30937@0000000.com>

Are most XML parsers recursive in nature? My parser is non-recursing while processing the tags from an XML file and only recurses once to go back and load an XSL file, when applicable.

My reasoning for not using recursion was performance (function call/stack framing considerations) and that it made the code easier to understand.

It would be interesting to do some benchmarks on various parsers out there to measure performance. The Java parsers I've tested (Sun, IBM) are _dog_ slow compared to expat, etc. For server-side I don't think that matters, since in the corporate scene people tend to just add more servers/infrastructure and not worry about performance. Client-side XML is a completely different kettle o' fish tho' since you can't just keep popping in processors every time your machine at home/work bogs down.

Thomas

From Matthew.Sergeant at eml.ericsson.se Tue Apr 13 19:51:54 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:11:17 2004
Subject: Competition: fast XML parser for charset labelling
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1789@EUKBANT101>

> -----Original Message-----
> From: Chris Lilley [SMTP:chris@w3.org]
>
> Gavin Thomas Nicol wrote:
> >
> > > Integrate this into the Apache mod_mime (or other suitable place) so
> > > that, for all resources which mod_mime declares to be of type
> > > text/xml, the cached result is automatically used to output a MIME type
> > > header
> >
> > I think you'll find the cost of the cached result to be on the same order
> > (in terms of cost) as dynamically parsing the documents. I have found that
> > in such cases, basic system call overhead and CPU usage is roughly the same
> > (open and close calls, a couple of reads).
>
> Thanks for sharing that experience; well, it certainly makes the code
> even easier!
>

Just got my copy of "Apache modules in Perl and C" today, so it should be into the level of "trivial" rsn...

I've got the first bit:

if (/<\?xml(.*?)\?>/) { ... }

:-)

Matt.

From Mark.Casey at echostar.com Tue Apr 13 19:52:33 1999
From: Mark.Casey at echostar.com (Casey, Mark)
Date: Mon Jun 7 17:11:17 2004
Subject: XML Parser in Unix C++ (perhaps DEC UNIX?)
Message-ID: <8E7905420FB9D211916A00609773FB0E01616AB2@exchange1.echostar.com>

Hi, I'm just joining this group, hello to everyone!

We're looking for a Unix C++ parser to use on a DEC UNIX project... man, you'd think with all you Unix C++ folks out there, there'd be many! But I've yet to find one, after locating over 25 different ones on the internet. I've come upon several in C++, but other than SP (which might compile OK but seems a little more than the simple XML parser I'm looking for), the others have either been not in C++, or very difficult to work with under DEC UNIX.

Note: Not looking for a C XML parser (expat, grove annex, rxp), but rather a C++ XML parser. Would prefer to stick to C++ (we've integrated some C components and they've been a pain to modify, and usually expose features of C that we moved to C++ to avoid). So we'll use one only if a C++ version is unavailable.

Special thanks to James Clark for the excellent pioneering work in this area.

Here's my C++ XML Parser list so far:

1. SP (James Clark)
2. AntLr (SGML parser that needs a C++ def to output to C++ instead of its usual Java)
3. WinFoundationClasses (despite its name, claims to compile under Unix ??)
4. XML Parser for Delphi (can be ported to C++, but the obj is currently in Borland C++ Builder)
5. Balise (Win, NT)
6. MS XML parser (Win)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Mark Casey - Sr Engineer
NagraStar LLC - an advanced technology joint venture of
http://www.NagraVision.com and http://www.Echostar.com http://www.DishNetwork.com
90 Inverness Circle East, Englewood, CO USA 80112
303-706-5710 voice w/mail
303-706-5719 fax w/paper
casey@nagrastar.com
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"ESCHEW OBSFUCATION!"
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

From Matthew.Sergeant at eml.ericsson.se Tue Apr 13 20:26:50 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:11:17 2004
Subject: Mozilla and expat
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A178A@EUKBANT101>

I don't know how many people here follow mozilla, but expat has now been enabled in mozilla by default, which means you can all go ahead and hammer it for namespace and XML compliance...

NB: This should be in the nightly builds, but not in the gecko preview release.

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

From robin at isogen.com Tue Apr 13 20:30:34 1999
From: robin at isogen.com (Robin Cover)
Date: Mon Jun 7 17:11:17 2004
Subject: Congrats Jon Bosak and Tim Bray
Message-ID:

Title(s):
"XML and the Second-Generation Web. The combination of hypertext and a global Internet started a revolution. A new ingredient, XML, is poised to finish the job."

AKA, "How XML Will Fix the Web"

Summary:
"Extensible Markup Language (XML), a tool for writing World Wide Web pages, promises another on-line revolution. Pages written in XML can deliver needed information more quickly and efficiently than HTML pages can. They can also automatically reformat themselves for convenient access by computer, telephone, handheld organizer or other devices."

Scientific American
Cover story, Feature Article
May, 1999

http://www.sciam.com/1999/0599issue/0599bosak.html

----

-robin

From xml at 0000000.com Tue Apr 13 21:49:13 1999
From: xml at 0000000.com (xml)
Date: Mon Jun 7 17:11:17 2004
Subject: recursion in XML parser
Message-ID: <199904132119.OAA31093@0000000.com>

By XML-recommendations in non-validating mode are you talking about the section on conformance in the XML spec (page 24)? I assume not... where would I find this?

I can see that on a Pentium 300 you wouldn't notice the function call overhead/stack framing for recursive processing. Thanks.

My own experience using the Java XML frameworks is that they are slow. There may be some that aren't. I'm [for now] living under the assumption that for a large dataset, Java will be slow because of all of the string handling and internal pointer frenzy that the VM is handling for you. Then again, it's entirely possible that, for instance, if you use the new collection classes, the underlying implementation is written in C and there would be a limit to the performance tradeoffs.

One nice feature of using JDBC and related technologies for data handling in Java is that Java itself doesn't have to manipulate the [possibly large] datasets. For a given "pure Java" implementation of an XML parser it seems that there are going to be problems manipulating datasets of epic proportions.

I noticed delays when processing the Richard II xml document [famous] in Java frameworks. That document is around 300 KB of XML.

It's interesting to know that you've done both a recursive and non-recursive XML parser.

Thanks for your response,

Thomas

From pstandre at lds.com Tue Apr 13 22:28:30 1999
From: pstandre at lds.com (Peter Saint-Andre)
Date: Mon Jun 7 17:11:17 2004
Subject: Using notations to identify content formats
Message-ID: <3713A7ED.58EF4E2B@lds.com>

I have created a DTD that parses data in XML for a web application.
The DTD identifies valid values for many data elements using enumerated attributes, but I also would like to identify formats for certain data elements (e.g., dates as nine-digit numeric strings). Unfortunately, there seems to be a paucity of information regarding how to create formal specs for the documentation of notations in XML. Any suggestions would be greatly appreciated.

Thanks!

Peter

--
Peter Saint-Andre
Logical Design Solutions, Inc.
http://www.lds.com/

From simonstl at simonstl.com Tue Apr 13 22:35:47 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:11:17 2004
Subject: Congrats Jon Bosak and Tim Bray
In-Reply-To:
Message-ID: <199904132035.QAA22198@hesketh.net>

At 01:30 PM 4/13/99 -0500, Robin Cover wrote:
>Title(s):
>"XML and the Second-Generation Web. The combination of
>hypertext and a global Internet started a revolution.
>A new ingredient, XML, is poised to finish the job."
>Scientific American
>Cover story, Feature Article
>May, 1999
>
>http://www.sciam.com/1999/0599issue/0599bosak.html

Nice article, though once again CSS is ignored and the strange new creature known as XSL is promoted as "The standard now being developed for XML stylesheets", a misuse of the definite article 'the'.

That's okay - I'm sure SA's readers (myself included) will be happy to know that they can present their information at some indefinite point in the future, and that they can do it now is probably irrelevant.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

From cowan at locke.ccil.org Tue Apr 13 22:44:04 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:17 2004
Subject: Using notations to identify content formats
References: <3713A7ED.58EF4E2B@lds.com>
Message-ID: <3713AC7B.6AFFEF15@locke.ccil.org>

Peter Saint-Andre wrote:

> Unfortunately,
> there seems to be a paucity of information regarding how to create
> formal specs for the documentation of notations in XML. Any suggestions
> would be greatly appreciated.

IMHO, natural-language (English, French, or whatever) descriptions are generally preferred to formal ones, as they are intelligible to more people and generally "live" longer. A document that says

The MagicDate format is as follows: yyyywwd, where yyyy is a zero-padded Gregorian year number, ww is a 2-digit week, and d is the day of the week. Week 01 of a year is the week containing January 4. Day 0 is Sunday, day 6 is Saturday.

does very nicely. Then you make it retrievable at http://your.domain/defs/MagicDate.txt and then declare the notation in your DTDs as

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From pstandre at lds.com Tue Apr 13 23:00:57 1999
From: pstandre at lds.com (Peter Saint-Andre)
Date: Mon Jun 7 17:11:17 2004
Subject: Using notations to identify content formats
References: <3713A7ED.58EF4E2B@lds.com> <3713AC7B.6AFFEF15@locke.ccil.org>
Message-ID: <3713AF85.71F60173@lds.com>

John Cowan wrote:

> IMHO, natural-language (English, French, or whatever) descriptions
> are generally preferred to formal ones, as they are intelligible
> to more people and generally "live" longer.

What you say makes sense to help authors understand the notation format, but I want the ~application~ to "understand" the format so that it can validate the data at some level (e.g., a social security number has to be nine digits in length). Right now my DTD is validating certain data through enumerated attributes (e.g., when there is a limited number of valid values), but it's not validating data for which validation relates to format. Perhaps I'm asking for too much....

Peter

From ralph at fsc.fujitsu.com Tue Apr 13 23:23:56 1999
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun 7 17:11:18 2004
Subject: xml linking
Message-ID: <3.0.5.32.19990413142143.009fd5b0@pophost.fsc.fujitsu.com>

On Mon, 12 Apr 1999 at 14:05:00 Fernando A. Teixeira wrote:

>I'm trying to understand how the extended links work. Could anyone help
>me. How can I make a N x N link?

Take a look at Fujitsu's "HyBrick" and the sample files we've provided. You'll find the HyBrick home page at:

http://www.fsc.fujitsu.com/hybrick/

Best regards,

Ralph E. Ferris
Fujitsu Software Corporation

From cowan at locke.ccil.org Tue Apr 13 23:27:43 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:18 2004
Subject: Using notations to identify content formats
References: <3713A7ED.58EF4E2B@lds.com> <3713AC7B.6AFFEF15@locke.ccil.org> <3713AF85.71F60173@lds.com>
Message-ID: <3713B6BB.14086CB@locke.ccil.org>

Peter Saint-Andre wrote:

> What you say makes sense to help authors understand the notation format,

And application developers, too.
But if you want to have pluggable application code already available, you could:

        write a Perl regex
        provide a Java class
        provide C source which the application should compile and dynaload (:-))

and put any of these into the resource referenced by the URI in the notation declaration. The MIME type of the resource could tell you which one it is.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From SCODEB at saif.com Wed Apr 14 00:23:41 1999
From: SCODEB at saif.com (Scott Deboy)
Date: Mon Jun 7 17:11:18 2004
Subject: Suggestions
Message-ID: <3FF1D0E3A6ACD11192BB0001FA16A73A0162CCBC@saif.com>

I'm new to XML (reading the spec and learning about DTDs etc.) and I was hoping someone could point me in the right direction on an idea I have to replace Word macros w/a 3-tier system using XML.

Example: I want to build a letter macro using XML.

The letter is mostly static text. An address is required, as well as a couple of other fields. Also, the letter has a couple of optional paragraphs that I need to prompt the user to answer.

My idea (obviously lots of gaps but bear with me please):

First I would store the static text and optional text in XML format in a relational database.

Next I would define the structure of the macro in XML format as well - basically a roadmap of what components need to be combined and in what order to complete the document: the boilerplate, optional paragraphs, signature block, etc.
The optional paragraph XML would also contain the question to ask the user, for example: Select a color: blue, red, green. The macro structure would reference the appropriate fields in the database.

The user is prompted for fields unique to this letter, as well as address, signature, etc. This response is formatted in XML and sent to an app server.

The app server could parse through the user's response, parse through the macro 'roadmap', retrieve text and other data from the database, combine the pieces of the document and return the completed document to the client. This would also allow the app server to work as the engine for a batch letter processor.

I will want to make this app server handle any of the macros we currently create in Word.

Any big mistakes here? Any pointers? Should I be sending this to a different list?

Should I be thinking about using XQL or do I need XSL? There are so many acronyms I haven't had a chance to figure out what would be useful in this architecture.

Any pointers are appreciated -

Scott Deboy
SAIF Corporation

From smo at jst.com.au Wed Apr 14 02:41:11 1999
From: smo at jst.com.au (Steve Oldmeadow)
Date: Mon Jun 7 17:11:18 2004
Subject: XML Parser in Unix C++ (perhaps DEC UNIX?)
Message-ID: <005501be860e$eaf8f180$0201a8c0@pikachu>

You may want to have a look at http://www.halcyon.com/www3/jesjones/Whisper/Home.html

Whisper is a cross platform framework for Macintosh and Win32 but it includes the source to an XML parser. I had a look at the source code and it looks like it would be fairly easy to port to other platforms.
The only problem I could see was that it does not follow the SAX or DOM standards, but the same principles are used and it probably wouldn't be too hard to create a partly compliant DOM or SAX wrapper around what is there.

You may also want to check out the GNOME and KDE Linux projects. Both are using XML in one way or another. I use GNOME and noticed that there is a libxml library but have not had the chance to check out the source yet; I'd guess it is C based though.

Steve Oldmeadow
Justice Systems Technologies

From jtauber at jtauber.com Wed Apr 14 07:40:27 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:11:18 2004
Subject: Suggestions
References: <3FF1D0E3A6ACD11192BB0001FA16A73A0162CCBC@saif.com>
Message-ID: <37142AE0.F8285680@jtauber.com>

Scott Deboy wrote:
>
> I'm new to XML (reading the spec and learning about DTDs etc.) and I was
> hoping someone could point me in the right direction on an idea I have to
> replace Word macros w/a 3-tier system using XML.
>
> Example: I want to build a letter macro using XML.
>
> The letter is mostly static text. An address is required, as well as a
> couple of other fields. Also, the letter has a couple of optional
> paragraphs that I need to prompt the user to answer.

A better approach would be to represent the user provided information as an XML document which is then given to an XSL engine to add the static text.

So your template/macro would be a combination of a DTD constraining user-provided information, and an XSL stylesheet that takes that information and produces the output.
Here's a really simple example. Say you are writing thank you notes for an engagement party. For each person you have a document like:

John
vase

that indicates the person's name, what gift they gave and whether they are coming to the wedding. Here's a DTD:

You then have an XSL stylesheet with a template such as:

Dear ,

Thank you so much for your .

Look forward to seeing you at the wedding.

James
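
A minimal sketch of the idea above, written in Python rather than XSL: a small per-guest document is parsed and the static text is wrapped around it, with the wedding paragraph included only when the guest is marked as coming. The element names (guest, name, gift, coming) and the render_note function are assumptions made for illustration. They are not taken from the original message, and in the approach described here an XSL engine would do this step.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-guest document. The element names are assumptions.
GUEST_XML = """
<guest>
  <name>John</name>
  <gift>vase</gift>
  <coming/>
</guest>
"""


def render_note(xml_text: str, sender: str = "James") -> str:
    """Combine the user-provided data with the static letter text."""
    guest = ET.fromstring(xml_text)
    name = guest.findtext("name", default="").strip()
    gift = guest.findtext("gift", default="").strip()

    paragraphs = [
        f"Dear {name},",
        f"Thank you so much for your {gift}.",
    ]
    # The optional paragraph appears only if a <coming/> element is present.
    if guest.find("coming") is not None:
        paragraphs.append("Look forward to seeing you at the wedding.")
    paragraphs.append(sender)
    return "\n\n".join(paragraphs)


print(render_note(GUEST_XML))
```

The data document stays free of any letter wording, so the same guest record could be fed to a different template without change.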
Is this the sort of thing you wanted to do?

From david at megginson.com Wed Apr 14 10:10:18 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:18 2004
Subject: recursion in XML parser
In-Reply-To: <199904132119.OAA31093@0000000.com>
References: <199904132119.OAA31093@0000000.com>
Message-ID: <14099.53301.541630.649928@localhost.localdomain>

xml writes:

> I can see that on a Pentium 300 you wouldn't notice the function
> call overhead/stack framing for recursive processing. Thanks. My
> own experience using the Java XML frameworks is that they are slow.

Hmm -- they are somewhat slower than Expat, but that's because they're running tight code loops in a virtual machine. Still, when I was testing AElfred on a 166 MHz Pentium NT box back in late 1997, it could parse about 1MB/second with a good VM and a JIT, and the other good XML parsers are comparable in speed.

Granted, Expat (with memory-mapped I/O) is about 10 times as fast as the faster Java-based XML parsers, but that's a very misleading figure: in fact, the actual parsing usually occupies only a small amount of the time required for XML processing -- most of the time is usually taken up by your code that actually does something with the XML.

Let's assume, then, that XML parsing occupies 10% of your application's overhead. Even if you could build a parser that is 1000% faster, you'd still gain only 9% in actual execution speed.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From martind at netfolder.com Wed Apr 14 16:16:15 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:18 2004
Subject: recursion in XML parser
In-Reply-To: <14099.53301.541630.649928@localhost.localdomain>
Message-ID: <000b01be867e$df8b4440$124d8bcf@total.net>

Hi David,

> Hmm -- they are somewhat slower than Expat, but that's because they're
> running tight code loops in a virtual machine. Still, when I was
> testing AElfred on a 166MHZ Pentium NT box back in late 1997, it could
> parse about 1MB/second with a good VM and a JIT, and the other good
> XML parsers are comparable in speed.
>
> Granted, Expat (with memory-mapped I/O) is about 10 times as fast as
> the faster Java-based XML parser, but that's a very misleading figure:
> in fact, the actual parsing usually occupies only a small amount of
> the time required for XML processing -- most of the time is usually
> taken up by your code that actually does something with the XML.
>
> Let's assume, then, that XML parsing occupies 10% of your
> application's overhead. Even if you could build a parser that is
> 1000% faster, you'd still gain only 9% in actual execution speed.

I think nobody would dispute that Java has a lot of virtues, though speed is certainly not one of them. To take your numbers, David: if that part of the application is 10 times faster than any Java parser, and the app itself is also 10 times faster, then the overall throughput is 10 times faster, which in certain circumstances is what's required. We would then have an app with 10 times the throughput.
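
The arithmetic in this exchange can be checked with Amdahl's law. The sketch below assumes that "1000% faster" means the parser runs 10 times as fast and that parsing takes 10% of the total run time, as in the figures quoted above. The function names are illustrative only.

```python
def overall_speedup(fraction: float, factor: float) -> float:
    """Amdahl's law: whole-program speedup when `fraction` of the
    run time is made `factor` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def time_saved(fraction: float, factor: float) -> float:
    """Share of the original run time that disappears."""
    return 1.0 - 1.0 / overall_speedup(fraction, factor)


# Case 1: only the parser (10% of the run time) becomes 10 times faster.
parser_only = time_saved(0.10, 10.0)

# Case 2: parser and application code both become 10 times faster,
# i.e. 100% of the run time is sped up.
everything = overall_speedup(1.0, 10.0)

print(f"parser only: {parser_only:.1%} of the run time saved")
print(f"everything:  {everything:.1f}x overall speedup")
```

Under these assumptions, speeding up only the parser trims about 9% off the run time, while speeding up every part by a factor of 10 gives the full factor of 10. Both figures in the thread hold; they just answer different questions.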
The state of the art for Java may change in the future as soon as other players like HP, Novell and IBM bring to the table their own technology and that Java would finally have the same competitive environment as other languages have. This could be beneficial for the language evolution as more brains think on how to improve the performances. regards Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DuCharmR at moodys.com Wed Apr 14 16:24:19 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:11:18 2004 Subject: XSL as XML transformation Message-ID: <84285D7CF8E9D2119B1100805FD40F9F255159@MDYNYCMSX1> Mike is being modest here. He fails to mention that, for tackling problems such as those he describes, his SAXON Java library is really great. Bob DuCharme www.snee.com/bob see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall. > ---------- > From: Kay Michael[SMTP:Michael.Kay@icl.com] > Sent: Thursday, April 08, 1999 2:59 PM > To: xml-dev@ic.ac.uk > Subject: RE: XSL as XML transformation > > > Does this mean that its possible to use XSL to do transformation > > from one XML document type to another? > > Yes. > > > If so are there any gotchas that I should be concerned with? > > Yes. > > 1. There must be a one-to-one mapping of input documents to output > documents. > > 2. There are no facilities for algorithmic transformations of > attributes or element content. > > 3. There are no facilities for adding "grouping" nodes (e.g. changing > ?
to > ?
> > Mike Kay > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SCODEB at saif.com Wed Apr 14 17:21:04 1999 From: SCODEB at saif.com (Scott Deboy) Date: Mon Jun 7 17:11:18 2004 Subject: Suggestions Message-ID: <3FF1D0E3A6ACD11192BB0001FA16A73A0162CCBE@saif.com> Thanks for the suggestions, James Yes, this is 80% of what I'm looking for. The other 20% is: If I could 'include' a reference to the optional paragraph (which will live in a database) instead of directly embedding it in the XSL template (I want to reuse the same optional paragraph in a number of documents). What if I store the paragraph references as database selects in the XSL template and do the work of retrieving them from the database before the XSL engine does its work? Is there a better way? It sounds like it's time to read up on XSL. Scott > ---------- > From: James Tauber[SMTP:jtauber@jtauber.com] > Sent: April 13, 1999 10:42 PM > To: Scott Deboy > Cc: 'xml-dev@ic.ac.uk' > Subject: Re: Suggestions > > Scott Deboy wrote: > > > > I'm new to XML (reading the spec and learning about DTDs etc.) and I was > > hoping someone could point me in the right direction on an idea I have > to > > replace Word macros w/a 3-tier system using XML. > > > > Example: I want to build a letter macro using XML. > > > > The letter is mostly static text. An address is required, as well as a > > couple of other fields. Also, the letter has a couple of optional > > paragraphs that I need to prompt the user to answer. 
> > A better approach would be to represent the user provided information as > an XML document which is then given to an XSL engine to add the static > text. So your template/macro would be a combination of a DTD > constraining user-provided information, and an XSL stylesheet that takes > that information and produces the output. > > Here's a really simple example. > > Say you are writing thank you notes for an engagement party. > > For each person you have a document like: > > > John > vase > > > that indicates the person's name, what gift they gave and whether they > are coming to the wedding. Here's a DTD: > > > > > > > You then have an XSL stylesheet with a template such as: > > >
Dear ,
>
Thank you so much for your .
> > >
Look forward to seeing you at the wedding.
> > >
James
> > > Is this the sort of thing you wanted to do? > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david-b at pacbell.net Wed Apr 14 17:21:31 1999 From: david-b at pacbell.net (David Brownell) Date: Mon Jun 7 17:11:18 2004 Subject: recursion in XML parser References: <000b01be867e$df8b4440$124d8bcf@total.net> Message-ID: <3714B23F.F405D74D@pacbell.net> Didier PH Martin wrote: > > David Megginson wrote: > > > Let's assume, then, that XML parsing occupies 10% of your > > application's overhead. Even if you could build a parser that is > > 1000% faster, you'd still gain only 9% in actual execution speed. Which, by the by, is a fairly common tradeoff in distributed systems ... except that parsing data generally takes a lot LESS than 10% of the application overhead. I can't see XML changing that equation a heck of a lot. Also, in the big picture, execution speed is not the only important factor in system development. It's often important to have the system done twice as fast (a number of studies have shown that Java programmers are twice as productive as C/C++ ones), or be more stable when it's been declared "feature complete" (e.g. no pointer smashes). > The state of the art for > Java may change in the future as soon as other players like HP, Novell and > IBM bring to the table their own technology and that Java would finally have > the same competitive environment as other languages have. And Sun, too. One should also keep in mind that C based systems have been evolving for 20+ years at this point, vs a lot less for Java. - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DuCharmR at moodys.com Wed Apr 14 17:24:26 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:11:18 2004 Subject: Congrats Jon Bosak and Tim Bray Message-ID: <84285D7CF8E9D2119B1100805FD40F9F25515A@MDYNYCMSX1> Simon St. Laurent writes: >Nice article, though once again CSS is ignored and the strange new creature >known as XSL is promoted as "The standard now being developed for XML >stylesheets", a misuse of the definite article 'the'. Considering the qualifier following the definite article, either they did use the definite article properly or CSS is "now being developed." Since CSS2 is a Recommendation, I assume you're referring to CSS3, but I couldn't find anything about it on www.w3.org. Where can I find out more? Bob DuCharme www.snee.com/bob "The elements be kind to thee, and make thy spirits all of comfort!" Anthony and Cleopatra, III ii xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Wed Apr 14 18:18:58 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:18 2004
Subject: recursion in XML parser
In-Reply-To: <000b01be867e$df8b4440$124d8bcf@total.net>
References: <14099.53301.541630.649928@localhost.localdomain> <000b01be867e$df8b4440$124d8bcf@total.net>
Message-ID: <14100.48776.595362.585073@localhost.localdomain>

Didier PH Martin writes:

> I think that nobody would argue that Java has a lot of virtues that
> certainly speed of not one of them. To take your numbers David, If
> that part of the application is 10 times faster than any Java
> parser and that the app itself is 10 times faster also. The overall
> throughput is therefore 10 times faster.

That's not necessarily the case -- C/C++ have some advantages for fast I/O that Java doesn't share, but if your other code is not I/O-bound, and if it doesn't require small, tight processing loops, the speed difference for the non-parsing code might be much less significant (depending on how efficient your VM and OS are at memory-management).

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From creitzel at mediaone.net Wed Apr 14 18:20:39 1999
From: creitzel at mediaone.net (Charles Reitzel)
Date: Mon Jun 7 17:11:18 2004
Subject: No subject
Message-ID: <199904141620.MAA20017@chmls05.mediaone.net>

I saw this comment and felt compelled to share my thoughts. While I agree data type inheritance is a useful thing, it seems outside the scope of XML. But there are alternatives.

David Megginson wrote:
>Universal names (as in "Namespaces in XML") get you part way there,
>because different document types can share semantics of well-known
>element types:

... namespace example snipped ...

>What's really useful, though, is to develop some kind of inheritance
>scheme, so that you can say "this is just like an html:title, except
>that it's also a little more specialised". Architectural forms
>provide a very lightweight mechanism for this; XML Schemas will
>probably provide another.

I would suggest to folks who need this level of data type reuse and/or sophistication that they look into ASN.1. This is the ISO data type definition language used to define the SSL, SNMP, X.500 and LDAP protocols. The language itself is very general and has many benefits. It supports all the basic data types and compound data types (i.e. C++ structs). It supports inheritance for the same. There is a global object namespace (Object ID, or OID for short), wherein each firm or organization can register a base ID namespace in which all "custom" object types can be defined. Thus, they (the ISO) have solved both the FPI and namespace problems that have caused XML-Dev'ers so much grief.

Perhaps most important, all datatypes defined using ASN.1 have a standard, unambiguous method of encoding the data for transmission on the wire. These rules are known as the "Basic Encoding Rules", or BER, of ASN.1. Certainly, a general ASN.1 parser is non-trivial. The subset required for any given protocol (e.g. LDAP) is easy enough, however. Basically, ASN.1 statements can be understood as a simple, elegant DTD language. The "document" equivalent is not text, but the binary image defined by applying the BER to ASN.1 datatypes. In this context, ASN.1 itself doesn't need a mime type. But the protocols developed with ASN.1 do. So, I would use well-formed XML for basic, web based data transfer. Schemas/DTD/SGML for complex document work. LDAP/X.500 for inter-operable data exchange. SNMP for inter-operable device/application management. ASN.1 based private protocol for custom application work. Best regards, Charles Reitzel xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Wed Apr 14 18:45:58 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:11:18 2004 Subject: style standards (was Congrats Jon Bosak and Tim Bray) In-Reply-To: <84285D7CF8E9D2119B1100805FD40F9F25515A@MDYNYCMSX1> Message-ID: <199904141645.MAA16746@hesketh.net> At 11:28 AM 4/14/99 -0400, Bob DuCharme wrote: >>Nice article, though once again CSS is ignored and the strange new >creature >>known as XSL is promoted as "The standard now being developed for XML >>stylesheets", a misuse of the definite article 'the'. 
> >Considering the qualifier following the definite article, either they >did use the definite article properly or CSS is "now being developed." >Since CSS2 is a Recommendation, I assume you're referring to CSS3, but I >couldn't find anything about it on www.w3.org. Where can I find out >more? CSS is: a) a standard (in as much as anything W3C is a standard) b) still in continuing development, according to various sources on XSL-list and www-styles It's entirely possible that CSS is dead and XSL is the only road forward for XML, but if that's so, the W3C is being awfully quiet about it. All that business about them 'not competing', etc. And CSS is quite definitely for XML as well as HTML - the styles area at the W3C and the CSS2 Rec make that _very_ clear. Simon St.Laurent XML: A Primer Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Wed Apr 14 18:51:21 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:11:18 2004 Subject: Suggestions References: <3FF1D0E3A6ACD11192BB0001FA16A73A0162CCBE@saif.com> Message-ID: <006a01be8695$b1251880$0300000a@cygnus.uwa.edu.au> > Thanks for the suggestions, James > > Yes, this is 80% of what I'm looking for. The other 20% is: > > If I could 'include' a reference to the optional paragraph (which will live > in a database) instead of directly embedding it in the XSL template (I want > to reuse the same optional paragraph in a number of documents). 
> > What if I store the paragraph references as database selects in the XSL > template and do the work of retrieving them from the database before the XSL > engine does its work? Is there a better way? What you might do is "construct" the XSL stylesheet by piecing together things in the database. You can do this just with your file system by using entities, seeing as XSL stylesheets are XML documents. > It sounds like it's time to read up on XSL. You won't regret it :-) James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david-b at pacbell.net Wed Apr 14 19:02:00 1999 From: david-b at pacbell.net (David Brownell) Date: Mon Jun 7 17:11:19 2004 Subject: ASN.1 References: <199904141620.MAA20017@chmls05.mediaone.net> Message-ID: <3714C9CA.2AE67BBB@pacbell.net> > I would suggest to folks who need this level of data type reuse and/or > sophistication that they look into ASN.1 ... Perhaps most important, > all datatypes defined using ASN.1 have a standard, unambiguous method of > encoding the data for transmission on the wire. These rules are known as > the "Basic Encoding Rules", or BER, of ASN.1. A handful of points there. First, ASN.1 is generally accepted to be far more complex than is justifiable for most applications. Second, there are multiple syntaxes (the newer one is more cryptic than the original). Third, BER is not the only standardized encoding ... there's also DER, which is a bit more widely used. (X.509 certs use BER, but most everything else uses DER ... think of BER as "canonical DER".) Choices, choices. 
And fourth, DER and BER are examples of a philosophy of protocol development that's been largely discredited for mainstream applications: "bitstuffing". It was a design principle that bit efficiency was more important than time spent to encode or decode ... perhaps understandable for systems using X.25 networks where you more or less paid by the byte, but not on a LAN or even the Internet. Many folk think DER/BER should be the first to be put against the wall when the revolution (XML?) comes; they're that unpleasant to use. > So, I would use ... > ASN.1 based private protocol for custom application work. At the risk of touching off a religious war, I'd not suggest anyone inflict ASN.1 on themselves, ever! Unless they're plugging into an existing system based on it ... and even then, they should think about whether it's practical to replace/supplant that existing system. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From xml at 0000000.com Wed Apr 14 19:13:09 1999 From: xml at 0000000.com (xml) Date: Mon Jun 7 17:11:19 2004 Subject: recursion in XML parser Message-ID: <199904141843.LAA00065@0000000.com> Being a Java Virtual Machine guy, I can tell you definitively that string processing is something that Java tends to be slow at, especially if you're talking large datasets. That being said, if you're going to do Java and XML, it's smart to put Java wrappers around C/C++ code and let the compiled code do all of the storage-related functions. If you're someone like Sun, that's no big deal because the compiled code is hidden away. 
Indeed this is the case with many elements of the standard Java JDK/JRE from Sun. To the best of my knowledge though, the string handling classes are pure Java in their frameworks, which can make processing of collections of text pretty slow. Again if you're running server-side Java, usually all this means is that you have to play with your network and add more servers or processors. Doing client-side Java/XML is a different can of worms tho', since you can't just plop another processor into your average box at home. My XML parser is C with java wrappers on top of it for this reason as the target customer is a desktop computer or smaller. Of course, given a Solaris machine in a server role, the C code would perform that much better than the machine next to it running pure Java. The Java incarnation where I work is heavily dependent on SQL servers, a typical situation which externalizes a lot of C-based data handling (a la a Sybase or Oracle SQL server) and uses Java mainly to orchestrate how the data is processed instead of actually processing it much itself. For this reason, the performance _can_ be pretty good and Java/SQL can be a pretty nice match. XML presents a different set of performance considerations since the frameworks are often totally written in Java and rely on code that is also pure Java. It would be interesting to know if the collection classes in Java 2.0 and associated string classes may be moved to C and wrapped in Java. Using this in a Java/XML framework could minimize the performance problems. Thomas xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Wed Apr 14 20:25:13 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:11:19 2004 Subject: recursion in XML parser References: <000b01be867e$df8b4440$124d8bcf@total.net> Message-ID: <3714DCE8.8F262D@infinet.com> Didier PH Martin wrote: > Hi David, > > > Hmm -- they are somewhat slower than Expat, but that's because they're > running tight code loops in a virtual machine. Still, when I was > testing AElfred on a 166MHZ Pentium NT box back in late 1997, it could > parse about 1MB/second with a good VM and a JIT, and the other good > XML parsers are comparable in speed. > > Granted, Expat (with memory-mapped I/O) is about 10 times as fast as > the faster Java-based XML parser, but that's a very misleading figure: > in fact, the actual parsing usually occupies only a small amount of > the time required for XML processing -- most of the time is usually > taken up by your code that actually does something with the XML. > > Let's assume, then, that XML parsing occupies 10% of your > application's overhead. Even if you could build a parser that is > 1000% faster, you'd still gain only 9% in actual execution speed. > > > > I think that nobody would argue that Java has a lot of virtues that > certainly speed of not one of them. To take your numbers David, If that part > of the application is 10 times faster than any Java parser and that the app > itself is 10 times faster also. The overall throughput is therefore 10 times > faster. Which, in certain circumstances is what's required. We would then > have an apps with a 10 times faster throughput. 
The state of the art for > Java may change in the future as soon as other players like HP, Novell and > IBM bring to the table their own technology and that Java would finally have > the same competitive environment as other languages have. This could be > beneficial for the language evolution as more brains think on how to improve > the performances. > I think another point to David's findings is that microprocessors are on average becoming much faster and much cheaper every year. The reasons for this are myriad, but the inevitable consequence of this is that organizations and software companies will now buy more hardware to run slower software that is easier to maintain and more portable than buy ultra-efficient software that is targeted to a particular platform. When a new processor costs as much as one software engineer's consulting fee per hour, what kind of costs do you think an organization is more willing to bear? The unfortunate thing about computer science is that it is all based on math and all of the algorithms for basic data structures that have been devised are probably the only ones that will be around for a very long time unless someone comes out with some new mathematical breakthrough. So we are left enslaved to the engineers at chip companies who are left with the responsibility of figuring out ways to squeeze out more silicon per dollar than their competitors. As they say, there are only so many ways to skin a cat (-: Tyler xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SCODEB at saif.com Wed Apr 14 20:34:16 1999 From: SCODEB at saif.com (Scott Deboy) Date: Mon Jun 7 17:11:19 2004 Subject: Suggestions Message-ID: <3FF1D0E3A6ACD11192BB0001FA16A73A0162CCC2@saif.com> It sounds like my questions are all FAQs. I'm running through the xml-dev archives (the search engine sucks) and have found a few threads on compound/compositional documents, dynamic DTDs, transclusion and storing XML 'documents' in a relational database. It appears I have a number of choices available when it comes down to actually building XML dynamically, including XLL (TRANSCLUDE?) and XSL. I'm still trying to find a good resource to discuss these issues which obviously appear on the list month after month. Any other points GREATLY appreciated - Thanks again, Scott > ---------- > From: James Tauber[SMTP:jtauber@jtauber.com] > Sent: April 14, 1999 9:31 AM > To: Scott Deboy > Cc: XML-Dev Mailing list > Subject: Re: Suggestions > > > Thanks for the suggestions, James > > > > Yes, this is 80% of what I'm looking for. The other 20% is: > > > > If I could 'include' a reference to the optional paragraph (which will > live > > in a database) instead of directly embedding it in the XSL template (I > want > > to reuse the same optional paragraph in a number of documents). > > > > What if I store the paragraph references as database selects in the XSL > > template and do the work of retrieving them from the database before the > XSL > > engine does its work? Is there a better way? > > What you might do is "construct" the XSL stylesheet by piecing together > things in the database. 
You can do this just with your file system by > using > entities, seeing as XSL stylesheets are XML documents. > > > It sounds like it's time to read up on XSL. > > You won't regret it :-) > > James > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Wed Apr 14 21:45:31 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:11:19 2004 Subject: Megginson and XMLNews References: <3.0.5.32.19990412100407.00bfd4a0@corp> <370FB644.6F92D093@w3.org> <3.0.5.32.19990407163917.00bd2a60@corp> <14095.52616.339600.100498@localhost.localdomain> <3.0.5.32.19990412100407.00bfd4a0@corp> <3.0.5.32.19990413101500.00ce5cd0@corp> Message-ID: <3714E182.D920D348@prescod.net> Walter Underwood wrote: > > It probably gets us all the way there if the elements are used > the same way. This doesn't require any formal equivalence between > names, just a convention that things named the same work the > same. In other words, a SmallTalk object protocol is sufficient > here; there is no necessity for Java's Interface type. Other tools > may find that useful, but it is not necessary for search engines. 

Company A has a document type for encoding information conforming to their domain (e.g. "author"). Company B has another document type that uses different names (e.g. "creator"). They merge. Now we need to do searches of both repositories despite the fact that things are named and even structured differently in the two repositories.

This strikes me as the sort of problem that a robust search engine should solve.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

By lumping computers and televisions together, as if they exerted a single malign influence, pessimists have tried to argue that the electronic revolution spells the end of the sort of literate culture that began with Gutenberg's press. On several counts, that now seems the reverse of the truth.
http://www.economist.com/editorial/freeforall/19-12-98/index_xm0015.html

From daniela at cnet.com Wed Apr 14 22:59:46 1999
From: daniela at cnet.com (Daniel Austin)
Date: Mon Jun 7 17:11:19 2004
Subject: style standards (was Congrats Jon Bosak and Tim Bray)
Message-ID: <77A952A6B467D211855D00805F9521F11493F9@cnet10.cnet.com>

Simon,

> -----Original Message-----
> From: Simon St.Laurent [mailto:simonstl@simonstl.com]
> Sent: Wednesday, April 14, 1999 9:49 AM
> To: XML-Dev Mailing list
> Subject: style standards (was Congrats Jon Bosak and Tim Bray)

> CSS is:
>
> a) a standard (in as much as anything W3C is a standard)
> b) still in continuing development, according to various sources on
> XSL-list and www-styles

Everything about the web is continuously under development. The 'under construction' sign is always out, and probably always will be, at least for the foreseeable future.

Certainly one of the main problems with W3C standards is that they are always 'under construction', leading to instability in the area of standardization. CSS is a case in point.

We've developed a methodology for the internet where 'standards' are created through the process of iterative development, with implicit assumptions about backwards compatibility of the standards themselves and the products based on them. This is probably a reasonable response to an unreasonably rapid rate of technological change. But we cannot at the same time argue that flaws in current versions are going to be fixed in future iterations and expect users to accept this. This is the author's argument. Being 'under construction' cannot be used as a defense.

Regards,

D-

From Curt.Arnold at hyprotech.com Wed Apr 14 23:13:00 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun 7 17:11:19 2004
Subject: recursion in XML parser
Message-ID: <61DAD58E8F4ED211AC8400A0C9B468731AACC2@THOR>

It would be interesting to see what benefit something like Instantiations' (http://www.instantiations.com) JOVE high performance Java environment would have on an XML parsing benchmark. They haven't externally released their JOVE product (they have a bundling agreement with Inprise/Borland), but are apparently using it in house and have a crippled version (only builds javac) on their web site.
Basically, it takes the .class files for a Java application, analyzes the snot out of it (de-virtualizing calls when there is only one implementation, etc.) and builds a native NT EXE.

From martind at netfolder.com Thu Apr 15 00:01:46 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:11:19 2004
Subject: recursion in XML parser
In-Reply-To: <61DAD58E8F4ED211AC8400A0C9B468731AACC2@THOR>
Message-ID: <002201be86bf$abcd8140$124d8bcf@total.net>

Hi Arnold,

Do you mean that it creates native code? If yes, here is the answer to slooooow speed. In fact, you remind me of something. I have always tried to find out why there is no way to install bytecode and have an option to have it translated for my particular processor. The code could be delivered as bytecode but then be "cached" on my machine as native code. This is one of the mysteries of life, like why aren't we all millionaires? :-)

regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Arnold, Curt
Sent: Wednesday, April 14, 1999 5:11 PM
To: 'xml-dev@ic.ac.uk'
Subject: RE: recursion in XML parser

It would be interesting to see what benefit something like Instantiations' (http://www.instantiations.com) JOVE high performance Java environment would have on an XML parsing benchmark. They haven't externally released their JOVE product (they have a bundling agreement with Inprise/Borland), but are apparently using it in house and have a crippled version (only builds javac) on their web site. Basically, it takes the .class files for a Java application, analyzes the snot out of it (de-virtualizing calls when there is only one implementation, etc.) and builds a native NT EXE.

From robin at isogen.com Thu Apr 15 00:27:55 1999
From: robin at isogen.com (Robin Cover)
Date: Mon Jun 7 17:11:19 2004
Subject: Good News about TEI
Message-ID:

For readers who can appreciate the extreme importance of the pioneering SGML research effected through the Text Encoding Initiative (TEI), there's some very good news about the TEI's future. The TEI work began in 1987, and the 1994 Guidelines have been adopted widely. Michael Sperberg-McQueen, now co-chair of the W3C XML Schema Working Group, was the (North American) editor of the TEI Guidelines.
See the 'What's New' document of the SGML/XML Web Page for an overview, and references:

http://www.oasis-open.org/cover/sgmlnew.html

And, for readers now musing upon mechanisms for creating "custom" DTDs as documented in the recently-released W3C WD specification "Modularization of XHTML" (available from the W3C as http://www.w3.org/TR/xhtml-modularization), the TEI facility for customization may be of interest. See, as referenced in the above-mentioned news entry,

http://firth.natcorp.ox.ac.uk/TEI/nupizza.htm

Cheers,

Robin

From jhoward at requisite.com Thu Apr 15 01:08:05 1999
From: jhoward at requisite.com (Jerry Howard)
Date: Mon Jun 7 17:11:19 2004
Subject: A DTD puzzle: Need clarification
Message-ID: <37151DE6.6FFAD9FA@requisite.com>

Looking for some DTD clarification. In my current DTD, I have the following element:

[element declaration not preserved in the archive]

I am trying to strengthen my DTD to allow for either FIRST, LAST, or both, but I don't want the name tag to contain an empty value. The DTD above allows me to have this, but also allows for a NAME tag to be empty.

I have tried the following statement below (along with several variations), but several XML parsers (IBM, Sun, etc.) have not accepted it:

[element declaration not preserved in the archive]

I have browsed through several XML and SGML books looking for some assistance, but haven't found it. Any hints?

Thanks,
Jerry Howard

From richard at cogsci.ed.ac.uk Thu Apr 15 01:20:27 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:11:19 2004
Subject: A DTD puzzle: Need clarification
In-Reply-To: Jerry Howard's message of Wed, 14 Apr 1999 16:59:50 -0600
Message-ID: <12898.199904142320@doyle.cogsci.ed.ac.uk>

> [quoted element declaration not preserved in the archive]

The trouble with this is that it's not deterministic. When you get a FIRSTNAME, which one is it?

This is a deterministic version:

[element declaration not preserved in the archive]

-- Richard

From tbray at textuality.com Thu Apr 15 04:56:35 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:11:19 2004
Subject: recursion in XML parser
Message-ID: <3.0.32.19990414195326.012ec020@pop.intergate.bc.ca>

At 03:11 PM 4/14/99 -0600, Arnold, Curt wrote:
>It would be interesting to see what benefit something like Instantiations'
>(http://www.instantiations.com) JOVE high performance Java environment would
>have on an XML parsing benchmark.

I just downloaded the new JDK1.1.7 from java.ibm.com, and while I haven't done any quantitative testing, I can say that it is qualitatively *damn* fast, noticeably quicker on everything I do. -Tim

From tbray at textuality.com Thu Apr 15 05:01:09 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:11:19 2004
Subject: Congrats Jon Bosak and Tim Bray
Message-ID: <3.0.32.19990414195945.012c3b60@pop.intergate.bc.ca>

At 04:38 PM 4/13/99 -0400, Simon St.Laurent wrote:
>Nice article, though once again CSS is ignored and the strange new creature
>known as XSL is promoted as "The standard now being developed for XML
>stylesheets", a misuse of the definite article 'the'.

Actually, I think this criticism is justified; the article would have been better had we pointed out that there is already a perfectly-good stylesheet language in place (if imperfectly implemented). I can say that the editorial process was somewhat fraught and space at a definite premium, but still.

This is particularly ironic since I have been using every other pulpit at my disposal (cf xml.com, conference keynotes, webstandards.org) more or less continuously for the last year to call for conformant XML+CSS+DOM now, and leave the future to the future.

Having said that, I think that the quoted phrase is a reasonably accurate description of XSL. -Tim

From marcelo at mds.rmit.edu.au Thu Apr 15 05:12:34 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun 7 17:11:19 2004
Subject: recursion in XML parser
In-Reply-To: <14100.48776.595362.585073@localhost.localdomain>; from David Megginson on Wed, Apr 14, 1999 at 09:16:06AM -0700
References: <14099.53301.541630.649928@localhost.localdomain> <000b01be867e$df8b4440$124d8bcf@total.net> <14100.48776.595362.585073@localhost.localdomain>
Message-ID: <19990415131218.B16021@io.mds.rmit.edu.au>

On Wed, Apr 14, 1999 at 09:16:06AM -0700, David Megginson wrote:
> Didier PH Martin writes:
>
> > I think that nobody would argue that Java has a lot of virtues, but
> > certainly speed is not one of them. To take your numbers, David: if
> > that part of the application is 10 times faster than any Java
> > parser, and the app itself is 10 times faster also, the overall
> > throughput is therefore 10 times faster.
>
> That's not necessarily the case -- C/C++ have some advantages for fast
> I/O that Java doesn't share, but if your other code is not I/O-bound,
> and if it doesn't require small, tight processing loops, the speed
> difference for the non-parsing code might be much less significant
> (depending on how efficient your VM and OS are at memory-management).

Don't forget string handling. C/C++ handle strings significantly faster than Java, and this is generally what one would expect to find in an application whose domain involves parsing XML.

One other thing does perplex me. I would have expected I/O bound behaviour to level Java and C/C++ rather than increase the disparity.
I'd be interested to know the details.

Cheers,
Marcelo

--
http://www.simdb.com/~marcelo/

From begeddov at jfinity.com Thu Apr 15 06:23:10 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun 7 17:11:19 2004
Subject: recursion in XML parser
References: <14099.53301.541630.649928@localhost.localdomain> <000b01be867e$df8b4440$124d8bcf@total.net> <14100.48776.595362.585073@localhost.localdomain> <19990415131218.B16021@io.mds.rmit.edu.au>
Message-ID: <371569AA.974D1BB@jfinity.com>

Marcelo Cantos wrote:
> Don't forget string handling. C/C++ handle strings significantly
> faster than Java, and this is generally what one would expect to find
> in an application who's domain involves parsing XML.
>
> One other thing does perplex me. I would have expected I/O bound
> behaviour to level Java and C/C++ rather increase the disparity. I'd
> be interested to know the details.

Java does not expose many of the I/O capabilities that are synonymous with high performance. Examples include memory mapped files and asynchronous I/O. Heck, it doesn't even expose non-blocking I/O.

Even ignoring these omissions, there are other issues with the core libraries that cause lower performance. Allan Heydon and Marc Najork of the Mercator project (see URL below) have posted a paper titled "Performance Limitations of the Java Core Libraries" that discusses both string and I/O related problems (among others). See the bottom of the page at:

http://www.research.digital.com/SRC/mercator/research.html

Gabe Beged-Dov
www.jfinity.com

From reschke at medicaldataservice.de Thu Apr 15 09:47:36 1999
From: reschke at medicaldataservice.de (Julian Reschke)
Date: Mon Jun 7 17:11:20 2004
Subject: ASN.1
In-Reply-To: <3714C9CA.2AE67BBB@pacbell.net>
Message-ID: <001601be8714$1e2ace10$cf00a8c0@nbreschke>

> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> David Brownell
> Sent: Wednesday, April 14, 1999 7:01 PM
> To: xml-dev@ic.ac.uk
> Subject: Re: ASN.1
>
> ...
>
> And fourth, DER and BER are examples of a philosophy of protocol development
> that's been largely discredited for mainstream applications: "bitstuffing".
> It was a design principle that bit efficiency was more important than time
> spent to encode or decode ... perhaps understandable for systems using X.25
> networks where you more or less paid by the byte, but not on a LAN or even
> the Internet. Many folk think DER/BER should be the first to be put against
> the wall when the revolution (XML?) comes; they're that unpleasant to use.
>
> ...

I would be extremely careful with this. There will always be a reason to stick as much data as possible into your byte stream. Right now people pay a premium in both performance and price for IP over cell phones, and even if this gets better in a few years from now, there will always be yet another case where you want optimal usage of your bandwidth (IP over satellites for example).

From h.rzepa at ic.ac.uk Thu Apr 15 10:14:28 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 17:11:20 2004
Subject: CD-ROM of list archives
Message-ID:

>Henry Rzepa,
>... In looking
>for the archive address I noticed that the archives are available on
>CD-ROM? Amazon.com did not find a match for the ISBN number listed
>"981-02-3594-1." I have a LAN connection at work and a cable modem at
>home so accessing the archives is not a serious problem but there are
>service outages and congestion on the Net from time to time. Can you
>provide further information on the CD-ROM version?

In case anyone else wants this information, the CD can be purchased via http://www.wspc.com.sg/books/chemistry/p157.html

I am currently mystified why the ISBN I was originally given, ISBN 981-02-3594-1, is entirely different to the one that appears on the Web page above, ISBN 1-86094-183-4. Neither shows up in Amazon, but given our publisher also has their own online ordering, I am not too surprised.

Dr Henry Rzepa, Dept. Chemistry, Imperial College, LONDON SW7 2AY;
mailto:rzepa@ic.ac.uk; Tel (44) 171 594 5774; Fax: (44) 171 594 5804.
URL: http://www.ch.ic.ac.uk/rzepa/

From chris at w3.org Thu Apr 15 12:44:19 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:20 2004
Subject: problem with IE5
References: <2F2DC5CE035DD1118C8E00805FFE354C0F36271B@RED-MSG-56> <37110012.426FBAD8@eng.sun.com>
Message-ID: <3714D04F.F9131641@w3.org>

David Brownell wrote:
>
> Looks to me like:
>
> (a) IE5 uses a nonvalidating XML 1.0 parser (modulo bugs)
> for documents it tries to display;

Yes. Further, it is a parser which uses the full infoset - external DTD subset and external entities are fetched.

> (b) IE5 however REQUIRES conformance to the namespace spec,
> and thus rejects some well formed XML 1.0 documents,
> such as Richard's original;

I don't actually have a problem with that; the XML namespace spec extends the XML 1.0 spec, and the XML 1.0 spec gave fair warning about colon-containing names and upcoming namespace work.

> (c) It also REQUIRES any "xmlns*" attributes found in a DTD
> to be #FIXED (which is good style) and so rejects documents
> which don't have #FIXED, yet conform to the namespace spec;

Good style is one thing, but if the XML ns spec does not require fixed then their parser should not either (I am not online at the moment so unable to check the spec).

> (d) It also REQUIRES a redundant declaration of such xmlns
> attributes on elements, even in cases where the XML 1.0
> specification requires the #FIXED default to be provided
> from the processor (and the namespace spec requires it
> to be used, effectively 'inherited');

Can you post an example, for checking with other parsers too?

> (e) It has some other conformance issue, where the namespace
> declaration on just the "test" element doesn't work. This
> might be related to the issue (d) above.

An example would help here also.

> Chris -- is this basically accurate?
>
> [quoted XML example not preserved in the archive; only "]>" and "123" survive]

I found that the redundant declaration in the instance was not required by the XML processors that I had available. There are some examples in the new (12 April) SVG spec [1] where the namespace declaration is in the DTD and is not repeated in the instance; when viewed in IE5, the source code is displayed (since there is no stylesheet PI) and the namespace attribute turns up as do the assorted XLink attributes which are also declared just once in the DTD.

[1] http://www.w3.org/TR/WD-SVG

--
Chris

From david-b at pacbell.net Thu Apr 15 16:26:28 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun 7 17:11:20 2004
Subject: ASN.1
References: <001601be8714$1e2ace10$cf00a8c0@nbreschke>
Message-ID: <3715F6D3.AF8325BE@pacbell.net>

Julian Reschke wrote:
>
> > And fourth, DER and BER are examples of a philosophy of protocol
> > development that's been largely discredited for mainstream applications:
> > "bitstuffing". ...
> > Many folk think DER/BER should be the first to be
> > put against the wall when the revolution (XML?) comes; they're
> > that unpleasant to use.
> > ...
>
> I would be extremely careful with this. There will always be a reason to
> stick as much data as possible into your byte stream.
> Right now people
> pay a premium in both performance and price for IP over cell phones,

That isn't "mainstream", either in quantity or bitrate (14.4Kb).

Consider, though: the DER encoding of 32 bit numbers takes more than 32 bits. One aspect of "bitstuffing" the DER/BER way is what one might call "The Joy of BitFields" -- used all over the place to flag and tag data. In this case, bitstuffed != space-efficient.

Note that I wasn't advocating _wasting_ bandwidth. There are systems where bandwidth is a critical resource ... and it's never completely free, so it shouldn't be wasted. That's not the same as designing a protocol down at the bit level.

- Dave

From creitzel at mediaone.net Thu Apr 15 19:21:06 1999
From: creitzel at mediaone.net (Charles Reitzel)
Date: Mon Jun 7 17:11:20 2004
Subject: ASN.1
Message-ID: <199904151720.NAA08903@chmls06.mediaone.net>

David Brownell wrote:
>Charles Reitzel wrote:
>> I would suggest to folks who need this level of data type reuse and/or
>> sophistication that they look into ASN.1 ... Perhaps most important,
>> all datatypes defined using ASN.1 have a standard, unambiguous method of
>> encoding the data for transmission on the wire. These rules are known as
>> the "Basic Encoding Rules", or BER, of ASN.1.
>
>A handful of points there. First, ASN.1 is generally accepted to be far
>more complex than is justifiable for most applications. Second, there
>are multiple syntaxes (the newer one is more cryptic than the original).
>Third, BER is not the only standardized encoding ... there's also DER, which
>is a bit more widely used.
>(X.509 certs use BER, but most everything else
>uses DER ... think of BER as "canonical DER".) Choices, choices.

Never heard of DER. X.500, SSL, LDAP and SNMP all use BER. BER is really quite simple (or at least the subset needed by the above protocols). Open Source libs to encode/decode BER abound. It boils down to defining a decent set of primitive types and their respective on-the-wire images. Complex types are sequences of the primitive types. Is that a bad thing? Perhaps not appropriate for some applications, hence XML.

>And fourth, DER and BER are examples of a philosophy of protocol development
>that's been largely discredited for mainstream applications: "bitstuffing".
>It was a design principle that bit efficiency was more important than time
>spent to encode or decode ... perhaps understandable for systems using X.25
>networks where you more or less paid by the byte, but not on a LAN or even
>the Internet. Many folk think DER/BER should be the first to be put against
>the wall when the revolution (XML?) comes; they're that unpleasant to use.

Never heard of bit stuffing either. Philosophy is the least of my concerns. Rather, I care about the ability to handle standard and custom data types in a heterogeneous network of cooperating applications!

>> So, I would use ...
>> ASN.1 based private protocol for custom application work.
>
>At the risk of touching off a religious war, I'd not suggest anyone
>inflict ASN.1 on themselves, ever! Unless they're plugging into an
>existing system based on it ... and even then, they should think about
>whether it's practical to replace/supplant that existing system.

Well, reasonable people can agree to disagree. Clearly, there are many people who feel as you do. No doubt the full blown ASN.1 is a bit of a bear. I don't think any one protocol uses all of it. It has solved, in an elegant way, some of the problems that remain ugly in XML (to wit, FPIs and namespaces, encoding of standard datatypes: numeric, date, time, etc.).
Anyone who can write Java or C++ will have no problem learning to write ASN.1 statements. I submit that it is an expeditious approach to application protocol development.

Bottom line, I'm saying leave that messy inheritance stuff out of XML because it's already been done in a standard way by ASN.1.

Best regards,
Charles Reitzel

From Michael.Kay at icl.com Thu Apr 15 20:09:35 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:11:20 2004
Subject: recursion in XML parser
Message-ID: <93CB64052F94D211BC5D0010A80013310EB400@WWMESS3.172.19.125.2>

> My reasoning for not using recursion was performance
> (function call/stack framing
> considerations) and that it made the code easier to understand.

Did not-using-recursion make it faster? I'd be surprised. The superstition that recursion is slow dates back to COBOL and IBM 360 days, i.e. to machine architectures with very inefficient memory architectures and subroutine calls. I don't know much about the Java VM, but I doubt it shares those characteristics.

"Easier to understand" is obviously in the eye of the beholder, but in my view recursive algorithms are usually far easier to understand than their non-recursive equivalents.

> It would be interesting to do some benchmarks on various
> parsers out there to measure performance. The Java parsers I've tested (Sun, IBM)
> are _dog_ slow compared to expat, etc.

How slow is a dog (greyhounds are quite fast)? What kind of factor are you talking about? A lot depends on your Java VM implementation.
My experience is that most of the mill is used in my application, not in the parser.

> For server-side I don't think that matters, since in the corporate scene people
> tend to just add more servers/infrastructure and not worry about performance.

Not true when you're serving a million pages a day! Except that in that scenario we cache the rendered pages so we don't keep re-rendering the same thing. But a factor of 2 in performance is definitely worth investing in.

> Client-side XML is a completely different kettle o' fish tho'
> since you can't just keep popping in processors every time your machine at
> home/work bogs down.

On the contrary: you've got a rather strange client configuration if it takes longer to render a page on your machine than to download it.

Mike Kay

From Michael.Kay at icl.com Thu Apr 15 20:25:03 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:11:20 2004
Subject: ASN.1
Message-ID: <93CB64052F94D211BC5D0010A80013310EB401@WWMESS3.172.19.125.2>

> At the risk of touching off a religious war, I'd not suggest anyone
> inflict ASN.1 on themselves, ever!

As one who has used ASN.1 in the past and uses XML now, I agree absolutely. Though I do regret that the designers of XML didn't include some of the nicer features of ASN.1, specifically data typing.

There are actually two problems with the "bitstuffing" style of encoding: firstly it's terribly hard to write the code to generate it and parse it; secondly you only have to get one bit wrong and the guy at the other end can't extract anything from your message.
We spent years getting X.400 mail systems to interoperate as a result.

Mike Kay

From paul at prescod.net Fri Apr 16 00:00:59 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:20 2004
Subject: ASN.1 and XML
References: <199904151720.NAA08903@chmls06.mediaone.net>
Message-ID: <37165C68.DE2FBF2A@prescod.net>

Charles Reitzel wrote:
>
> Bottom line, I'm saying leave that messy inheritance stuff out of XML
> because it's already been done in a standard way by ASN.1.

But documents need the "messy inheritance stuff" as much as data protocols do.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"In spite of everything I still believe that people are basically good at heart." - Anne Frank

From marcelo at mds.rmit.edu.au Fri Apr 16 00:28:32 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun 7 17:11:20 2004
Subject: ASN.1
In-Reply-To: <199904151720.NAA08903@chmls06.mediaone.net>; from Charles Reitzel on Thu, Apr 15, 1999 at 01:20:51PM -0400
References: <199904151720.NAA08903@chmls06.mediaone.net>
Message-ID: <19990416082815.A3339@io.mds.rmit.edu.au>

On Thu, Apr 15, 1999 at 01:20:51PM -0400, Charles Reitzel wrote:
> David Brownell wrote:
> >Charles Reitzel wrote:
> >> I would suggest to folks who need this level of data type reuse and/or
> >> sophistication that they look into ASN.1 ... Perhaps most important,
> >> all datatypes defined using ASN.1 have a standard, unambiguous method of
> >> encoding the data for transmission on the wire. These rules are known as
> >> the "Basic Encoding Rules", or BER, of ASN.1.
> >
> >A handful of points there. First, ASN.1 is generally accepted to be far
> >more complex than is justifiable for most applications. Second, there
> >are multiple syntaxes (the newer one is more cryptic than the original).
> >Third, BER is not the only standardized encoding ... there's also DER, which
> >is a bit more widely used. (X.509 certs use BER, but most everything else
> >uses DER ... think of BER as "canonical DER".) Choices, choices.

I would have thought DER was "canonical BER". DER stands for Definitive Encoding Rules, doesn't it?

> Never heard of DER. X.500, SSL, LDAP and SNMP all use BER. BER is really

Don't forget Z39.50 (a very important protocol in the retrieval world). It also uses BER (though I don't know if it mandates it).
Cheers,
Marcelo

--
http://www.simdb.com/~marcelo/

From andrewl at microsoft.com Fri Apr 16 00:58:42 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:11:20 2004
Subject: A DTD puzzle: Need clarification
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF32C@RED-MSG-08>

The XML specification contains a non-normative (meaning "not-a-requirement") section dealing with non-deterministic content models. Perhaps the parsers in question enforce determinism.

From tbray at textuality.com Fri Apr 16 01:08:49 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:11:20 2004
Subject: A DTD puzzle: Need clarification
Message-ID: <3.0.32.19990415160726.0122db50@pop.intergate.bc.ca>

At 03:57 PM 4/15/99 -0700, Andrew Layman wrote:
>The XML specification contains a non-normative (meaning "not-a-requirement")
>section dealing with non-deterministic content models. Perhaps the parsers
>in question enforce determinism.

I think it's normative. I don't agree with it and think it should be done away with, but I do think it's normative. -Tim
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jmcdonou at library.berkeley.edu Fri Apr 16 01:44:58 1999 From: jmcdonou at library.berkeley.edu (Jerome McDonough) Date: Mon Jun 7 17:11:20 2004 Subject: ASN.1 In-Reply-To: <19990416082815.A3339@io.mds.rmit.edu.au> References: <199904151720.NAA08903@chmls06.mediaone.net> <199904151720.NAA08903@chmls06.mediaone.net> Message-ID: <3.0.5.32.19990415163158.01f66b00@library.berkeley.edu> At 08:28 AM 4/16/1999 +1000, Marcelo Cantos wrote: >Don't forget Z39.50 (a very important protocol in the retrieval >world). It also uses BER (though I don't know if it mandates it). > I believe it does; at least, in its definitions section it defines ASN.1 as specified in both ISO 8824 (the ASN.1 standard) and ISO 8825 (the BER standard). Jerome McDonough -- jmcdonou@library.Berkeley.EDU | (......) Library Systems Office, 386 Doe, U.C. Berkeley | \ * * / Berkeley, CA 94720-6000 (510) 642-5168 | \ <> / "Well, it looks easy enough...." | \ -- / SGNORMPF!!! -- From the Famous Last Words file | |||| xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From amk1 at erols.com Fri Apr 16 02:36:13 1999 From: amk1 at erols.com (A.M. 
Kuchling) Date: Mon Jun 7 17:11:20 2004 Subject: ANNOUNCE: Python/XML release 0.5.1 Message-ID: <199904160041.UAA00458@207-172-56-165.s165.tnt12.ann.va.dialup.rcn.com> Version 0.5.1 of the Python/XML distribution is now available. It should be considered a beta release, and can be downloaded from the following URLs: http://www.python.org/sigs/xml-sig/files/xml-0.5.1.tgz http://www.python.org/sigs/xml-sig/files/xml051.zip Changes in this version: * A sizable DOM test suite has been written. The test suite turned up a number of bugs which have all been fixed, which greatly increases my confidence in PyDOM's compliance with the DOM Recommendation. Test suites have also been added for various other modules. * Added marshalling into various XML-based formats: a generic one for Python objects, WDDX, and XML-RPC. The generic marshaller can be subclassed to implement marshalling into a specific DTD; both the WDDX and XML-RPC marshallers were implemented in this fashion. * Collected the licences for everything into a LICENCE file. * Various subpackages (sgmlop, xmlarch, xmlproc, expat) have been upgraded to their most recent versions. The Python/XML distribution contains the basic tools required for processing XML data using the Python programming language, assembled into one easy-to-install package. The distribution includes parsers and standard interfaces such as SAX and DOM, along with various other useful modules. The package currently contains: * XML parsers: Pyexpat (Jack Jansen), xmlproc (Lars Marius Garshol), xmllib.py (Sjoerd Mullender) using the sgmlop.c accelerator module (Fredrik Lundh). * SAX interface (Lars Marius Garshol) * DOM interface (Stefane Fermigier, A.M. 
Kuchling)
* xmlarch.py, for architectural forms processing (Geir Ove Grønmo)
* Unicode wide-string module (Martin von Löwis)
* Various utility modules and functions (various people)
* Documentation and example programs (various people)

The code is being developed bazaar-style by contributors from the Python XML Special Interest Group, so please send comments, questions, or bug reports to .

For more information about Python and XML, see: http://www.python.org/topics/xml/

--
A.M. Kuchling http://starship.python.net/crew/amk/
The days come and go like muffled and veiled figures sent from a distant friendly party, but they say nothing, and if we do not use the gifts they bring, they carry them as silently away. -- Ralph Waldo Emerson

From david-b at pacbell.net Fri Apr 16 03:54:58 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun 7 17:11:20 2004
Subject: A DTD puzzle: Need clarification
References: <3.0.32.19990415160726.0122db50@pop.intergate.bc.ca>
Message-ID: <37169808.41AE0343@pacbell.net>

Tim Bray wrote:
>
> At 03:57 PM 4/15/99 -0700, Andrew Layman wrote:
> >The XML specification contains a non-normative (meaning "not-a-requirement")
> >section dealing with non-deterministic content models. Perhaps the parsers
> >in question enforce determinism.
>
> I think it's normative. I don't agree with it and think it should be
> done away with, but I do think it's normative. -Tim

In section 3.2.1 it says "for compatibility, it is an error..." and points at Appendix E, "Non-Deterministic Content Models (Non-Normative)" for details.
Seems to me that 3.2.1 is normative. However, in translation, that means that processors may do pretty much whatever they like there ... but documents mustn't depend on such DTD constructs ever working, or even being detected by an XML processor. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Fri Apr 16 06:01:07 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:11:20 2004 Subject: recursion in XML parser In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB400@WWMESS3.172.19.125.2>; from Kay Michael on Thu, Apr 15, 1999 at 07:06:37PM +0100 References: <93CB64052F94D211BC5D0010A80013310EB400@WWMESS3.172.19.125.2> Message-ID: <19990416140037.B3339@io.mds.rmit.edu.au> On Thu, Apr 15, 1999 at 07:06:37PM +0100, Kay Michael wrote: > > My reasoning for not using recursion was performance (function > > call/stack framing considerations) and that it made the code > > easier to understand. > > Did not-using-recursion make it faster? I'd be surprised. The > superstition that recursion is slow dates back to COBOL and IBM 360 > days, i.e. to machine architectures with very inefficient memory > architectures and subroutine calls. I don't know much about the Java > VM, but I doubt it shares those characteristics. I can't speak for the JVM, but it is far from safe to generalise and state that a function call is as fast as a stack push, particularly when the programmer knows exactly what needs to be pushed. Moreover, modern architectures often penalise you heavily for deep recursion. For instance, the SPARC architecture uses register windowing. 
The register window allows access to three register groups (something like in, local, out). Passing parameters to a function in this scheme typically involves setting the out registers as required and moving the register window down by two groups. Consequently, when the subroutine is called, the calling function's out registers become the called function's in registers. This technique is blindingly fast, until you run off the end of the register stack. When that happens, a register trap is invoked which saves two complete register groups out to main memory to free up more space. In short, when you recurse too deeply and too frequently, you will be hitting main memory in a big way and performance will suffer enormously. In shorter, recursion is not necessarily fast, even today. > "Easier to understand" is obviously in the eye of the beholder, but > in my view recursive algorithms are usually far easier to understand > than their non-recursive equivalents. I don't think anyone disagrees here, but sometimes performance is far more important than clarity. > > It would be interesting to do some benchmarks on various parsers > > out there to measure performance. The Java parsers I've tested > > (Sun, IBM) are _dog_ slow compared to expat, etc. > > How slow is a dog (greyhounds are quite fast)? What kind of factor > are you talking about? A lot depends on your Java VM implementation. > My experience is that most of the mill is used in my application, > not in the parser. But with Java you suffer (vs. C/C++) across the board, not just in the parser. > > For server-side I don't think that matters, since in the corporate > > scene people tend to just add more servers/infrastructure and not > > worry about performance. > > Not true when you're serving a million pages a day! Except that in > that scenario we cache the rendered pages so we don't keep > re-rendering the same thing. But a factor of 2 in performance is > definitely worth investing in. 
It is more true than ever when you are serving a million pages a day. Take AltaVista. Last I heard they had six huge servers dishing out HTML. It is often far less expensive to add hardware than to rearchitect software. > > Client-side XML is a completely different kettle o' fish tho' > > since you can't just keep popping in processors every time your > > machine at home/work bogs down. > > On the contrary: you've got a rather strange client configuration if > it takes longer to render a page on your machine than to download > it. Got some perf. specs on a LAN? I can easily imagine a Java XML parser on the client slowing things down noticeably. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david-b at pacbell.net Fri Apr 16 09:52:51 1999 From: david-b at pacbell.net (David Brownell) Date: Mon Jun 7 17:11:21 2004 Subject: recursion in XML parser References: <93CB64052F94D211BC5D0010A80013310EB400@WWMESS3.172.19.125.2> <19990416140037.B3339@io.mds.rmit.edu.au> Message-ID: <3716EC11.6AF1966C@pacbell.net> Marcelo Cantos wrote: > > I can't speak for the JVM, but it is far from safe to generalise and > state that a function call is as fast as a stack push, particularly > when the programmer knows exactly what needs to be pushed. There's no reason a nonvirtual function call shouldn't compile to be just the stack operation. If you're using a virtual function call, the same reasoning applies as for a C++/Obj-C/... virtual function call. 
Namely, that one can't just say "it's not free"; a comparison needs to include the cost of an alternative with the same functionality. And curiously enough, when you do those comparisons, the functionality seems to be relatively cheaper when packaged as a "virtual function call" than when packaged as an if/then/else/... set of data operations, or other alternatives. This discussion seems pretty odd to me. Exactly what alternative is being advocated? Remember that per-element state _must_ be maintained when parsing XML, and the model is a stack. Whether that stack gets maintained using the CPU stack or explicit emulation in some other memory data structure, it'll be there. Function calls use the CPU stack, and clean up very efficiently. Explicit emulation uses a different memory segment; and needs more work to GC correctly. > Moreover, modern architectures often penalise you heavily for deep > recursion. For instance, the SPARC architecture uses register > windowing. ... Which can be bypassed by modern compilers for those applications where it matters. For example, graphics algorithms tend to need lots of registers (e.g. VIS code) and device drivers need to have predictable latencies (that is, they can't afford to flush windows in a time-critical interrupt handler). In short, that argument doesn't wash; it's a code generation issue, not a problem with recursion. - Dave xml-dev: A list for W3C XML Developers. 
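The point about per-element state can be illustrated with a small Python sketch (not code from the thread). It builds the same element tree twice: once using the call stack and once using an explicit stack. The event tuples, the dictionary layout, and the function names are assumptions made for illustration; a real parser would also carry attributes and text.

```python
def build_recursive(events, i=0):
    """Parse one element starting at events[i]; return (node, next index).

    Open elements are tracked implicitly by the call stack.
    """
    kind, name = events[i]
    if kind != "start":
        raise ValueError("expected a start tag")
    node = {"name": name, "children": []}
    i += 1
    # Each nested start tag is handled by a nested call.
    while i < len(events) and events[i][0] == "start":
        child, i = build_recursive(events, i)
        node["children"].append(child)
    if i >= len(events) or events[i] != ("end", name):
        raise ValueError(f"element {name!r} is not closed properly")
    return node, i + 1


def build_iterative(events):
    """Build the same tree, tracking open elements in an explicit list."""
    root = None
    stack = []  # currently open elements, innermost last
    for kind, name in events:
        if kind == "start":
            node = {"name": name, "children": []}
            if stack:
                stack[-1]["children"].append(node)
            elif root is None:
                root = node
            else:
                raise ValueError("more than one root element")
            stack.append(node)
        elif kind == "end":
            if not stack or stack[-1]["name"] != name:
                raise ValueError(f"unexpected end tag {name!r}")
            stack.pop()
        else:
            raise ValueError(f"unknown event kind {kind!r}")
    if stack or root is None:
        raise ValueError("document ended with open elements")
    return root


events = [("start", "a"), ("start", "b"), ("end", "b"),
          ("start", "c"), ("end", "c"), ("end", "a")]
tree, consumed = build_recursive(events)
assert consumed == len(events)
assert tree == build_iterative(events)
```

Either way the stack exists; only its location differs. In CPython specifically, the recursive version is bounded by the interpreter's recursion limit, while the explicit list is bounded only by memory, which is a property of that runtime rather than of recursion in general.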
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david-b at pacbell.net Fri Apr 16 10:03:11 1999 From: david-b at pacbell.net (David Brownell) Date: Mon Jun 7 17:11:21 2004 Subject: ASN.1 References: <199904151720.NAA08903@chmls06.mediaone.net> <19990416082815.A3339@io.mds.rmit.edu.au> Message-ID: <3716EE80.D0098A3C@pacbell.net> > > >Third, BER is not the only standardized encoding ... there's also DER, which > > >is a bit more widely used. (X.509 certs use BER, but most everything else > > >uses DER ... think of BER as "canonical DER".) Choices, choices. > > I would have thought DER was "canonical BER". DER stands for > Definitive Encoding Rules, doesn't it? "Definite", not "Definitive" ... but right, X.509 uses DER so there is only one way to encode, and DER subsets BER by removing options. Switching those two was a test to see who else knows their stuff! :-) > > Never heard of DER. X.500, SSL, LDAP and SNMP all use BER. X.509 and various other systems (including SSL) use DER. Those of us who've implemented them know it all too well. > > BER is really > > quite simple (or at least the subset needed by the above protocols). Open > > Source libs to encode/decode BER abound. I've seen many of them, and have been amused at how inconsistent their bug sets are. If you want "simple" look at XDR, which has many more such open source libs, generally without any bugs. (Of course in both cases you need to like binary data formats.) - Dave xml-dev: A list for W3C XML Developers. 
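The remark that DER subsets BER by removing options can be made concrete with a short Python sketch for one type, INTEGER. This is an illustration under stated assumptions, not code from any library mentioned in the thread: the function names are invented, and only definite-length encodings of a single primitive value are handled.

```python
def der_length(n: int) -> bytes:
    """Length octets in the shortest definite form, the only form DER allows."""
    if n < 0x80:
        return bytes([n])                      # short form: one octet
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body    # long form: count, then length


def der_integer(value: int) -> bytes:
    """Encode an ASN.1 INTEGER: tag 0x02, length, minimal two's complement."""
    bits = value.bit_length() if value >= 0 else (value + 1).bit_length()
    size = bits // 8 + 1                       # leaves room for the sign bit
    body = value.to_bytes(size, "big", signed=True)
    return b"\x02" + der_length(len(body)) + body


def ber_read_integer(data: bytes) -> int:
    """Decode an INTEGER, accepting any definite length form BER permits."""
    if data[0] != 0x02:
        raise ValueError("not an INTEGER tag")
    first = data[1]
    if first < 0x80:
        length, offset = first, 2
    elif first == 0x80:
        raise ValueError("indefinite length is not valid for INTEGER")
    else:
        count = first & 0x7F
        length = int.from_bytes(data[2:2 + count], "big")
        offset = 2 + count
    return int.from_bytes(data[offset:offset + length], "big", signed=True)


# DER has exactly one encoding for the value 5.
assert der_integer(5) == bytes.fromhex("020105")
# BER also accepts a long-form length where the short form would do.
assert ber_read_integer(bytes.fromhex("02810105")) == 5
```

A BER reader therefore has to accept several byte strings for the same value, while a DER writer must always choose the single shortest one.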
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From nwoh at software-ag.de Fri Apr 16 11:53:40 1999 From: nwoh at software-ag.de (Hutchison, Nigel) Date: Mon Jun 7 17:11:21 2004 Subject: ASN.1 Message-ID: <005355AD0596D211B4F30000F81B0D324C97C5@daemsg01.software-ag.de> > -----Original Message----- > From: Julian Reschke [SMTP:reschke@medicaldataservice.de] > Sent: Thursday, April 15, 1999 9:47 AM > To: David Brownell; xml-dev@ic.ac.uk > Subject: RE: ASN.1 > > > > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > > David Brownell > > Sent: Wednesday, April 14, 1999 7:01 PM > > To: xml-dev@ic.ac.uk > > Subject: Re: ASN.1 > > > >... > > > > And fourth, DER and BER are examples of a philosophy of protocol > > development > > that's been largely discredited for mainstream applications: > > "bitstuffing". > > It was a design principle that bit efficiency was more important than > time > > spent to encode or decode ... perhaps understandable for systems > > using X.25 > > networks where you more or less paid by the byte, but not on a LAN or > even > > the Internet. Many folk think DER/BER should be the first to be > > put against > > the wall when the revolution (XML?) comes; they're that unpleasant to > use. > > > > ... > > I would be extremely careful with this. There will always be a reason to > stick as much as data as possible into a your byte stream. Right now > people > pay a premium in both performance and price for IP over cell phones, and > even if this gets better in a few years from now, there will always be yet > another case where you want optimal usage of your bandwidth (IP over > satellites for example). 
>
[Nigel Hutchison] I would have thought that the best way of dealing with this issue is to use a "pleasant" syntax which was easy to process and implement another layer to compress the payload for transmission.

Nigel W. O. Hutchison
Chief Architect, Software AG Germany
Tel: +49 6151 92 1207
Email nwoh@software-ag.de
>

From spreitze at parc.xerox.com Fri Apr 16 15:32:34 1999
From: spreitze at parc.xerox.com (Mike Spreitzer)
Date: Mon Jun 7 17:11:21 2004
Subject: ASN.1
In-Reply-To: <005355AD0596D211B4F30000F81B0D324C97C5@daemsg01.software-ag.de>
Message-ID: <003901be880d$6e30b1f0$1776020d@phobos.parc.xerox.com>

Nigel Hutchison wrote: [[
I would have thought that the best way of dealing with this issue is to use a "pleasant" syntax which was easy to process and implement another layer to compress the payload for transmission.
]]

Best in some contexts, but not all. The compression layer has runtime costs, in both code and memory footprint, and processing time. In some contexts (e.g., very resource-constrained items like cell phones and Palm Pilots), these costs can be significant.

Mike
xml-dev: A list for W3C XML Developers.
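The layering Nigel Hutchison suggests (readable markup, with compression as a separate step before transmission) can be sketched with zlib from the Python standard library. The record layout and the helper names `pack` and `unpack` are invented for illustration. The sketch only shows the size effect on repetitive markup; it says nothing about the footprint and CPU costs Mike Spreitzer raises.

```python
import zlib


def pack(xml_text: str) -> bytes:
    """Compress a document for the wire; the markup itself stays verbose."""
    return zlib.compress(xml_text.encode("utf-8"))


def unpack(blob: bytes) -> str:
    """Undo pack(): decompress and decode back to text."""
    return zlib.decompress(blob).decode("utf-8")


# A deliberately long-winded document with descriptive tag names.
records = "".join(
    f"<customer-record><identifier>{i}</identifier></customer-record>"
    for i in range(200)
)
doc = f"<customer-list>{records}</customer-list>"

wire = pack(doc)
assert unpack(wire) == doc
print(len(doc.encode("utf-8")), "bytes as text,", len(wire), "bytes compressed")
```

Because the long tag names repeat, a general-purpose compressor removes most of their cost, which is the argument against hand-shortening tags in the DTD.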
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From nwoh at software-ag.de Fri Apr 16 16:57:20 1999
From: nwoh at software-ag.de (Hutchison, Nigel)
Date: Mon Jun 7 17:11:21 2004
Subject: ASN.1
Message-ID: <005355AD0596D211B4F30000F81B0D324C97D8@daemsg01.software-ag.de>

> -----Original Message-----
> From: Mike Spreitzer [SMTP:spreitze@parc.xerox.com]
> Sent: Friday, April 16, 1999 3:31 PM
> To: 'Hutchison, Nigel'; xml-dev@ic.ac.uk
> Subject: RE: ASN.1
>
> Nigel Hutchison wrote: [[
> I would have thought that the best way of dealing with this issue is to use
> a "pleasant" syntax which was easy to process and implement another layer
> to compress the payload for transmission.
> ]]
>
> Best in some contexts, but not all. The compression layer has runtime
> costs, in both code and memory footprint, and processing time. In some
> contexts (e.g., very resource-constrained items like cell phones and Palm
> Pilots), these costs can be significant.

[Nigel Hutchison] I had some experience recently in trying to devise an RPC XML DTD which was nice and compact. So I did all sorts of tricks with short tag names, repetition conventions, etc. This had the effect of increasing the creation and parsing code significantly. I imagine that there is a considerable footprint and CPU penalty in this optimisation. I also found the prospect of testing and debugging this quite daunting. I then realised that when I send and receive XML documents via my analogue modem they are effectively compressed and decompressed by the hardware - so I was wasting my time (in that scenario at least).
I would have thought that all cellphones would do compression and decompression when they send and receive data - is that not so? I also found out that mainframes have firmware supporting LZ compression these days.

Regards
Nigel Hutchison

> Mike

From pwilson at gorge.net Fri Apr 16 18:49:07 1999
From: pwilson at gorge.net (Peter Wilson)
Date: Mon Jun 7 17:11:21 2004
Subject: ASN.1
References: <93CB64052F94D211BC5D0010A80013310EB401@WWMESS3.172.19.125.2>
Message-ID: <371769FB.495E0E7A@GORGE.NET>

Hi,

I am working on a compiled form of XML (cxml). It parses XML together with DTD/schema information (also in cxml) to produce a file with the equivalent information. The cxml file has the following advantages:

1. Attributes are typed, based on Java basic types (int, String, etc.).
2. Attributes may be extended to simple object types (Point(x,y), Color(red), etc.).
3. No parsing required - very quick to load. The data is written to a DataOutputStream and read using a DataInputStream, both standard Java. The data format is extremely simple, as it does not need to address all the concerns of an ASN.1 type specification.
4. With deference to David Brownell - minimum bit stuffing.
5. Namespace references are resolved.
6. All text is stored in UTF8 and resolved to Unicode.
7. Artifacts of the XML text representation are removed: there is no parsed/unparsed text distinction (text is just text), and character entities are resolved.
8. Less importantly, smaller size: about 30-40%.
9. The DTD/schema cxml contains two types of information: domain and constraints. The domain information is compiled into the cxml. This ensures that sufficient information to display a document or create a DOM model is contained within the document. The constraints form is referenced and contains sufficient information to edit the document contents, including allowed element structure, additional attribute validation, cross-attribute validation, etc.
10. The base interface to the cxml reader is a SAX-like event stream.

I hope to post a reference to this specification and implementation (open source) within the next two weeks.

Planned Extensions:

I have an older version of cxml which encodes XML to define a Swing interface and event handling. Using this user interface definition I have a basic browser to view cxml structures directly. A generic editor would not be too difficult. I hope to make these available also.

This Swing interface requires an extension to XML to handle Swing components (Java beans). The DTD must be dynamic: does this element support this attribute having this type when attribute x="y"? It is not feasible to define a DTD which contains all the attributes for all possible Java components. The compiler and runtime loader must determine the type and set/get methods for the attributes dynamically.

Anyone interested?

Regards
Peter Wilson

From BPosert at filenet.com Fri Apr 16 19:41:16 1999
From: BPosert at filenet.com (Posert, Bob)
Date: Mon Jun 7 17:11:21 2004
Subject: Non-printing character in XML and expat
Message-ID:

When using expat to parse the following xml:

I get an error "test.xml:3:0 reference to an invalid character number"

To double-check, I saved the xml as a UNICODE file and ran xmlwf on it; I got the same error. Is this valid XML?
It seems like the spec allows it, and MS's xml notepad reads it OK. It looks like the problem in expat might be that checkCharRefNumber is called too often.

Many thanks,
Bob

From richard at cogsci.ed.ac.uk Fri Apr 16 19:55:25 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:11:21 2004
Subject: Non-printing character in XML and expat
In-Reply-To: Posert, Bob's message of Fri, 16 Apr 1999 10:40:27 -0700
Message-ID: <199904161754.SAA15638@stevenson.cogsci.ed.ac.uk>

>

24 (control-X) is not a legal XML character (production 2), and character references must expand to legal characters (production 66).

-- Richard

From BPosert at filenet.com Fri Apr 16 20:07:27 1999
From: BPosert at filenet.com (Posert, Bob)
Date: Mon Jun 7 17:11:21 2004
Subject: Non-printing character in XML and expat
Message-ID:

Thank you very much, Richard!

If the following is wrong, could someone let me know:

From the archives, it looks like encode/decode with base64 is the way to go. It also looks like this'll be code in the app above expat; there is no support in the current version of XML or expat for base64 encoding.
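The base64 workaround described above can be sketched in Python, with the encoding done in application code and the parser seeing only ordinary text. This is a minimal sketch under assumptions: the element name `payload`, the `encoding` attribute, and the helper names are hypothetical, and the standard library's ElementTree stands in for whatever sits above expat in the real application.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(data: bytes) -> str:
    """Return a small XML document carrying `data` as base64 text."""
    elem = ET.Element("payload", {"encoding": "base64"})
    elem.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def unwrap_binary(xml_text: str) -> bytes:
    """Parse the document and decode its base64 content back to bytes."""
    elem = ET.fromstring(xml_text)
    # An empty element parses with text set to None, so fall back to "".
    return base64.b64decode(elem.text or "")


raw = b"before\x18after"        # contains control-X (decimal 24)
doc = wrap_binary(raw)
assert "\x18" not in doc        # the control character never reaches the XML
assert unwrap_binary(doc) == raw
```

The cost is the usual one for base64: the encoded text is about a third larger than the original bytes.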
--Bob

-----Original Message-----
From: Richard Tobin [mailto:richard@cogsci.ed.ac.uk]
Sent: Friday, April 16, 1999 10:54 AM
To: Posert, Bob; 'xml-dev@ic.ac.uk'
Subject: Re: Non-printing character in XML and expat

>

24 (control-X) is not a legal XML character (production 2), and character references must expand to legal characters (production 66).

-- Richard

From larsga at ifi.uio.no Fri Apr 16 22:12:50 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:11:21 2004
Subject: Non-printing character in XML and expat
In-Reply-To:
References:
Message-ID:

* Bob Posert
|
| When using expat to parse the following xml:
|
|
|
|
| I get an error "test.xml:3:0 reference to an invalid character number"

The error is correct. Character number 24 is not an allowed character in XML. This is clearly stated by the WFC to production 66 in section 4.1.

See section 2.2 (production 2) in the XML recommendation for a listing of the allowed characters in XML. U+0014 is not among them.

By the way, did you really mean to refer to the DC4 control code character? If so, why? (I'm asking so that we may be able to suggest alternative solutions.)

| MS's xml notepad reads it OK.

Then that's a standard violation in XML Notepad.

--Lars M.
xml-dev: A list for W3C XML Developers.
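The rule cited from production 2 can be checked directly. The Python sketch below transcribes the XML 1.0 Char production into a predicate and then asks expat, through the standard library binding, whether a document containing a given character reference is well-formed. The element name `a` and the helper names are made up, since the XML from the original post did not survive in the archive.

```python
import xml.parsers.expat


def is_xml_char(cp: int) -> bool:
    """XML 1.0 production [2] Char, expressed on code points."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def parses(doc: str) -> bool:
    """Return True if expat accepts `doc` as well-formed."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(doc, True)   # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False


assert not is_xml_char(24)        # control-X
assert is_xml_char(9)             # tab is one of the few allowed controls
assert not parses("<a>&#24;</a>")
assert parses("<a>&#9;</a>")
```

Only tab, line feed, and carriage return are allowed below U+0020, so any other control code has to be carried some other way, for example as base64 text.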
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Fri Apr 16 23:38:13 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:21 2004 Subject: W3C Web servers References: <01BE8500.C1E66A70@grappa.ito.tu-darmstadt.de> Message-ID: <3717AD1D.94893DF9@w3.org> Ronald Bourret wrote: > Just to let everybody know, a number of people have replied privately that > they have had slow or no access since Friday. Presumably, the W3C is > working on it... Working on it furiously. We have had severe filesystem problems all week due to our networked filestore breaking. The content is being served from alternative machines, using local copies; but these are not as fast or load-capable as the normal machines. They are using the same DNS names, so the URLs are the same as usual even if the machines aren't. Normal service will be resumed as soon as possible, and indeed its looking a lot faster now than it was yesterday. -- Chris xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From daniela at cnet.com Fri Apr 16 23:58:55 1999 From: daniela at cnet.com (Daniel Austin) Date: Mon Jun 7 17:11:21 2004 Subject: XML news from Datachannel Message-ID: <77A952A6B467D211855D00805F9521F1149405@cnet10.cnet.com> This might be of interest to XML developers. Enjoy. 
Regards, D- >DataChannel Releases the Most Advanced XML Parser (XJParser) and Introduces >xDev(tm) its XML Developers Program > >Bellevue, WA - April 16, 1999 - DataChannel announced today the immediate >availability of the latest XJParser(tm), a cornerstone of its XML Toolkit >for XMLFramework(tm) and the XML Developers Program, xDev(tm). Formerly >referred to as the DataChannel XJ2 Parser, co-developed with Microsoft, the >new release adds substantial functionality with significant improvements in >performance. DataChannel's XML Developers Program, xDev(tm) is designed to >showcase developers work and real-world implementations and provide the XML >tools and technology needed to realize the true power of XML. The XJParser >can be downloaded at the new >xDev site at http://www.datachannel.com/xdev. > >XJParser becomes the industry's most complete, reliable, and innovative XML >parser for delivering powerful, server-based enterprise applications >utilizing open web-standards. XJParser is the cornerstone of its XML Toolkit >for XMLFramework(tm). DataChannel's XML Framework is an extensive portfolio >of enterprise solutions designed to capitalize on the Information Economy >through the use of the Extensible Mark-up Language (XML). DataChannel's >joint development agreement with Microsoft allows DataChannel to >independently extend the functionality and performance of XJParser to >cross-platform environments making DataChannel the first company to offer >cross-platform parser functionality. > >With over 5,000 beta users, XJParser has been utilized in a diverse set of >scenarios from small department applications to large commercial enterprise >applications. Csoft International, an advanced applications vendor, is >using XML and the XJParser to create standard data formats >for their applications. They provide state-of-the-art software solutions >for the retail industry, >which gather point-of-sale and other transaction profiling information. 
>Data originates from various disparate sources and Csoft needed a cutting >edge solution to access these multiple information stores. > >"Our products must be capable of exchanging information with a wide range of >disparate systems from legacy based to architectures using the latest >emerging technologies," said Terry Bissonette, senior software engineer, >CSoft International. "The XJParser showcases XML as an ideal data exchange >medium to the standards bodies and peer vendors in our industry. We have >used it to build a general framework that manages the simultaneous exchange >of XML data with one or more trading partners. We are looking forward to >implementing the new XJParser with its improved performance and are >confident it will extend our capacity to integrate data from any source." > >Extended functionality of XJ Parser includes: > >* XML 1.0 Standards Compliance with a complete set of W3C interfaces >to ensure interoperability with applications and web-based technologies. >* Simple API for XML (SAX) 1.0 to optimize performance parsing large >documents utilizing an event-based API. Additional support for third-party >SAX drivers is also included. >* Integrated eXtensible Stylesheet Language (XSL) processor providing >seamless, integrated XML parsing and transformation in one package. >* Supports and extends full XML functionality in Microsoft Internet >Explorer 5.0. >* Validating support for DTD's and XML schema to verify true and >error-free XML data. >* Exception and Error Handling to handle Java exceptions as well as >trapping errors for developer productivity. >* Advanced query language support through emerging W3C standards >support like data typing, XSL pattern matching, and node transformation. > >Developers will also enjoy tremendous support with comprehensive >documentation, integrated sample libraries, technical support and host of >other benefits introduced in xDev(tm) DataChannel's new developers program. 
> >XJParser can be downloaded from DataChannel at >http://www.datachannel.com/xdev >Introducing xDev(tm) >DataChannel also announces the XML Developers Program, providing early >adopter developers direct access to leading edge XML resources from cutting >edge XML developers. To provide fellow developers with the best in tools, >techniques and tips, DataChannel has created a developer program >highlighting the latest tools and expertise in the XML arena. DataChannel's >one stop shop for XML knowledge will also include analysis & explanations of >the latest W3C Standards and Proposals, discussion groups on the hottest >topics, and the tools and technologies which simplify XML development. In >addition, DataChannel will offer creative support and co-marketing >arrangements to help developers leverage their ideas towards the advancement >of XML and their professional growth. > >Many of these ideas contributed to the rich functionality of the new >XJParser. These new offerings will be rolled out in xDev and are described >on http://www.datachannel.com/xdev, DataChannel's new xml developer site. >Developers should take the opportunity today to download the XJParser and >give DataChannel input on key elements of xDev and to extend the >standardization and state-of-the-art of XML technology. > >"We continue to push ourselves to remain the innovative leaders in the XML >market space but we cannot do it alone. Our success is dependent upon the >thousands of expert developers that understand this technology and know how >to create real business solutions with it. Although we will lead the way, >the true value of xDev comes from the sharing of ideas and technology >amongst the XML development community. We are committed to creating a world >class forum that can bring together the greatest minds and ideas in the XML >world," said Tim Gelinas, vice president of development who will oversee the >xDev program. 
> >Overview of DataChannel's XML Framework(TM) >DataChannel's XML Framework is an extensive portfolio of enterprise >solutions designed to capitalize on the Information Economy through the use >of the Extensible Mark-up Language (XML). This XML Framework(tm) creates >universal access to mission-critical structured information as well as the >ever-increasing amount of unstructured data. DataChannel's XML Framework >offering consists of: XML Professional Services, Solutions Software, XML >Tools and Technologies, XML Training, and Support Services. DataChannel >develops and markets enterprise information management solutions to unleash >the power of the enterprise information portal. > >About DataChannel, Inc. >DataChannel Inc., founded in 1996, is a privately held company. It is a >value-added integrator of XML-based, market-leading Information Management >solutions and services that facilitate the way companies share information >with employees, vendors, customers, and consumers across intranets, >extranets and the Internet. These solutions are offered through >DataChannel's XML Framework(tm), an extensive portfolio of enterprise >solutions designed to capitalize on the Information Economy through the use >of the Extensible Markup Language (XML). It includes RIO, an Enterprise >Information Portal solution connecting users of the portal to relevant >information sources. > >DataChannel simplifies the process of delivering critical information to the >right people at the right time. > >All trademarks or registered trademarks are property of their respective >holders in the United States and/or other countries. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Sat Apr 17 04:14:30 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:11:21 2004 Subject: recursion in XML parser In-Reply-To: <3716EC11.6AF1966C@pacbell.net>; from David Brownell on Fri, Apr 16, 1999 at 12:51:45AM -0700 References: <93CB64052F94D211BC5D0010A80013310EB400@WWMESS3.172.19.125.2> <19990416140037.B3339@io.mds.rmit.edu.au> <3716EC11.6AF1966C@pacbell.net> Message-ID: <19990417121413.A23112@io.mds.rmit.edu.au> On Fri, Apr 16, 1999 at 12:51:45AM -0700, David Brownell wrote: > Marcelo Cantos wrote: > > > > I can't speak for the JVM, but it is far from safe to generalise and > > state that a function call is as fast as a stack push, particularly > > when the programmer knows exactly what needs to be pushed. > > There's no reason a nonvirtual function call shouldn't compile to be > just the stack operation. > > If you're using a virtual function call, the same reasoning applies > as for a C++/Obj-C/... virtual function call. Namely, that one can't > just say "it's not free"; a comparison needs to include the cost of > an alternative with the same functionality. And curiously enough, I heartily agree. In fact, that was my whole point. Don't just assume that method X is better than method Y in any and all circumstances (which is what the original statement effectively said). > when you do those comparisons, the functionality seems to be > relatively cheaper when packaged as a "virtual function call" than > when packaged as an if/then/else/... set of data operations, or > other alternatives. > > This discussion seems pretty odd to me. 
Exactly what alternative is
> being advocated?

None. I wasn't even advocating that XML parsers be implemented non-recursively (too much hard work, frankly). I was merely pointing out that it is dangerous to generalise (I don't see why such a warning would be perceived as odd). We have often encountered situations where manual recursion came out significantly faster than anything the compiler could produce under any optimisation level.

Maybe in Java function calls are intrinsically as fast as (or faster than) manual recursion under all conceivable scenarios. I am not a Java expert, so I can't say. I would be surprised to find that this was so, but who knows.

> Remember that per-element state _must_ be maintained
> when parsing XML, and the model is a stack. Whether that stack gets
> maintained using the CPU stack or explicit emulation in some other
> memory data structure, it'll be there. Function calls use the CPU
> stack, and clean up very efficiently. Explicit emulation uses a
> different memory segment; and needs more work to GC correctly.

I can see how a GC environment would tilt the scales somewhat (I assume that the GC system knows not to look past the stack pointer, whereas an array implementation would need to null unneeded values). However, I explicitly stated that I was not talking specifically about the JVM, hence it is a little premature to "remind" me of the cost of manual GC management.

> > Moreover, modern architectures often penalise you heavily for deep
> > recursion. For instance, the SPARC architecture uses register
> > windowing. ...
>
> Which can be bypassed by modern compilers for those applications
> where it matters. For example, graphics algorithms tend to need
> lots of registers (e.g. VIS code) and device drivers need to have
> predictable latencies (that is, they can't afford to flush windows
> in a time-critical interrupt handler).

I don't see the relevance of this.
Graphics algorithms shouldn't use register windowing because they use lots of registers; this has nothing to do with recursion. Device drivers shouldn't use register windowing because they need real-time performance; hence register windowing simply isn't an option. There will still be cases, however, where register windowing provides a significant amortised performance gain, but only if the code is refactored to remove recursive calls. In fact, parsing XML may be just such a case. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dent at oofile.com.au Sat Apr 17 15:04:14 1999 From: dent at oofile.com.au (Andy Dent) Date: Mon Jun 7 17:11:22 2004 Subject: XML Parser in Unix C++ (perhaps DEC UNIX?) In-Reply-To: <8E7905420FB9D211916A00609773FB0E01616AB2@exchange1.echostar.com> Message-ID: >We're looking for a Unix C++ parser to use on a DEC UNIX project... >Note: Not looking for a C XML parser (xpat, grove annex, rxp), but rather a >C++ XML parser. I have a simple c++ wrapper around expat I've called expatpp which now has a page on our web site and a zip archive of the source (as well as the CodeWarrior projects I put up there for the Mac a week or so ago). http://www.highway1.com.au/adsoftware/expatpp.htm Andy Dent BSc MACS AACM, Software Designer, A.D. Software, Western Australia OOFILE - Database, Reports, Graphs, GUI for c++ on Mac, Unix & Windows PP2MFC - PowerPlant->MFC portability http://www.oofile.com.au/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stark at uplanet.com Sat Apr 17 20:48:19 1999 From: stark at uplanet.com (Peter Stark) Date: Mon Jun 7 17:11:22 2004 Subject: PITarget uniqueness In-Reply-To: <370E55C2.A38B4889@locke.ccil.org> Message-ID: <000601be8902$c270b9e0$e4c3c6c3@uplanet.com> > -----Original Message----- > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of > John Cowan > Sent: Friday, April 09, 1999 12:32 PM > To: XML Dev > Subject: Re: PITarget uniqueness > > > James Tauber wrote: > > PI targets can be associated with a URI using the NOTATION > mechanism. This > > is so much like the namespace mechanism that I'd really like it > if the two > > were merged and the PI target made arbitrary. > > I have a new SAX parser filter in the planning stages called PIEngine. > It will take three actions w.r.t. PIs, selectable by mode switches: > > notation resolution: replace any PI target with the URI > declared in the corresponding notation declaration, provided > there is one. > > character references: convert numeric character references > in PI data to the corresponding characters > > pseudo-elements: decode PIs as if they contained attributes > (a bare name is interpreted as name="name", as in SGML) > and pass them as empty elements with "?" prefixed to the name. Interesting. Shouldn't the pseudo-element have an end-tag, so that both the start and the end of the data to which the PI applies, can be marked? For example, in the eXtensible Protocol, the 'xp'-pseudo-element has : ... I believe this can be useful for other PI elements as well. Peter xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Mon Apr 19 05:23:03 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:11:22 2004 Subject: ASN.1 References: <93CB64052F94D211BC5D0010A80013310EB401@WWMESS3.172.19.125.2> Message-ID: <371AA2E4.EE4BB7B0@lig.net> Kay Michael wrote: > > At the risk of touching off a religious war, I'd not suggest anyone > > inflict ASN.1 on themselves, ever! > > As one who has used ASN.1 in the past and uses XML now, I agree absolutely. > Though I do regret that the designers of XML didn't include some of the > nicer features of ASN.1, specifically data typing. > > There are actually two problems with the "bitstuffing" style of encoding: > firstly it's terribly hard to write the code to generate it and parse it; > secondly you only have to get one bit wrong and the guy at the other end > can't extract anything from your message. We spent years getting X.400 mail > systems to interoperate as a result. I put in my time on X.400/X.500/ISO stacks many years ago... We are SOOO lucky that SMTP/Mime/Internet mail won out. Not to mention TCP/IP vs. ISO stacks... sdw > Mike Kay > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From branjan at wipinfo.soft.net Mon Apr 19 05:30:33 1999 From: branjan at wipinfo.soft.net (Balaji Ranjan) Date: Mon Jun 7 17:11:22 2004 Subject: xml mail Message-ID: hi all, how abt. exchanging mail in xml format on this mailing list.i feel that we can understand xml much better . regards Balaji Ranjan xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sdw at lig.net Mon Apr 19 05:44:32 1999 From: sdw at lig.net (Stephen D. Williams) Date: Mon Jun 7 17:11:22 2004 Subject: Compiled XML/Binary XML/Structured XML, Was: Re: ASN.1 References: <93CB64052F94D211BC5D0010A80013310EB401@WWMESS3.172.19.125.2> <371769FB.495E0E7A@GORGE.NET> Message-ID: <371AA7EE.C3362F9F@lig.net> A couple weeks ago a number of us were talking about 'binary XML' or 'bXML' after I brought the topic up. The name is in flux but I'm working on code as time permits before I blab anymore... You have arrived at many of the same conclusions. I'll point out that I believe that cXML is used for Commerce XML or something like that, although I like the connotations of 'compiled'. When we have something working and battle out the details we can choose ;-). Please note that we're not directly in agreement yet, but the basic goals of optimization appear to be compatible. 
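As a rough illustration of the kind of pre-parsed, typed encoding under discussion, the sketch below writes a SAX-like event stream as length-prefixed UTF-8 strings, with a one-byte type code in front of each attribute value. It is not the cxml or bXML format, neither of which had been published at this point. The event codes, the `encode`/`decode` names, and the byte layout are all assumptions made for the example; the length-prefixed strings are only loosely modelled on what Java's DataOutputStream.writeUTF produces.

```python
import io
import struct

# Hypothetical event codes for this sketch only.
START, TEXT, END = 1, 2, 3


def write_utf(out, s):
    """Write a string as a 2-byte big-endian length followed by UTF-8 bytes."""
    data = s.encode("utf-8")
    out.write(struct.pack(">H", len(data)))
    out.write(data)


def read_utf(inp):
    """Read a string written by write_utf."""
    (length,) = struct.unpack(">H", inp.read(2))
    return inp.read(length).decode("utf-8")


def encode(events):
    """Serialise (START, name, attrs) / (TEXT, data) / (END,) tuples to bytes."""
    out = io.BytesIO()
    for event in events:
        kind = event[0]
        out.write(struct.pack(">B", kind))
        if kind == START:
            _, name, attrs = event
            write_utf(out, name)
            out.write(struct.pack(">H", len(attrs)))
            for key, value in attrs.items():
                write_utf(out, key)
                if isinstance(value, int):
                    # Typed attribute: stored as a 4-byte signed integer.
                    out.write(b"i" + struct.pack(">i", value))
                else:
                    out.write(b"s")
                    write_utf(out, value)
        elif kind == TEXT:
            write_utf(out, event[1])
        # END events carry no payload.
    return out.getvalue()


def decode(data):
    """Rebuild the event list; no XML parsing is needed on this side."""
    inp = io.BytesIO(data)
    events = []
    while True:
        tag = inp.read(1)
        if not tag:
            break
        kind = tag[0]
        if kind == START:
            name = read_utf(inp)
            (count,) = struct.unpack(">H", inp.read(2))
            attrs = {}
            for _ in range(count):
                key = read_utf(inp)
                if inp.read(1) == b"i":
                    (attrs[key],) = struct.unpack(">i", inp.read(4))
                else:
                    attrs[key] = read_utf(inp)
            events.append((START, name, attrs))
        elif kind == TEXT:
            events.append((TEXT, read_utf(inp)))
        else:
            events.append((END,))
    return events


sample = [(START, "Point", {"x": 3, "label": "origin"}), (TEXT, "hi"), (END,)]
assert decode(encode(sample)) == sample
```

The reader gets integers back as integers without re-parsing text, which is the main attraction of such a format; the cost is that the bytes are no longer readable or editable as plain XML.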
sdw

Peter Wilson wrote:
> Hi,
> I am working on a compiled form of xml (cxml).
> This parses xml together with DTD/schema information (also in cxml) to
> produce a file with the equivalent information. The cxml file has the
> following advantages:
>
> 1. Attributes are typed - based on Java basic types int, String etc.
> 2. Attributes may be extended to simple object types (Point(x,y),
>    Color(red) etc.).
> 3. No parsing required - very quick to load.
>    The data is written to a DataOutputStream using standard java.
>    The data is read using a DataInputStream using standard java.
>    The data format is extremely simple as it does not need to address
>    all the concerns of an ASN.1 type specification.
> 4. With deference to David Brownell - minimum bit stuffing.
> 5. Name Space references are resolved.
> 6. All text is stored in UTF8 and resolved to Unicode.
> 7. Artifacts of xml text representation are removed:
>    Parsed/unparsed text distinction. Text is just text.
>    Character entities are resolved.
> 8. Less importantly, smaller size, about 30-40%.
> 9. The DTD/schema cxml contains two types of information: domain
>    and constraints. The domain information is compiled into the cxml. This
>    ensures that sufficient information to display a document or create a
>    DOM model is contained within the document. The constraints form is
>    referenced and contains sufficient information to edit the document
>    contents, including allowed element structure and additional attribute
>    validation, cross attribute validation etc.
> 10. The base interface to the cxml reader is a SAX like event stream.
>
> I hope to post a reference to this specification and implementation
> (open source) within the next two weeks.
>
> Planned Extensions:
> I have an older version of cxml which encodes XML to define a Swing
> interface and event handling.
> Using this user interface definition I have a basic browser to view cxml
> structures directly.
> A generic editor would not be too difficult.
I hope to make these > available also. > > This Swing interface requires an extension to XML to handle Swing > components (Java beans). The DTD must be dynamic. Does this element > support this attribute having this type when attribute x="y"? It is not > feasible to define a DTD which contains all the attributes for all > possible java components. The compiler and runtime loader must determine > > the type and set/get methods for the attributes dynamically. > > Anyone interested? > > Regards Peter Wilson > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From kent at trl.ibm.co.jp Mon Apr 19 06:03:37 1999 From: kent at trl.ibm.co.jp (TAMURA Kent) Date: Mon Jun 7 17:11:22 2004 Subject: Pre-release of IBM XML Security Suite Message-ID: <199904190402.NAA25520@ns.trl.ibm.com> IBM XML Security Suite is available. http://204.146.176.227/aw.nsf/html/XML_Security_Suite (This URL is temporary.) The package contains the following: - Reference implementations of DOMHash [1], both for 100%-pure-DOM API and SAX API, and some utility classes for DOM. - Sample digital signature implementation based on Richard Brown's Internet Draft [2]. - Parser Test Tool for testing your DOM or SAX implementation. 
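To give a feel for what a digest over a DOM tree involves, here is a minimal sketch in Python. It is not the DOMHash algorithm from the Internet Draft cited as [1] in this message, whose exact byte layout is not reproduced here. The `node_digest` function, the single-letter type markers, and the choice of SHA-1 are assumptions for illustration. What it shows is that hashing the parsed tree, with attributes sorted by name, makes the digest independent of surface details such as attribute order and quoting style.

```python
import hashlib
from xml.dom import minidom, Node


def node_digest(node):
    """Return a SHA-1 digest that depends only on the node's logical content.

    Illustrative only: the byte layout here is an assumption, not DOMHash.
    """
    h = hashlib.sha1()
    if node.nodeType == Node.TEXT_NODE:
        h.update(b"T")
        h.update(node.data.encode("utf-16-be"))
    elif node.nodeType == Node.ELEMENT_NODE:
        h.update(b"E")
        h.update(node.tagName.encode("utf-16-be"))
        # Attribute order is not significant in XML, so sort by name first.
        for name in sorted(node.attributes.keys()):
            h.update(b"A")
            h.update(name.encode("utf-16-be"))
            h.update(b"\x00\x00")
            h.update(node.getAttribute(name).encode("utf-16-be"))
        # Child order is significant, so children are hashed in document order.
        for child in node.childNodes:
            if child.nodeType in (Node.TEXT_NODE, Node.ELEMENT_NODE):
                h.update(node_digest(child))
    return h.digest()


first = minidom.parseString('<a x="1" y="2">hi</a>').documentElement
second = minidom.parseString("<a y='2' x='1'>hi</a>").documentElement
print(node_digest(first) == node_digest(second))
```

Two serialisations of the same tree compare equal here even though their bytes differ, which is the property a signature over XML needs.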
The package requires JDK 1.1 and Swing 1.1 and an XML parser. The sample dsig implementation requires JDK 1.2.

Please take a look and give us comments if any.

[1] http://www.ietf.org/internet-drafts/draft-hiroshi-dom-hash-01.txt
[2] http://www.ietf.org/internet-drafts/draft-brown-xml-dsig-00.txt

--
TAMURA, Kent @ Tokyo Research Laboratory, IBM Japan

From nevansmu at fdgroup.co.uk Mon Apr 19 11:02:09 1999
From: nevansmu at fdgroup.co.uk (Neil Evans-Mudie)
Date: Mon Jun 7 17:11:22 2004
Subject: help populating MS DOM using visual basic 6
Message-ID: <01BE8A4B.D11754A0.nevansmu@fdgroup.co.uk>

subject: help populating MS DOM using visual basic 6

Hi,

I'm new to the XML development scene and have been trying to get up to speed. I develop using VB6 and wish to populate MSXML's DOM using an ADO2 recordset (rs). I can then just read the DOMDocument.xml property to create my tagged-up XML string - perfect. Unfortunately I can't work out how to populate the DOM tree. The code that follows creates the rs and then writes my vision of what the XML should look like (which also follows) to a textbox control. [MDB file supplied as attachment, place in same folder as vb form!] Can anybody please advise me how to fill the DOM tree?

Many thanks in advance for any suggestions given. Thanks also for the discussions on the list - they're excellent for newbies, like me, learning about new technologies like XML (I know XML's not new, but its mass implementation seems to be just becoming fashionable).

Best wishes
Neil (signature below).
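The tree described in this message can be built with the standard W3C DOM calls createElement, createTextNode, appendChild and setAttribute, which MSXML's DOMDocument also exposes. A minimal sketch follows, written in Python's xml.dom.minidom rather than VB6 so that it runs as-is. The `rows_to_dom` helper and the hard-coded rows standing in for the ADO recordset are assumptions, as is the choice to store the record count as a `Records` attribute; the element names are taken from the "XML Vision" tree in this message.

```python
from xml.dom import minidom


def rows_to_dom(rows):
    """Build a <Schools> document from (id, name) tuples.

    `rows` stands in for the recordset; in VB6 the loop would walk rs
    until rs.EOF instead.
    """
    doc = minidom.Document()
    root = doc.createElement("Schools")
    root.setAttribute("Records", str(len(rows)))
    doc.appendChild(root)

    for school_id, name in rows:
        school = doc.createElement("School")
        root.appendChild(school)
        # One child element per field, with the value as a text node.
        for tag, value in (("ID", school_id), ("Name", name)):
            field = doc.createElement(tag)
            field.appendChild(doc.createTextNode(str(value)))
            school.appendChild(field)
    return doc


rows = [(1, "Ballerkermeen High School"), (2, "St. Ninians High School")]
print(rows_to_dom(rows).documentElement.toprettyxml(indent="  "))
```

The pattern is the same in any DOM binding: the document object creates every node, and each node is then attached to its parent with appendChild. In VB6 the created objects would be assigned with `Set`, and the serialised result read from the document's `xml` property.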
XML Vision
~~~~~~~~
Schools: Records = 2
|
+--- School
|    |
|    +--- ID: 1
|    |
|    +--- Name: Ballerkermeen High School
|
+--- School
     |
     +--- ID: 2
     |
     +--- Name: St. Ninians High School

Code
====
Private Sub Form_Load()
    Dim cnn As ADODB.Connection, rs As ADODB.Recordset
    Dim dmdoc As New msxml.DOMDocument
    Dim nIndex As Integer, nTabNum As Integer
    Dim sSQL As String, sChara As String, s As String

    On Error GoTo Err_Hand_Control

    Set cnn = New ADODB.Connection: Set rs = New ADODB.Recordset
    cnn.Open "DRIVER={Microsoft Access Driver (*.mdb)}; DBQ=" & _
        App.Path & "\Help.mdb"
    sSQL = "Select [ID], [Name] from [Schools] order by [Name]"

    '# open the recordset..
    rs.Open sSQL, cnn, adOpenStatic, adLockReadOnly, adCmdText

    '# insert code to populate doc. object model then send to textbox in UI
    '# txtXML.Text = dmdoc.xml  '# set textbox to xml of DOM.XML contents

    '# this code snippet produces txt description of rs as a tree
    s = "Schools: Records = " & rs.RecordCount & vbCrLf
    nTabNum = 1
    Do While Not rs.EOF  '# whilst recs to process..
        rs.MoveNext
        sChara = IIf(rs.EOF, " ", "|")
        rs.MovePrevious
        s = s & String(nTabNum, vbTab) & "|" & vbCrLf
        s = s & String(nTabNum, vbTab) & "+--- School " & vbCrLf
        For nIndex = 0 To rs.Fields.Count - 1
            s = s & String(nTabNum, vbTab) & sChara & vbTab & "|" & vbCrLf
            s = s & String(nTabNum, vbTab) & sChara & vbTab & "+--- " & _
                rs.Fields(nIndex).Name & ": " & rs.Fields(nIndex).Value & vbTab & vbCrLf
        Next nIndex
        rs.MoveNext  '# next record
    Loop
    Text1.Font = "Courier"  '# set non-proportional font
    Text1.Text = s  '# update UI with fake DOM tree

    '# Close ADO OBJs..
    rs.Close: Set rs = Nothing
    cnn.Close: Set cnn = Nothing
    DoEvents  '# yield execution
    Exit Sub  '# avoid error handler

Err_Hand_Control:
    With Err
        MsgBox .Number & ": " & .Description, vbOKOnly, "Unexpected error.. closing app. "
    End With
    End  '# close app
End Sub
=======

Neil Evans-Mudie
Developer, fretwell-downing group Ltd.
(nevansmu@fdgroup.co.uk; http://www.fdgroup.co.uk/)
[Snail-Mail: fretwell-downing group Ltd, Brincliffe House, 861 Ecclesall Road, Sheffield, S11 7AE, England. Tel: +44 (0)114 281 6000; Fax: +44 (0)114 281 6001]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/octet-stream
Size: 65536 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990419/475e4fe6/attachment.obj

From Matthew.Sergeant at eml.ericsson.se Mon Apr 19 13:47:58 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:11:22 2004
Subject: Apache XML Mime reader
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A17C7@eukbant101.ericsson.se>

After discussions earlier on this list about how it would be useful to have an XML Mime sniffer for Apache, I went ahead and coded one. Unfortunately it uses mod_perl, so it's not going to be the best solution for everyone, but it works for me :-)

It allows you to return any mime type you like, you can even configure that on a per-directory basis, and it defaults to application/xml. It correctly detects utf-16. I've tested it on a number of different encoded XML files, although obviously not every XML file going. There are bound to be bugs, but this is a first stab at getting something that works.

Anyway, it should be on CPAN rsn; for the impatient, you can get it at
http://src.doc.ic.ac.uk/computing/programming/languages/perl/CPAN/authors/id/M/MS/MSERGEANT/Apache-MimeXML-0.01.tar.gz

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

From lucio.piccoli at one2one.co.uk Mon Apr 19 14:22:19 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun 7 17:11:22 2004
Subject: XML.com
Message-ID: <36029822.190399@smtpgate1.ONE2ONE.CO.UK>

What happened to xml.com?

adios
-lucio

---------------------------------------------------------------------
One2One            LUCIO.PICCOLI@one2one.co.uk
Elstree Tower      tel : +44 181 214 3847
Elstree Way
Borehamwood        fax :+44 181 214 2325
LONDON WD6 1DT
__________ http://www.one2one.co.uk _____________

From ClarkB at scmb.co.za Mon Apr 19 14:34:54 1999
From: ClarkB at scmb.co.za (Clark, Bruce B)
Date: Mon Jun 7 17:11:22 2004
Subject: pre newbie question
Message-ID:

Hi Group,

It is humiliating to ask such a basic question here, but everything that I have read presumes more knowledge than I have got. I am learning XML from a book and trying to get the dreaded "hello world" into my browser. The book says that I must make a XML document (done that), make a style sheet (done that) and then convert it to HTML. This is where I am stuck. Why must I do this ? The book also says that I must ftp a file down called MSXSL.EXE and it will do the required conversion.
I have looked for the file at the recommended site and it does not exist. I have done a search and can't find it either.

I obviously have zero knowledge about XML. If anyone can point me in the direction of a resource, I would be most grateful. And does anybody know where I can get the MSXSL.EXE from, or is there another alternative ?

Thanks,
Bruce.

From jabuss at cessna.textron.com Mon Apr 19 15:10:56 1999
From: jabuss at cessna.textron.com (Buss, Jason A)
Date: Mon Jun 7 17:11:22 2004
Subject: pre newbie question
Message-ID:

Hi Bruce.

As far as I know, the release of MSIE5 should have this file already. The reason that you are asked to transform the XML to HTML is because the only way (that I know of) to display XML natively in IE5 is using CSS, and there have been some issues regarding the support of CSS in IE5. They have put a great deal of effort into using XSL with IE5 (for some reason). So you will probably be safer using XSL to transform XML to HTML as opposed to using XML with CSS.

It's just one of those things. The browsers are in the early stages of adopting XML so there will be some working around and what have you until some of the working drafts really become recommendations, and the browser technology catches up with the standards.

Look for Mozilla in about 6 months. There are a great number of anticipated XML features promised for the next release.

good luck,

Jason A. Buss
Single Engine Technical Publications
Cessna Aircraft Co.
jabuss@cessna.textron.com > Clark, Bruce B > Hi Group, > > It is humiliating to ask such a basic here, but everything that I have > read > presumes more knowledge than I have got. I am learning XML from a book > and > trying to get the dreaded "hello world" into my browser. The book says > that > I must make a XML document (done that), make a style sheet (done that) and > then convert it to HTML. This is where I am stuck. Why must I do this ? > The > book also says that I must ftp a file down called MSXSL.EXE and it will do > the required conversion. I have looked for the file at the recommended > site > and it does not exist. I have done a search and can't find it either. > > I obviously have zero knowledge about XML. If anyone can point me in the > direction of a resource, I would be most grateful. And does anybody know > where I can get the MSXSL.EXE from, or is there another alternative ? > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Mon Apr 19 15:48:17 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:11:23 2004 Subject: MSXSL in XML: Extensible Markup Language (was Re: pre newbie question) In-Reply-To: Message-ID: At 2:33 PM +0200 4/19/99, Clark, Bruce B wrote: >Hi Group, > >It is humiliating to ask such a basic here, but everything that I have read >presumes more knowledge than I have got. I am learning XML from a book and >trying to get the dreaded "hello world" into my browser. The book says that >I must make a XML document (done that), make a style sheet (done that) and >then convert it to HTML. This is where I am stuck. 
Why must I do this ? The >book also says that I must ftp a file down called MSXSL.EXE and it will do >the required conversion. I have looked for the file at the recommended site >and it does not exist. I have done a search and can't find it either. > >I obviously have zero knowledge about XML. If anyone can point me in the >direction of a resource, I would be most grateful. And does anybody know >where I can get the MSXSL.EXE from, or is there another alternative ? > > I'll take a wild guess that that's my book. I've been puzzling over what exactly to do about Microsoft's dropping of MSXSL for some time, and I'm afraid I don't have a perfect answer for anyone. Such is life on the bleeding edge that is still XSL. The bottom line is this: the draft XSL specification has changed twice since XML: Extensible Markup Language was first published, once radically, once not quite as radically. In any case, the version of XSL described in my book and partially implemented by MSXSL is no longer recommended. Furthermore, Microsoft has pulled MSXSL from their Web site, and it is no longer available. I tried sending the file to a few people, but none of them seemed able to successfully unzip and install it on their systems. Something in between the last two drafts of XSL is now supported by Internet Explorer 5.0. I'm working with this now, and recommend you do the same. There are also a number of other tools for working with XSL including LotusXSL http://www.alphaWorks.ibm.com/formula/LotusXSL xslj ftp://ftp.cogsci.ed.ac.uk/pub/XSLJ/ Koala XSL Engine http://www.inria.fr/koala/XML/xslProcessor FOP http://www.jtauber.com/fop/ None of these do exactly what MSXSL did, but some come close. I've had relatively good luck with LotusXSL in particular. All four vary in exactly which parts of which working drafts they support. I have not yet worked with the other three heavily myself, so I don't have any strong recommendations. 
I'd be curious to hear your opinions of them if you try them out. However, you must keep in mind that XSL is still a bleeding edge technology. Anything you do now will be invalidated in six months (if not sooner) so you need to be ready for this. None of the available tools fully support the current working draft, and even that draft is almost certain to change. XSL is slowly beginning to gel, but it will not be complete until Summer 1999 at the earliest (probably later). Obviously an update of the book is called for. I am currently working on a much revised and expanded version which will be published by IDG this summer as The XML Bible. I have repeatedly asked IDG to allow me to post chapters from that book on my Web site with updated information as I write them. So far they have been unresponsive. I have begun posting examples from that book online at http://metalab.unc.edu/xml/books/bible/examples/ The examples from Chapter 5 should be particularly helpful with regard to basic XSL. I wish I had a better answer for you, but as yet I do not. If one arises, I will let you know. +-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers. 
From eamon at lendac.ie Mon Apr 19 16:05:20 1999 From: eamon at lendac.ie (Eamon Hayes) Date: Mon Jun 7 17:11:23 2004 Subject: pre newbie question Message-ID: <01BED2BE.3B707EC0@EAMON> Bruce, I am a bit of a newbie myself but I have been lurking. You cannot find the MSXSL application because it was an early Microsoft implementation demonstrating the use of XSL. Some time afterwards the XSL draft changed and the application was removed from the Microsoft site. As XSL is still evolving, your book probably deals with an earlier implementation. Regards Éamon -----Original Message----- From: Clark, Bruce B [SMTP:ClarkB@scmb.co.za] Sent: 19 April 1999 13:33 To: 'xml-dev@ic.ac.uk' Subject: pre newbie question xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From north at Synopsys.COM Mon Apr 19 16:21:54 1999 From: north at Synopsys.COM (Simon North) Date: Mon Jun 7 17:11:23 2004 Subject: ANNOUNCE: New XML Book Message-ID: <199904191404.QAA08379@goofy.gr05.synopsys.com> Esteemed colleagues, It is with some pride and no little pleasure that I am pleased to announce that Macmillan has started to ship my latest book: Teach Yourself XML in 21 Days Sams.Net (a Macmillan company) Author: Simon North & Paul Hermans ISBN: 1-57521-396-6 This is meant to be an entry-level book accessible to readers who are vaguely familiar with HTML. Contains lots of practical examples of how to use the available tools and technologies (Perl, Omnimark, DOM, MSDSO, DSSSL, CSS and XSL) to display XML in Internet Explorer 5 and Netscape 5 (Gecko, NLG). Most of the online bookstores have the wrong details (wrong author, wrong availability date), but there is a long story behind that (as some other list correspondents can confirm). Simon "Presenting XML", "Teach Yourself XML in 21 Days". -----BEGIN GEEK CODE BLOCK----- GIT/GL/GCS/GTW d? s+:+ a+ C++ US++++ P++ L+ E- W+++ N+++ O K++ w++++ !O M V PS+ PE Y+ PGP++ t+@ 5-- X- R- tv b+++ DI+ D G e++ h---- r+++ z++++ ------END GEEK CODE BLOCK------ xml-dev: A list for W3C XML Developers. 
From eriblair at mediom.qc.ca Mon Apr 19 17:47:40 1999 From: eriblair at mediom.qc.ca (Éric Riblair) Date: Mon Jun 7 17:11:23 2004 Subject: How to handle a selectSingleNode method error ... Message-ID: <01b201be8a7c$2c1c69f0$1f9ccb84@grr.ulaval.ca> Hi everybody, I use XSL server-side (within ASP) to process some XML files, and I use the selectSingleNode method to find a specific pattern and display all the information about an item. My problem is that when the node for one part of the information is null, I get an error saying that ... is not an object. What I want to do is load the XSL file correctly, display the other parts of the information, and replace the error in the same HTML with a standard message like "this part is under construction". Thanks in advance for any answer. Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990419/a0e49e69/attachment.htm From bd at internet-etc.com Mon Apr 19 18:11:09 1999 From: bd at internet-etc.com (Brandt Dainow) Date: Mon Jun 7 17:11:23 2004 Subject: pre newbie question In-Reply-To: Message-ID: <001101be8a79$7c185d40$0a90bc3e@p300> MSXML is the XML parser/processor from Microsoft. You can run it in IE4, but it's built into IE5. You then have to pass the XML to it. To get visible output you need to link it to HTML, hence the stylesheet, since XML has no inbuilt appearance properties (that's the point of XML, right?).
If you want to get started at a simple level & have IE5, I've built a simple on-line parser at our website, which is good for a starting play with XML (http://www.internet-etc.com/wellformed.htm). It uses JavaScript, and may give you some idea on linking XML and Javascript. However, code samples are easier to find at http://msdn.microsoft.com/xml/ Brandt Dainow bd@internet-etc.com Internet Etc Ltd http://www.internet-etc.com >-----Original Message----- >From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On >Behalf Of >Clark, Bruce B >Sent: Monday, April 19, 1999 1:33 PM >To: 'xml-dev@ic.ac.uk' >Subject: pre newbie question > > >Hi Group, > >It is humiliating to ask such a basic here, but everything >that I have read >presumes more knowledge than I have got. I am learning XML >from a book and >trying to get the dreaded "hello world" into my browser. The >book says that >I must make a XML document (done that), make a style sheet >(done that) and >then convert it to HTML. This is where I am stuck. Why must I >do this ? The >book also says that I must ftp a file down called MSXSL.EXE >and it will do >the required conversion. I have looked for the file at the >recommended site >and it does not exist. I have done a search and can't find it either. > >I obviously have zero knowledge about XML. If anyone can point >me in the >direction of a resource, I would be most grateful. And does >anybody know >where I can get the MSXSL.EXE from, or is there another alternative ? > > >Thanks, > >Bruce. > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
From milowski at dnai.com Mon Apr 19 18:53:01 1999 From: milowski at dnai.com (Alex Milowski) Date: Mon Jun 7 17:11:23 2004 Subject: XML Parser in Unix C++ (perhaps DEC UNIX?) In-Reply-To: <8E7905420FB9D211916A00609773FB0E01616AB2@exchange1.echostar.com> Message-ID: On Tue, 13 Apr 1999, Casey, Mark wrote: > Here's my C++ XML Parser list so far: > > 1. SP (James Clark) > 2. AntLr (SGML parser that needs a C++ def to output to C++ instead of its > usual Java) Antlr is not an SGML parser. It is an LL(k) parser generator toolkit that you could use to build an XML parser. It is a very nice tool, all thanks to Terence Parr et al. R. Alexander Milowski milowski@dnai.com Remember: Stressed spelled backwards is desserts. From roddey at us.ibm.com Mon Apr 19 20:55:47 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:11:23 2004 Subject: XML Performance question Message-ID: <87256758.0067D35F.00@d53mta03h.boulder.ibm.com> >Following the XML for the last couple of months, I am surprised how little >attention is paid to performance. My optimistic personality leads me >to the conclusion that performance is not an issue.
> Sorry for the late response, but I've been on vacation (at least I think that's what that was...) Actually, IMHO, performance is critical, and its really hard to make a fully conforming parser that is flexible, maintainable, and fast. Partly its because some of the kind of arcane XML rules were not written with performance in mind particularly. I think that performance matters as much on the small end of the spectrum as the large as well. Yes, you want to be able to do a 32MB file in a few seconds. But you also have to consider the e-bidness stuff, in which you might have to create a parser (since its part of a stateless server thingy and can't just hang around), parse a small transaction file, and clean up that parser (and whatever under/overlying infrastructure it required) very fast. If you see a parser that is way, way out ahead of the field on performance on a wide array of source input, you should probably be suspicious that it is either not really fully conforming to the spec or is extremely inflexible for future expansion (or both.) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at w3.org Mon Apr 19 22:24:53 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:23 2004 Subject: SUMMARY: XML Validation Issues (was: several threads) References: Message-ID: <371B8152.1436AC6F@w3.org> Jelks Cabaniss wrote: > But how to do it? If XML 1.1 has a "valid='yes'|'no'" in the declaration, XML > 1.1 documents may break when running under an XML 1.0 parser, since the XML 1.0 > BNF clearly states what can and can't be in the declaration. 
But they would notice the version="1.1" as well, and either stop or ignore stuff they didn't understand... > Maybe a PI could be formalized similar to the way the stylesheet linking is > being done: > > > > (could also be "minimal" or "full" for the well-formed only options you > mentioned). I think the chance of doing this extension via a PI is minimal. -- Chris From chris at w3.org Mon Apr 19 22:24:59 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:23 2004 Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376) References: <001b01be8536$66df9710$0100007f@eps.inso.com> Message-ID: <371B84B5.C39EA23C@w3.org> Gavin Thomas Nicol wrote: > > But if you are transcoding, you have to fix it anyway - so? > > Right, but > > a) You have to fix it by parsing a piece of arbitrary syntax, which > proxies etc. will most likely not do, for performance reasons. Now in a different message you were saying that caching the results of parsing the encoding declaration was not worth it because the effort required to re-parse it each time was minimal. So I don't see how you can now have it be a performance hit. > b) The XML declaration is part of the *document* as specified by > the XML 1.0 recommendation, changing the XML declaration changes > the *document*, which is a Bad Thing(tm). Well in theory yes, but in practice the advantages seem to me to outweigh the disadvantages.
If someone cares enough about an XML document that they think a changed encoding declaration has destroyed its value (eg, a digitally signed transaction encoded in XML) then they don't want any dumb - or even smart - proxies merrily changing from UTF-8 to 8859-2 or whatever either. -- Chris xml-dev: A list for W3C XML Developers.
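The transcoding step argued over in this thread can be sketched as follows. This is a minimal Python illustration, not code from either poster: a proxy-like function decodes the document, rewrites (or adds) the encoding pseudo-attribute in the XML declaration, and re-encodes. The name transcode_xml and the regular expression are assumptions made for the example, and the caller is assumed to already know the source encoding.

```python
import re

# Matches a leading XML declaration up to and including an optional
# encoding pseudo-attribute. Hypothetical helper pattern for this sketch.
_XML_DECL = re.compile(
    r'''^(<\?xml\s+version\s*=\s*["'][^"']*["'])'''
    r'''(\s+encoding\s*=\s*["'][^"']*["'])?'''
)


def transcode_xml(data, source, target):
    """Re-encode XML bytes and keep the encoding declaration truthful."""
    # Drop a byte order mark if one survived decoding.
    text = data.decode(source).lstrip("\ufeff")
    # Replace the old encoding pseudo-attribute, or add one after version.
    text, count = _XML_DECL.subn(
        lambda m: f'{m.group(1)} encoding="{target}"', text, count=1)
    if count == 0:
        # No XML declaration at all: prepend one naming the new encoding.
        text = f'<?xml version="1.0" encoding="{target}"?>\n{text}'
    return text.encode(target)


original = '<?xml version="1.0" encoding="UTF-8"?>\n<city>Łódź</city>'.encode("utf-8")
converted = transcode_xml(original, "utf-8", "iso-8859-2")
print(converted.decode("iso-8859-2").splitlines()[0])
# prints: <?xml version="1.0" encoding="iso-8859-2"?>
```

The bytes of the document change in two places here, the declaration and every non-ASCII character, so a digitally signed document would no longer verify after either change.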
From daniela at cnet.com Mon Apr 19 22:34:59 1999 From: daniela at cnet.com (Daniel Austin) Date: Mon Jun 7 17:11:23 2004 Subject: almostXlink reference? Message-ID: <77A952A6B467D211855D00805F9521F114940E@cnet10.cnet.com> Greetings, A few days ago, someone posted a message about an Xlink simulator written using MSIE5. As I recall, the author posted to this group about it, including a reference, but I am unable to locate it. I thought the author was Guy Murphy (http://www.guy-murphy.easynet.co.uk/) but I wasn't able to find any references to this at his site, so I suppose I was mistaken. Robin Cover's page does not list a reference. Does anyone have any info on this, or was the whole thing a product of my fevered imagination and my wishful thinking regarding Xlink? Regards, D- ********************************************************************** Daniel Austin, Director of Development, Creative Services, CNET daniela@cnet.com 415-395-7800 x1438 "To change the old into the new, and the shapes of things to come..." xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pwilson at gorge.net Mon Apr 19 22:44:28 1999 From: pwilson at gorge.net (Peter Wilson) Date: Mon Jun 7 17:11:23 2004 Subject: Use of Tags Message-ID: <371B9572.756D6FDB@GORGE.NET> I see two competing ways of designing an xml document type: 1. Extensive use of attributes with elements used for structured and repeating items.

2. Extensive use of elements and almost no use of attributes: 12345 19991231
... ...

... ... 2 10.99 ...
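The markup of the two examples above did not survive archiving; only the character data (12345, 19991231, 2, 10.99) remains. As a hedged illustration of the two styles being compared, here is a Python sketch. The names Order, Number, Date, Item and Quantity are hypothetical, as is the assignment of the surviving values to them; only Price is named in the message.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Style 1: data carried in attributes, elements only for structure.
# (Hypothetical vocabulary; not recovered from the original post.)
ATTRIBUTE_STYLE = (
    '<Order Number="12345" Date="19991231">'
    '<Item Quantity="2" Price="10.99"/>'
    '</Order>'
)

# Style 2: the same data carried as element content.
ELEMENT_STYLE = (
    '<Order><Number>12345</Number><Date>19991231</Date>'
    '<Item><Quantity>2</Quantity><Price>10.99</Price></Item>'
    '</Order>'
)


def price_from_attribute(doc):
    """Read the price when it is an attribute of Item."""
    item = ET.fromstring(doc).find("Item")
    return Decimal(item.get("Price"))


def price_from_element(doc):
    """Read the price when it is the text content of a Price element."""
    item = ET.fromstring(doc).find("Item")
    return Decimal(item.findtext("Price"))


print(price_from_attribute(ATTRIBUTE_STYLE) == price_from_element(ELEMENT_STYLE))
print(len(ATTRIBUTE_STYLE), len(ELEMENT_STYLE))
```

Either way the application receives a string and has to apply the currency interpretation itself; in this sketch the difference between the two styles is in size and in where a schema would hang the type, not in what the code can recover.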
What motivates these two choices? Apart from the fact that the second example occupies 50% more storage, what is there to choose between these representations? What philosophy should one use to make design decisions? If you attempt to associate data types with entity data, which is the better representation? I.e. is it easier to define that the element has a PCDATA content which is a currency value or that the Price attribute has a currency value?
From david at megginson.com Mon Apr 19 23:28:30 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:11:23 2004 Subject: Perl XML::Writer Module Message-ID: <14107.40817.439520.897106@localhost.localdomain> [NOTE: I already announced this on the perl-xml mailing list, but I realised that I should do so here as well.] I needed this, so I wrote it this morning: http://www.megginson.com/Software/XML-Writer-0.1.tar.gz (I've also sent an upload to CPAN, but it might take a while to appear). Here's a very simple synopsis:

  use XML::Writer;
  use IO;

  my $output = new IO::File(">output.xml");

  my $writer = new XML::Writer($output);
  $writer->startTag("greeting", "class" => "simple");
  $writer->characters("Hello, world!");
  $writer->endTag("greeting");
  $writer->end();
  $output->close();

(You can also leave out the $output argument if you want to go straight to STDOUT.) By default, the module does a fair bit of well-formedness checking to help you catch bugs in your Perl programs -- you can turn the checking off for production use if you like to live on the wild side.
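The kind of checking described here can be illustrated with a small Python sketch. It is not XML::Writer and does not reproduce its API or its full set of checks; the class name CheckedWriter and its methods are hypothetical. It keeps a stack of open elements so that mismatched end tags, unclosed elements, a second document element, and stray character data raise errors instead of producing malformed output.

```python
import io
from xml.sax.saxutils import escape, quoteattr


class CheckedWriter:
    """Toy XML writer that refuses to emit obviously malformed output."""

    def __init__(self, out):
        self._out = out
        self._open = []           # stack of currently open element names
        self._seen_root = False   # True once the document element has started

    def start_tag(self, name, attrs=None):
        if not self._open and self._seen_root:
            raise ValueError("multiple document elements")
        self._seen_root = True
        self._open.append(name)
        # A dict cannot hold duplicate keys, so duplicate attributes
        # are ruled out by construction.
        attr_text = "".join(
            f" {key}={quoteattr(value)}" for key, value in (attrs or {}).items())
        self._out.write(f"<{name}{attr_text}>")

    def characters(self, text):
        if not self._open:
            raise ValueError("character data outside the document element")
        self._out.write(escape(text))

    def end_tag(self, name):
        if not self._open or self._open[-1] != name:
            raise ValueError(f"mismatched end tag: {name}")
        self._open.pop()
        self._out.write(f"</{name}>")

    def end(self):
        if self._open:
            raise ValueError(f"unclosed start tag: {self._open[-1]}")
        if not self._seen_root:
            raise ValueError("no document element")
        self._out.write("\n")


buffer = io.StringIO()
writer = CheckedWriter(buffer)
writer.start_tag("greeting", {"class": "simple"})
writer.characters("Hello, world!")
writer.end_tag("greeting")
writer.end()
print(buffer.getvalue(), end="")
# prints: <greeting class="simple">Hello, world!</greeting>
```

The sketch covers only a few of the error classes; declarations, comments and processing instructions are left out entirely.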
Here are the errors that the module catches so far:

- Lack of a (top-level) document element, or multiple document elements.
- Unclosed start tags.
- Misplaced delimiters in the contents of processing instructions or comments.
- Misplaced or duplicate XML declaration(s).
- Misplaced or duplicate DOCTYPE declaration(s).
- Mismatch between the document type name in the DOCTYPE declaration and the name of the document element.
- Mismatched start and end tags.
- Attempts to insert character data outside the document element.
- Duplicate attributes with the same name.

Full POD documentation and a lot of test cases in test.pl are included. Enjoy! David -- David Megginson david@megginson.com http://www.megginson.com/
From chris at w3.org Mon Apr 19 23:41:47 1999 From: chris at w3.org (Chris Lilley) Date: Mon Jun 7 17:11:24 2004 Subject: pre newbie question References: Message-ID: <371BA28C.89CAF591@w3.org> "Buss, Jason A" wrote: > > Hi Bruce. > > As far as I know, The release of MSIE5 should have this file already. The > reason that you are asked to transform the XML to HTML is because the only > way (that I know of) to display XML natively in IE5 is using CSS, and there > have been some issues regarding the support of CSS in IE5. What "issues", compared to (for example) the issues of using XSL FOs (not existent in IE5)? Displaying XML by converting it to HTML and then displaying that in the browser is exposing exactly the same issues, because the same CSS-based rendering engine is being used in both cases.
Except that with XML, you know what the parse tree is and can construct your selectors accordingly; whereas with HTML, you have little idea what the parse tree is. And, of course, with HTML there are many more "gotchas" in the form of hard-coded renderings of particular elements. > They have put a > great deal of effort into using XSL with IE5 (for some reason). So you will > probably be safer using XSL to transform XML to HTML as opposed to using XML > with CSS. It's just one of those things. It would be good to see this argument backed up with examples. > The browsers are in the early > stages of adopting XML so there will be some working around and what have > you until some of the working drafts really become recommendations, Um, XML 1.0 is a recommendation. So is CSS2. > and the > browser technology catches up with the standards. Well, that part is certainly true. > Look for Mozilla in about > 6 months. There are a great number of anticipated XML features promised for > the next release. Right, such as using CSS2 with XML. You can see some of that now, with the DocZilla browser, which uses Citec's XML and SGML parsing together with Mozilla's NGLayout engine. To answer the original question:

a) construct your XML document, hello.xml

  <foo>Hello World!</foo>

b) construct your stylesheet, basic.css

  foo { display: block; margin: 10%; background: white; color: black; font: 18pt/20pt Arial, Helvetica, sans-serif }

c) link the stylesheet to the XML document

  <?xml-stylesheet href="basic.css" type="text/css"?>
  <foo>Hello World!</foo>

d) display in browser.

Forget about coding for specific browsers, particular .exe or dll files, and so on. Distinguish between the XML specification, on the one hand, and particular implementations, on the other. XML is a specification, not a product. -- Chris xml-dev: A list for W3C XML Developers.
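How a processor might discover the stylesheet linked in step (c) can be sketched in Python. The document text mirrors the hello.xml example, assuming the xml-stylesheet processing instruction as the linking mechanism; the helper name linked_stylesheets and its regular expression are inventions for this example, and the pseudo-attribute parsing is deliberately simplified.

```python
import re
from xml.dom import minidom

# hello.xml with the stylesheet link placed before the document element.
HELLO_XML = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet href="basic.css" type="text/css"?>\n'
    '<foo>Hello World!</foo>\n'
)

# name="value" or name='value' pairs inside the processing instruction.
_PSEUDO_ATTR = re.compile(r'''([A-Za-z][\w-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')''')


def linked_stylesheets(xml_text):
    """Return one dict of pseudo-attributes per xml-stylesheet PI in the prolog."""
    doc = minidom.parseString(xml_text)
    sheets = []
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # Each match yields (name, double-quoted value, single-quoted value);
            # exactly one of the two value groups is non-empty.
            sheets.append(
                {name: dq or sq for name, dq, sq in _PSEUDO_ATTR.findall(node.data)})
    return sheets


print(linked_stylesheets(HELLO_XML))
# prints: [{'href': 'basic.css', 'type': 'text/css'}]
```

Nothing in the sketch depends on a particular browser or executable, which is the point of the advice: the link is part of the document, and any conforming tool can read it.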
From roddey at us.ibm.com Mon Apr 19 23:50:53 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:24 2004
Subject: XML Torture Test: Parsers Fail
Message-ID: <87256758.0077E280.00@d53mta03h.boulder.ibm.com>

>> I don't see anything in the spec that says "don't read and validate
>> external parsed entities if they're not used." And in fact, the spec
>> seems to say that, in order to be valid, they must (whether used or not)
>> match certain productions in the grammar.
>
>Surely the question a validating parser is supposed to answer is not
>whether the external parsed entities are valid, but whether the
>*document* is valid.
>

I agree with the latter myself. Once again, sorry for the late contributions; I just got back from my so-called vacation (in which the weather totally sucked until I came back :-)

I could see the utility in having a specific flag on your implementation to force such 'heavy' validation, but I would argue against being forced, in all cases, to scan an entity that is not even referenced. It would be a nice tool for DTD developers to use before putting out the DTD. But I probably wouldn't want my file, which only used 10% of some heavy duty DTD/Schema, to have to pay the cost to fully validate the DTD/Schema every time I used it, just to make sure for the one millionth time today that it was fully correct.
From roddey at us.ibm.com Tue Apr 20 00:27:43 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:24 2004
Subject: SUMMARY: XML Validation Issues (was: several threads)
Message-ID: <87256758.007B3E87.00@d53mta03h.boulder.ibm.com>

>>
>> Chris Lilley wrote:
>>
>> > I don't sense consensus yet on whether client-side validation is always
>> > desirable; it clearly is in some cases and clearly adds little in other
>> > cases.
>>
>> Wouldn't it depend on what the client is?
>
>Yes. Which is why I wrote that I don't sense consensus on this - there
>are arguments both for and against; for requiring validation, for never
>requiring it, etc.
>

In our second generation parsers, we take the approach that what 'parser' you use indicates what you want to do. For us, a 'parser' is just a little glue that wires together some set of the underlying functionality to some output API. We have validating SAX and DOM parsers, and non-validating SAX and DOM parsers. If you use a non-validating parser, then it will not validate, period. If you use a validating parser it will validate (and require a DTD.)

That scheme seems pretty sane to me. Validation should be something that is requested, not magically invoked according to the content of the DTD, IMHO. I would hate for any other rule to come into play, because it would lessen the flexibility available to the user (whether human or machine.)
From andrewl at microsoft.com Tue Apr 20 02:55:04 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:11:24 2004
Subject: Use of Tags
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF367@RED-MSG-08>

PWilson asked when to use elements; when to use attributes. You'll find a discussion at http://www.oasis-open.org/cover/elementsAndAttrs.html .

From dent at oofile.com.au Tue Apr 20 03:44:32 1999
From: dent at oofile.com.au (Andy Dent)
Date: Mon Jun 7 17:11:24 2004
Subject: Use of Tags
In-Reply-To: <371B9572.756D6FDB@GORGE.NET>
Message-ID:

At 4:43 +0800 20/4/99, Peter Wilson wrote:
>I see two competing ways of designing an xml document type:
>
>1. Extensive use of attributes with elements used for structured and
>repeating items.
> ...
>
>2. Extensive use of elements and almost no use of attributes:
>
> 12345
> 19991231 ...
>
>If you attempt to associate data types with entity data which is the
>better representation? I.e. is it easier to define that the
>element has a PCDATA content which is a currency value or that the Price
>attribute has a currency value?

The path that Microsoft seem to be following with XML-Data is to use elements and describe the schema in XML.
My single biggest problem with this is the reuse of elements within other elements - you can't define an element with local 'scope'. What happens when Amount is an i2 in one context and a float in another?

For this reason, people modelling UML mapping to serialisation in XML have recommended use of attributes (your 1.) for the properties of an object.

The discussion page http://www.oasis-open.org/cover/elementsAndAttrs.html has a great link on this topic. I recommend reading all the links for a balanced and evolving viewpoint.

Andy Dent BSc MACS AACM, Software Designer, A.D. Software, Western Australia
OOFILE - Database, Reports, Graphs, GUI for c++ on Mac, Unix & Windows
PP2MFC - PowerPlant->MFC portability
http://www.oofile.com.au/

From ldodds at ingenta.com Tue Apr 20 13:13:10 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun 7 17:11:24 2004
Subject: DOM - Creating Documents
Message-ID: <002801be8b1e$d1ace000$ab20268a@pc-lrd.bath.ac.uk>

Hi, just a quick question:

Why doesn't the Document Object Model have a createDocument method to allow the creation of a new Document instance?

I'd like to be able to build up a Document programmatically and later save it to disk as an XML document. I can see that the details of how a Document might be saved are implementation specific, but surely the actual *creation* of a Document is abstract enough to be included?

Is this due for later amendment, or am I missing something subtle here?
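
For readers following this thread later: the DOM Level 2 Core specification did add a factory call, DOMImplementation.createDocument(namespaceURI, qualifiedName, doctype). The following is only a sketch of the "build it up and serialise it" workflow asked about above, using Python's xml.dom.minidom rather than any of the 1999 Java packages named in this thread. The helper name build_document and the element names are hypothetical, chosen for illustration.

```python
from xml.dom.minidom import getDOMImplementation


def build_document(root_name, items):
    """Create a new DOM Document from scratch and add one child per item.

    build_document and its arguments are illustrative names, not part of
    any DOM specification.
    """
    impl = getDOMImplementation()
    # createDocument(namespaceURI, qualifiedName, doctype) is the
    # DOM Level 2 factory method; it also creates the root element.
    doc = impl.createDocument(None, root_name, None)
    root = doc.documentElement
    for name, text in items:
        child = doc.createElement(name)
        child.appendChild(doc.createTextNode(text))
        root.appendChild(child)
    return doc


# Hypothetical sample data, for illustration only.
doc = build_document("order", [("id", "12345"), ("date", "19991231")])
xml_text = doc.documentElement.toxml()
```

How the result reaches disk is still left to the implementation; with minidom one would pass an open file to doc.writexml().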
Presently I can see that the only way I can achieve this is to use the OpenXML package which has a DOMFactory object for creating a new Document. In other words I'm using an implementation specific feature when I was hoping to write my application around the DOM spec and then plug in any suitable DOM-compliant parser. It seems the spec is lacking in this regard.

Thanks in advance. And also apologies if this has been discussed to death in the past.

L.

==================================================================
"Never Do With More, What Can Be Achieved With Less" ---William of Occam
==================================================================
Leigh Dodds Eml: ldodds@ingenta.com
ingenta ltd Tel: +44 1225 826619
BUCS Building, University of Bath Fax: +44 1225 826283
==================================================================

From eric at w3.org Tue Apr 20 13:47:30 1999
From: eric at w3.org (Eric Prud'hommeaux)
Date: Mon Jun 7 17:11:24 2004
Subject: Streaming XSL Stylesheets - Was: XML::Writer 0.1 available
In-Reply-To: <14107.56295.420912.593882@localhost.localdomain>; from David Megginson on Mon, Apr 19, 1999 at 10:02:56PM -0400
References: <14107.31778.186914.865745@localhost.localdomain> <19990419182626.A9287@dhcppc1.earthlink.net> <14107.53582.812560.679399@localhost.localdomain> <19990419213508.B9287@dhcppc1.earthlink.net> <14107.56295.420912.593882@localhost.localdomain>
Message-ID: <19990420074704.A15705@w3.org>

On Mon, Apr 19, 1999 at 10:02:56PM -0400, David Megginson wrote:
> XSL provides one good and very powerful model for doing XML
> transformations, but that model itself requires that an entire
> document be held in random-access storage of some sort (say, memory,
> or a database) during processing, and that's inappropriate for the
> very large subset of XML work that is both speed- and memory-critical.

I'd love to differ with you here. In practice, I can't, but in theory... I have this itch to work out and implement an XSL parser that works as a SAX stream. Given an XslStream that reads the parsed stylesheet from an XslDB and has an output SAX stream $this->{OUTPUT}, the notion is something like this:

parser reads ""
calls W3C::SAX::XslStream::startElement
XslStream checks its XslDB for a list of all rules that could apply to 'someTag' and finds only a single template:

This tells it that there is no ambiguity or ordering so it can call
$this->{OUTPUT}->startElement('innerTag', new AttributeList)

That's the ideal case, but XSL accommodates many situations where it's not that easy. Since XSL defines sorting and sequencing, it is possible to arrive at a rule that cannot be immediately sent to the output stream, like:

In this case, the stream can output the tagA immediately, and stick the tagB in an event queue (or maybe just a grove) to be flushed when W3C::SAX::XslStream::endElement('someTag') is called.

If the XML document contains long series of atoms that the stylesheet says do not need to be ordered, the transformed document can be generated as the document is parsed. For instance:

DOCUMENT:
Ne s 1 2 s 2 2 p 2 6 ...

XSL STYLESHEET:
The Atoms...

The sort on the positions causes the XslStream to buffer the output for each position until the /position is hit, but it still gets to flush the atoms basically as fast as they come in.

I see lots of folks talking about using XML for moving vast streams of business process data. I believe this sort of mechanism will make all that feasible without having to write custom translation engines for this data.
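
The flush-versus-buffer idea described above can be sketched with a plain SAX handler. This is not XSL and not the W3C::SAX::XslStream design; it is a minimal illustration in Python, assuming a hypothetical rule set (sort_parents) that names the elements whose children must be sorted. The class name StreamingSorter and the element names in the sample document are made up for the example, and sorted children are assumed to be flat, text-only elements.

```python
import xml.sax


class StreamingSorter(xml.sax.ContentHandler):
    """Pass SAX events straight through, buffering only where sorting is needed.

    sort_parents is a hypothetical stand-in for rules a real stylesheet
    would supply: children of these elements are held back and emitted in
    sorted order when the parent closes. Everything else is flushed at once.
    """

    def __init__(self, sort_parents):
        super().__init__()
        self.sort_parents = set(sort_parents)
        self.output = []        # events already flushed downstream
        self.pending = None     # buffered (name, text) children, or None
        self.child_text = None  # text pieces of the child being buffered

    def startElement(self, name, attrs):
        if self.pending is not None:
            # Inside a sorted parent: start buffering this child.
            # (Child attributes are dropped; a simplification.)
            self.child_text = []
        else:
            self.output.append(("start", name, dict(attrs.items())))
            if name in self.sort_parents:
                self.pending = []

    def characters(self, content):
        if self.child_text is not None:
            self.child_text.append(content)
        elif self.pending is None:
            self.output.append(("text", content))

    def endElement(self, name):
        if self.child_text is not None:
            # A buffered child just closed.
            self.pending.append((name, "".join(self.child_text)))
            self.child_text = None
        elif self.pending is not None:
            # The sorted parent closed: flush its children in order.
            for child, text in sorted(self.pending, key=lambda item: item[1]):
                self.output.append(("start", child, {}))
                self.output.append(("text", text))
                self.output.append(("end", child))
            self.pending = None
            self.output.append(("end", name))
        else:
            self.output.append(("end", name))


# Hypothetical sample input, for illustration only.
SAMPLE = (b'<atoms><atom symbol="Ne"/>'
          b'<orbitals><o>2p</o><o>1s</o><o>2s</o></orbitals>'
          b'<atom symbol="Ar"/></atoms>')

handler = StreamingSorter(sort_parents={"orbitals"})
xml.sax.parseString(SAMPLE, handler)
```

Only the children of the sorted element are ever held in memory; the atom events on either side go to handler.output as soon as they are parsed, which is the property the message above is after.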
I have sketched out the players in perl (XslParser, XslDB, XslStream) but haven't started the real work of coding the transformations and working out when it can flush and when it can't. Anybody out there interested in taking this over?

--
-eric (eric@w3.org)

PS. If the chemistry-looking stuff above is wrong, it's because I'm not now, nor ever intended to be, a physical chemist.

From Matthew.Sergeant at eml.ericsson.se Tue Apr 20 13:59:12 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:11:24 2004
Subject: Apache XML Mime sniffer
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A17CF@eukbant101.ericsson.se>

This didn't appear to hit the list last time I sent it... Melissa syndrome on our mail server...

==

After discussions earlier on this list about how it would be useful to have an XML Mime sniffer for Apache, I went ahead and coded one. Unfortunately it uses mod_perl, so it's not going to be the best solution for everyone, but it works for me :-)

It allows you to return any mime type you like; you can even configure that on a per-directory basis, and it defaults to application/xml. It correctly detects utf-16. I've tested it on a number of different encoded XML files, although obviously not every XML file going. There are bound to be bugs, but this is a first stab at getting something that works.

Anyway, it should be on CPAN rsn; for the impatient you can get it at http://src.doc.ic.ac.uk/computing/programming/languages/perl/CPAN/authors/id/M/MS/MSERGEANT/Apache-MimeXML-0.01.tar.gz

Matt.
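
For anyone curious what such sniffing involves, here is a minimal sketch in Python. It is not the logic of Apache::MimeXML, which is not shown in this message; it only follows the byte-order-mark and "<?xml" checks suggested by the encoding autodetection appendix of the XML 1.0 specification. The function name sniff_xml and its mime_type parameter are hypothetical, and files without an XML declaration are deliberately not recognised.

```python
import codecs


def sniff_xml(data, mime_type="application/xml"):
    """Return (mime_type, encoding) if data starts like an XML document, else None.

    A rough sketch: it recognises UTF-8 and UTF-16 input that begins with
    an XML declaration, with or without a byte order mark. UTF-32 and
    documents lacking a declaration are not handled.
    """
    head = data[:64]
    if head.startswith(codecs.BOM_UTF8):
        encoding, body = "utf-8", head[len(codecs.BOM_UTF8):]
    elif head.startswith(codecs.BOM_UTF16_BE):
        encoding, body = "utf-16-be", head[2:]
    elif head.startswith(codecs.BOM_UTF16_LE):
        encoding, body = "utf-16-le", head[2:]
    elif head.startswith(b"\x00<"):
        # No BOM, but '<' encoded as two bytes, high byte first.
        encoding, body = "utf-16-be", head
    elif head.startswith(b"<\x00"):
        # No BOM, '<' encoded as two bytes, low byte first.
        encoding, body = "utf-16-le", head
    else:
        encoding, body = "utf-8", head
    text = body.decode(encoding, errors="replace")
    if text.startswith("<?xml"):
        return mime_type, encoding
    return None


declaration = '<?xml version="1.0"?><a/>'
plain = sniff_xml(declaration.encode("utf-8"))
wide = sniff_xml(codecs.BOM_UTF16_BE + declaration.encode("utf-16-be"))
not_xml = sniff_xml(b"<html><body>hello</body></html>")
```

Passing a different mime_type per call is a stand-in for the per-directory configuration mentioned above.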
--
http://come.to/fastnet Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

From david at megginson.com Tue Apr 20 14:37:55 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:24 2004
Subject: DOM - Creating Documents
In-Reply-To: <002801be8b1e$d1ace000$ab20268a@pc-lrd.bath.ac.uk>
References: <002801be8b1e$d1ace000$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <14108.29751.86459.191626@localhost.localdomain>

Leigh Dodds writes:

> Why doesn't the Document Object Model have a createDocument
> method to allow the creation of a new Document instance?

It's hard to guess what the original motivation was, and I have never been a member of the DOM WG (and joined the IG fairly recently), but I do think that it makes some sense: after all, the DOM will often be an adapter interface to an entirely different structure, like a set of database tables, and in such contexts, createDocument might not make sense at all.

Now, that said, it would be possible to have createDocument simply throw an exception when it's inappropriate.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
From david at megginson.com Tue Apr 20 14:49:44 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:24 2004
Subject: Streaming XSL Stylesheets - Was: XML::Writer 0.1 available
In-Reply-To: <19990420074704.A15705@w3.org>
References: <14107.31778.186914.865745@localhost.localdomain> <19990419182626.A9287@dhcppc1.earthlink.net> <14107.53582.812560.679399@localhost.localdomain> <19990419213508.B9287@dhcppc1.earthlink.net> <14107.56295.420912.593882@localhost.localdomain> <19990420074704.A15705@w3.org>
Message-ID: <14108.29970.711433.501336@localhost.localdomain>

Eric Prud'hommeaux writes:

> I'd love to differ with you here. In practice, I can't, but in
> theory... I have this itch to work out and implement an XSL parser that
> works as a SAX stream. Given an XslStream that reads the parsed
> stylesheet from an XslDB and has an output SAX stream $this->{OUTPUT},
> the notion is something like this:
>
> parser reads ""

In DSSSL, such a thing was not possible because there were unpredictable dependencies -- for example, you might find this near the front of the document:

...

But you wouldn't know that you had to do something useful with it until you found this near the end of the document:

...

In the general case, then, a stream-based DSSSL processor would *still* have to cache the entire document, since DSSSL allowed arbitrary navigation. I don't know if the same applies to XSL -- I'll have to give the spec a closer look.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
From kurt.donath at lmco.com Tue Apr 20 15:01:19 1999
From: kurt.donath at lmco.com (Kurt Donath)
Date: Mon Jun 7 17:11:24 2004
Subject: WWW8 XML Activities?
Message-ID: <371C7A46.4E06DB2C@lmco.com>

Aside from some of the xml related tutorials and presentations, are there going to be any xml birds-of-a-feather sessions, or related activities at WWW8?

--
Kurt Donath 315.456.6276
Staff Systems Engineer Intranet: http://www.syr.lmco.com/~donath/
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Lockheed Martin - Enterprise Information Systems
Systems Engineering / Webserv
From ldodds at ingenta.com Tue Apr 20 15:06:43 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun 7 17:11:24 2004
Subject: DOM - Creating Documents
In-Reply-To: <14108.29751.86459.191626@localhost.localdomain>
Message-ID: <000601be8b2e$ae163500$ab20268a@pc-lrd.bath.ac.uk>

> It's hard to guess what the original motivation was, and I have never
> been a member of the DOM WG (and joined the IG fairly recently), but I
> do think that it makes some sense: after all, the DOM will often be an
> adapter interface to an entirely different structure, like a set of
> database tables, and in such contexts, createDocument might not make
> sense at all. Now, that said, it would be possible to have
> createDocument simply throw an exception when it's inappropriate.

Granted, the actual implementation of the DOM tree may be very different to the representation as defined in the spec, but this still doesn't preclude the creation of a new Document object. After all I don't care *how* it's created, I just want to deal with it in the same way as I would any other document. That representation wouldn't get passed to another parser; it would all be handled internally.

What I find puzzling is that I can createElement, createAttribute, etc, etc and then...what? Lose the changes? Basically if I can create the different node types without having to worry about implementation details, why can't the same be applied to the Document?
If the implementation was actually database tables that shouldn't matter - I can create an Element, but not a Document - seems a huge hole to me - otherwise why would I bother creating elements anyway: they only persist for the period of my application. Useless if I want to save those changes, or build up a new structure from scratch.

Yours Confused,

L.

From paul at prescod.net Tue Apr 20 16:37:48 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:24 2004
Subject: Streaming XSL Stylesheets - Was: XML::Writer 0.1 available
References: <14107.31778.186914.865745@localhost.localdomain> <19990419182626.A9287@dhcppc1.earthlink.net> <14107.53582.812560.679399@localhost.localdomain> <19990419213508.B9287@dhcppc1.earthlink.net> <14107.56295.420912.593882@localhost.localdomain> <19990420074704.A15705@w3.org>
Message-ID: <371C8627.B774AEC5@prescod.net>

Eric Prud'hommeaux wrote:
>
> The sort on the positions causes the XslStream to buffer the output
> for each position until the /position is hit, but it still gets to
> flush the atoms basically as fast as they come in.

I don't understand what you mean here. How can it both buffer everything and flush everything?

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News
From paul at prescod.net Tue Apr 20 16:39:01 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:25 2004
Subject: DOM - Creating Documents
References: <002801be8b1e$d1ace000$ab20268a@pc-lrd.bath.ac.uk> <14108.29751.86459.191626@localhost.localdomain>
Message-ID: <371C86D9.87DBD1D3@prescod.net>

David Megginson wrote:
>
> It's hard to guess what the original motivation was, and I have never
> been a member of the DOM WG (and joined the IG fairly recently), but I
> do think that it makes some sense: after all, the DOM will often be an
> adapter interface to an entirely different structure, like a set of
> database tables, and in such contexts, createDocument might not make
> sense at all.

In the same situation createNode could cause a problem. Or setAttribute. Etc. I don't buy that argument.

> Now, that said, it would be possible to have
> createDocument simply throw an exception when it's inappropriate.

That's the only reasonable way to handle writability.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News
From SCampana at bluestone.com Tue Apr 20 17:56:11 1999
From: SCampana at bluestone.com (Campana, Sal)
Date: Mon Jun 7 17:11:25 2004
Subject: Linking DTD's
Message-ID: <9A4DF69E3C5ED211B86400A0C9D17760BB88C8@thor.operations.bluestone.com>

Is it possible to nest DTD's? I would like to build a DTD from smaller DTD's... Is this possible? Can someone send me an example or a location which contains an example of this...

I am also interested in the ways to include the DTD, i.e. SYSTEM, URL??, inline (of course)..

Thanks,
Sal

From costello at mitre.org Tue Apr 20 19:15:19 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun 7 17:11:25 2004
Subject: How to use binary data with XML?
References: <01BED2BE.3B707EC0@EAMON>
Message-ID: <371CB752.DA7CA866@mitre.org>

I have a few questions on using binary data (e.g., gif, jpeg images) with XML.

(1) Binary data isn't actually put in an XML file, correct? i.e., the binary data is _not_ inline, correct? An XML file just contains ASCII text, correct?

(2) Just like in HTML, binary data is _referenced_ by the XML document, correct?

(3) Is this the correct way of using binary data:

DTD:

XML:

Here we see an XML document _referencing_ a file containing binary data. The binary data is not actually inline. Is this how it's done?

/Roger
From thanser at cybercable.tm.fr Tue Apr 20 19:19:11 1999
From: thanser at cybercable.tm.fr (Thierry Hanser)
Date: Mon Jun 7 17:11:25 2004
Subject: DOM implementation
Message-ID:

"Bonjour" everyone,

I am trying to move over to DOM and wonder which is the most proper way to do so. My main concern is about selecting a DOM implementation. Which one is the most reliable, comprehensive and maybe "official"?

The problems I have to solve are mainly

o migrating from an in-house Tree model to a DOM compliant model
o being able to reuse a JTree model that was designed for the former in-house Tree class

Of course I would like to remain independent from the DOM implementation and rely *only* on the DOM interface. As I have seen, some implementations already extend the DOM core interface, which I think is a pitfall if you don't pay attention.

Thanks for any help or advice!

Thierry
From Michael.Kay at icl.com Tue Apr 20 19:20:14 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:11:25 2004
Subject: DOM - Creating Documents
Message-ID: <93CB64052F94D211BC5D0010A80013310EB414@WWMESS3.172.19.125.2>

> Why doesn't the Document Object Model have a createDocument
> method to allow the creation of a new Document instance?
>
> I'd like to be able to build up a Document programmatically
> and later save it to disk as an XML document. I can see that
> the details of how a Document might be saved are implementation
> specific, but surely the actual *creation* of a Document is
> abstract enough to be included?

The SAXON package includes an interface to construct a Document from an InputSource (defined as in SAX), and has drivers (implementations of this interface) for a number of DOM products including SUN, IBM, Docuverse, Oracle, and Datachannel. Yes it would be nice if there were a standard interface, but in the meantime you could try using the SAXON one. You can use it independently of the rest of SAXON.

I realise this isn't exactly what you want because you want to create an empty document; but you could do this by starting from an InputSource with a minimal XML document. (A well-formed XML document, of course, always contains at least one element, so I can see products legitimately objecting to you creating a document that contains no elements).

Mike Kay
From Michael.Kay at icl.com Tue Apr 20 19:26:47 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:11:25 2004
Subject: Streaming XSL Stylesheets - Was: XML::Writer 0.1 available
Message-ID: <93CB64052F94D211BC5D0010A80013310EB415@WWMESS3.172.19.125.2>

> > On Mon, Apr 19, 1999 at 10:02:56PM -0400, David Megginson wrote:
> > XSL provides one good and very powerful model for doing XML
> > transformations, but that model itself requires that an entire
> > document be held in random-access storage of some sort (say, memory,
> > or a database) during processing, and that's inappropriate for the
> > very large subset of XML work that is both speed- and memory-critical.
>
> From: Eric Prud'hommeaux [mailto:eric@w3.org]
>
> I'd love to differ with you here. In practice, I can't, but in
> theory... I have this itch to work out and implement an XSL parser that
> works as a SAX stream. Given an XslStream that reads the parsed
> stylesheet from an XslDB and has an output SAX stream $this->{OUTPUT},
> the notion is something like this:

SAXON provides SerialStyleSheet, an implementation of a subset of XSL that processes the source document serially using SAX, and doesn't create the document in memory. You can do some quite useful transformations with it. In particular, you can break a big document up into chunks that will fit in memory more comfortably; you can also do document subsetting/pruning, and at least as much styling as CSS allows. The benefits are in memory use rather than speed, though I think it could be speeded up considerably.
Coming in the next version is an XSL compiler (generates a Java application) which certainly helps with the speed factor. (Java not Perl: sorry for intruding on a Perl discussion.)

Mike Kay

From xml at 0000000.com Tue Apr 20 19:30:11 1999
From: xml at 0000000.com (xml)
Date: Mon Jun 7 17:11:25 2004
Subject: buffering XML and XSL (was Re: Streaming XSL Stylesheets)
Message-ID: <199904201903.MAA09207@0000000.com>

I just caught the tail-end of this discussion which was interesting. DXML (my framework) does not normally buffer anything but can effectively process the XML/XSL documents I'm working with. I am targeting specific areas though, not trying to be 100% compliant with the specification. For what I'm doing, non-buffering of XML/XSL is doable and beautifully suited to this framework's purpose.

Additionally, I'm including the ability to turn buffering on, so that everything is retained in memory. Mainly I'm doing this so that I can learn about sets of documents that may require me to buffer information. Additionally it's just nice to have the ability to keep everything in System memory, for obvious reasons.

So in a nutshell, I am processing files something like this:

1. Read an XML file until/unless you get to the XSL file callout.
2. Read the XSL file, store the program.
3. Continue reading the XML file. Look for XML tags that satisfy the program's requests.
4. Stop when XML file data has been read completely.
5. Throw away XSL program buffers and XML data.

Some of you may flame me on my approach.
Know that it's targeted at embedded processors where processing power and memory are limited. Of course it compiles on Linux/Solaris as well so I am interested in scaling up when possible. (It's all in C.)

I am worried about nested stylesheets and various other usages of XSL that may escape me. Currently I can't handle those well and I would like to. XSL looping structures and conditionals are not a real problem, though, using a non-buffered approach, and at the very least I have a set of tools that are useful for dynamic web-page creation on my platforms... which was the original goal.

Thomas

From rbourret at ito.tu-darmstadt.de Tue Apr 20 20:06:02 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:11:25 2004
Subject: How to use binary data with XML?
Message-ID: <01BE8B68.897BBC80@grappa.ito.tu-darmstadt.de>

Roger L. Costello wrote:

> (1) Binary data isn't actually put in an XML file, correct? i.e., the
> binary data is _not_ inline, correct? An XML file just contains ASCII
> text, correct?

Yes, XML documents contain text. (This is not limited to ASCII -- they can directly contain any character included in the encoding declared by the XML declaration or in encoding declarations in external entities and can indirectly contain other characters through character references.)

One common way suggested to inline binary data is to encode it as Base64.

> (2) Just like in HTML, binary data is _referenced_ by the XML document,
> correct?

Except as noted above, correct.
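A minimal sketch of the Base64 approach mentioned above, in Python. The element name "binary" and its "encoding" and "type" attributes are assumptions made up for this illustration; no such vocabulary is defined by XML itself or by this thread.

```python
import base64
import xml.etree.ElementTree as ET


def embed(data: bytes, media_type: str) -> str:
    """Wrap raw bytes in a (hypothetical) <binary> element as Base64 text."""
    elem = ET.Element("binary", {"encoding": "base64", "type": media_type})
    elem.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def extract(xml_text: str) -> bytes:
    """Recover the original bytes from the element's character content."""
    elem = ET.fromstring(xml_text)
    return base64.b64decode(elem.text)


payload = bytes(range(256))           # every byte value, including NUL
document = embed(payload, "image/gif")
```

Because the Base64 alphabet is plain ASCII, the element content is legal character data in any XML encoding, at the cost of roughly a one-third increase in size.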
> (3) Is this the correct way of using binary data:
>
> [example snipped]
>
> Here we see an XML document _referencing_ a file containing binary data.
> The binary data is not actually inline. Is this how it's done? /Roger

Yep -- that's it.

-- Ron Bourret

From roddey at us.ibm.com Tue Apr 20 20:11:50 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:25 2004
Subject: recursion in XML parser
Message-ID: <87256759.0063CE56.00@d53mta03h.boulder.ibm.com>

>Are most XML parsers recursive in nature?
>My parser is non-recursing while processing the tags from an XML file
>and only recurses once to go back and load an XSL file, when applicable.
>

There are some places where mine is recursive, e.g. parsing and otherwise manipulating the tree-like content models in the DTD. But no, there is not really much reason to recurse while parsing XML. We also do recurse while handling INCLUDE/IGNORE sections, because it simplifies some things (and hopefully no one is going to have 100 nested levels of INCLUDE/IGNORE, and if they do I'm not too worried about impressing them :-)

>My reasoning for not using recursion was performance (function call/stack framing
>considerations) and that it made the code easier to understand.
>

It's also more than that. Once you get into certain things, you might have to search back down the element nesting tree in a very efficient way, and having it on the stack would make that pretty difficult. Having your own element stack (and probably they all do) makes it straightforward.
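To make the "own element stack" point concrete, here is a small Python sketch. It is not how any of the parsers discussed here are written; it simply drives the standard xml.parsers.expat module and keeps the open elements in an ordinary list. The function name nearest_ancestors and the sample element names are invented. Searching the enclosing elements is then a plain loop over that list, with no need to walk call-stack frames.

```python
import xml.parsers.expat


def nearest_ancestors(xml_text, target, ancestor):
    """For each <target> start tag, record the depth of its closest
    enclosing <ancestor> element, or None if there is none."""
    stack = []   # names of currently open elements, outermost first
    found = []

    def start(name, attrs):
        if name == target:
            # Walk the explicit stack from the innermost open element outwards.
            for depth in range(len(stack) - 1, -1, -1):
                if stack[depth] == ancestor:
                    found.append(depth)
                    break
            else:
                found.append(None)
        stack.append(name)

    def end(name):
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return found


depths = nearest_ancestors("<a><s><b><p/></b></s><p/></a>", "p", "s")
```

The first p is nested inside s (which sits at depth 1), while the second p has no enclosing s at all.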
>It would be interesting to do some benchmarks on various parsers out there to
>measure performance. The Java parsers I've tested (Sun, IBM) are _dog_ slow
>compared to expat, etc. For server-side I don't think that matters, since in
>the corporate scene people tend to just add more servers/infrastructure and
>not worry about performance.
>

In our own defense, our second generation parsers are not dog slow compared to anyone, though they are still slower than Expat. But there are other reasons than pure sloth for this. Large companies like us have to consider extensibility and the ability to serve many masters. So we cannot write a highly compact parser that can fly but not be easily extensible. We have to support gazillions of encodings, and many different configurations. When the blessed schema arrives, we have the architecture to handle it without rewriting it and without affecting our customers. That's extremely important and it has a certain amount of associated cost.

If you can write a parser that's as fast as Expat and as flexible as ours, I'm sure you'll be a hero to your people, and I'll buy you a six pack of your favorite import.

From eriblair at mediom.qc.ca Tue Apr 20 20:35:18 1999
From: eriblair at mediom.qc.ca (Éric Riblair)
Date: Mon Jun 7 17:11:25 2004
Subject: MSXML uses with new IE5 VM ...
Message-ID: <003b01be8b5c$b046e1c0$1f9ccb84@grr.ulaval.ca>

Hi,

I had used MSXML in my application without problems until the upgrade to IE5 and the latest version of its virtual machine ...
I had read that MSXML is now integrated into the VM ... but when I try to load my files ... I get the error (... is not an object !!!)

The definition of the objects looks like:

How can I resolve the problem ...

OK ... I have just solved part of the problem ... I uninstalled the old version of MSXML ... and now ... I have no error relating to the objects, but IE just tells me "error on the page" ... and does nothing else

Éric Riblair, Agronome (eriblair@mediom.qc.ca)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990420/18b3de0d/attachment.htm

From roddey at us.ibm.com Tue Apr 20 20:37:27 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:25 2004
Subject: recursion in XML parser
Message-ID: <87256759.00661E50.00@d53mta03h.boulder.ibm.com>

> > I think that nobody would argue that Java has a lot of virtues, but
> > certainly speed is not one of them. To take your numbers, David: if
> > that part of the application is 10 times faster than any Java
> > parser and the app itself is 10 times faster also, the overall
> > throughput is therefore 10 times faster.
>
>That's not necessarily the case -- C/C++ have some advantages for fast
>I/O that Java doesn't share, but if your other code is not I/O-bound,
>and if it doesn't require small, tight processing loops, the speed
>difference for the non-parsing code might be much less significant
>(depending on how efficient your VM and OS are at memory-management).
>

And startup time is an issue as well. If you must parse lots of little files, the overhead of Java's frameworks and the problems with initialization of data and whatnot can cost you getting out of the gate. We've noticed significantly better 'get it up, get it out, get it down' performance on C++, which can be a concern in the e-bidness area.
Another issue we've noticed is that object creation (regardless of the downstream issues of cleaning them up) has a pretty darn high cost in Java, making it difficult to do high performance code that is as OO as one might wish sometimes. If performance is the Holy Grail, and it seems to be the primary measurement made (probably because it's easy to measure, not because it's what's important), then there can be a pull to write less maintainable code to gain performance. A language with less overhead for use of objects (at both ends of the lifecycle) can sometimes do better in that regard.

From david at megginson.com Tue Apr 20 20:46:50 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:25 2004
Subject: Linking DTD's
In-Reply-To: <9A4DF69E3C5ED211B86400A0C9D17760BB88C8@thor.operations.bluestone.com>
References: <9A4DF69E3C5ED211B86400A0C9D17760BB88C8@thor.operations.bluestone.com>
Message-ID: <14108.51751.405582.969806@localhost.localdomain>

Campana, Sal writes:

> Is it possible to nest DTD's? I would like to build a DTD from smaller
> DTD's...Is this possible?

Yes, but it's generally brittle -- about the same level of difficulty as doing object-oriented programming in C or BASIC (or, to remove the sexist part of an 18th-century quip, it's like a dog standing on its hind legs: it's not done well, but it's quite astounding that it's done at all).

You've got to use lots of parameter-entity trickery if you want to be able to extend the content models of what you're inheriting.
> Can someone send me an example or a location which contains an example of
> this...

Look at the full TEI DTDs.

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/

From cowan at locke.ccil.org Tue Apr 20 21:09:22 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:25 2004
Subject: How to use binary data with XML?
References: <01BED2BE.3B707EC0@EAMON> <371CB752.DA7CA866@mitre.org>
Message-ID: <371CD0B0.3BE941A3@locke.ccil.org>

Roger L. Costello wrote:

> (1) Binary data isn't actually put in an XML file, correct? i.e., the
> binary data is _not_ inline, correct? An XML file just contains ASCII
> text, correct?

Correct in essence. However, for "ASCII" read "characters, expressed in Unicode or some subset of it". Of course, ASCII *is* a subset of Unicode.

> (2) Just like in HTML, binary data is _referenced_ by the XML document,
> correct?

Typically. Nothing prohibits you from encoding binary data as Base64 and incorporating it into an XML document, but nobody has converters for that yet either.

> (3) Is this the correct way of using binary data: [example snipped]
> Here we see an XML document _referencing_ a file containing binary data.
> The binary data is not actually inline. Is this how it's done?

That is the fully standards-compliant way. Application conventions such as XLink (when it stabilizes) can simplify this process by allowing in-line URIs and letting the network infrastructure provide the remote document-type.

-- 
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N.
You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From eriblair at mediom.qc.ca Tue Apr 20 21:12:18 1999
From: eriblair at mediom.qc.ca (Éric Riblair)
Date: Mon Jun 7 17:11:25 2004
Subject: MSXML uses with new IE5 VM ... forget the previous message ...
Message-ID: <007501be8b61$e17851c0$1f9ccb84@grr.ulaval.ca>

Please ... Forget the previous message ...

Hi,

I had used MSXML in my application without problems until the upgrade to IE5 and the latest version of its virtual machine ... I had read that MSXML is now integrated into the VM ... but when I try to load my files ... I get the error (... is not an object !!!)

The definition of the objects looks like:

How can I resolve the problem ...

Éric Riblair, Agronome (eriblair@mediom.qc.ca)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990420/ff18c074/attachment.htm

From jborden at mediaone.net Tue Apr 20 22:29:44 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:11:25 2004
Subject: How to use binary data with XML?
Message-ID: <08a001be8b6b$6a934330$0b2e249b@fileroom.Synapse>

Roger L. Costello wrote:

>I have a few questions on using binary data (e.g., gif, jpeg images)
>with XML.
>
>(1) Binary data isn't actually put in an XML file, correct? i.e., the
>binary data is _not_ inline, correct? An XML file just contains ASCII
>text, correct?

Binary data can be embedded inline using base64 encoding.
I have placed a demo of a MIME -> XML converter, XMTP, which demonstrates this at test-xmtp@jabr.ne.mediaone.net. If you send an E-mail with a binary image attachment, it will convert the multipart MIME message into XML (with inline binary parts) and E-mail you back the response.

>
>(2) Just like in HTML, binary data is _referenced_ by the XML document,
>correct?
>

This article discusses uses of binary data and XML. http://www.xml.com/xml/pub/98/07/binary/binary.html

>(3) Is this the correct way of using binary data:
>
>DTD:
>
>           src ENTITY #REQUIRED
>           desc CDATA #IMPLIED>
>
>XML:
>
>           desc="Map of Boston"/>
>
>Here we see an XML document _referencing_ a file containing binary data.
>The binary data is not actually inline. Is this how it's done? /Roger
>

With src refers to a link to an external binary data object. It is the job of the software which traverses the link to get you the data. The URL http://www.maps.com/boston.gif will be identified by its NOTATION internally as well as via its MIME Content-Type: when the link is traversed via the HTTP protocol.

Jonathan Borden
http://jabr.ne.mediaone.net

From andrewl at microsoft.com Tue Apr 20 22:34:53 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:11:25 2004
Subject: Use of Tags
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF36E@RED-MSG-08>

Regarding use of elements versus attributes, Andy Dent wrote "The path that Microsoft seem to be following with XML-Data is to use elements ...
My single biggest problem with this is the reuse of elements within other elements - you can't define an element with local 'scope'. What happens when Amount is an i2 in one context and a float in another?"

At http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html you'll find a description of a style of using XML in which attributes play a major role, specifically to avoid the problem you mention with local scope.

This particular style is designed for representing graphs of typed objects in named relations using currently-available tools and technology. If Microsoft's advocacy of this seems less than dogmatic, it is because other contexts may reasonably call for other styles.

Best wishes,
Andrew Layman
Architect
Microsoft

From david-b at pacbell.net Tue Apr 20 22:58:15 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun 7 17:11:25 2004
Subject: Linking DTD's
References: <9A4DF69E3C5ED211B86400A0C9D17760BB88C8@thor.operations.bluestone.com>
Message-ID: <371CEA34.4F969A05@pacbell.net>

"Campana, Sal" wrote:
>
> Is it possible to nest DTD's? I would like to build a DTD from smaller
> DTD's...Is this possible?
>
> Can someone send me an example or a location which contains an example of
> this...

http://www.w3.org/TR/xhtml-modularization/
From david-b at pacbell.net Tue Apr 20 23:04:42 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun 7 17:11:26 2004
Subject: How to use binary data with XML?
References: <01BED2BE.3B707EC0@EAMON> <371CB752.DA7CA866@mitre.org> <371CD0B0.3BE941A3@locke.ccil.org>
Message-ID: <371CEBB6.BFED96C@pacbell.net>

John Cowan wrote:
>
> > (2) Just like in HTML, binary data is _referenced_ by the XML document,
> > correct?
>
> Typically. Nothing prohibits you from encoding binary data as
> Base64 and incorporating it into an XML document, but nobody has
> converters for that yet either.

Actually, I've seen bunches of those; there's no difficulty in writing them. Just define a way to tag the binary data as whatever you wanted ... maybe something like

    ... Base 64 encoded content ...

There's an article on www.xml.com from last year sometime on this topic; I don't recall how technical it is, you should be able to make something like the above element work quite easily.

Some folk prefer such "inlined" data for its improved latency; embedding references might lead to better storage structures.

- Dave
From mrc at allette.com.au Wed Apr 21 00:34:33 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 17:11:26 2004
Subject: Linking DTD's
References: <9A4DF69E3C5ED211B86400A0C9D17760BB88C8@thor.operations.bluestone.com>
Message-ID: <371D00D0.41C7464D@allette.com.au>

"Campana, Sal" wrote:

> Is it possible to nest DTD's? I would like to build a DTD from smaller
> DTD's...Is this possible?

Yes, though you might have an easier time getting your head around it if you look at it from the other direction. Rather than building a DTD out of blocks, you might think of it as replacing blocks in an existing structure, which has the effect of enabling or disabling other blocks. We used this approach when we did the Australian CALS suite of 27 DTDs - start with what is essentially a library of declarations and relationships and then use the DTDs to specify variations from that structure.

As David M indicated, this is a bear to manage and requires parameter entities to be used throughout, though we overcame this by writing an application that generated normalised DTDs for applications and humans.

-- 
Regards,
Marcus Carr email: mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia) www: http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
- Einstein
From chris at w3.org Wed Apr 21 01:02:13 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun 7 17:11:26 2004
Subject: How to use binary data with XML?
References: <01BED2BE.3B707EC0@EAMON> <371CB752.DA7CA866@mitre.org>
Message-ID: <371D0606.13226E43@w3.org>

"Roger L. Costello" wrote:
>
> I have a few questions on using binary data (e.g., gif, jpeg images)
> with XML.
>
> (1) Binary data isn't actually put in an XML file, correct? i.e., the
> binary data is _not_ inline, correct?

Not usually, although it is arguable (in terms of document object models) that notation does exactly that; however in practice no, you don't physically put jpeg-compressed bytes into the middle of your document instance.

> An XML file just contains ASCII text, correct?

No, an XML file contains Unicode text, which is one reason why literal insertion of binary data would be difficult.

>
> (2) Just like in HTML, binary data is _referenced_ by the XML document,
> correct?

That is the most sensible way to do it. It allows independent revision and independent re-use, and facilitates caching.

>
> (3) Is this the correct way of using binary data:
>
> DTD:
>
>           src ENTITY #REQUIRED
>           desc CDATA #IMPLIED>
>
> XML:
>
>           desc="Map of Boston"/>
>

It's one way. It works in some viewers. It's rather sub-HTML, though - there is no content, not even a measly alt attribute, let alone actual structured text alternatives.

-- 
Chris
From xml at 0000000.com Wed Apr 21 02:54:48 1999
From: xml at 0000000.com (xml)
Date: Mon Jun 7 17:11:26 2004
Subject: How to use binary data with XML?
Message-ID: <199904210227.TAA09712@0000000.com>

Hmm... so when Sun talks about serializing objects in XML (such as in http://www.javasoft.com/xml/ncfocus.html), they must be talking about storing binary data as well.

Using mime attachments and such is so inefficient... would they really opt for storing binary object attributes using text-encoding approaches?

Given a decent editor, such as the ones in NeXTStep/Mac OS X Server, you can copy/paste binary data from XML files trivially, while in Windows and standard Unix you're pretty much stuck with editing in either text or binary modes. Since I use the former, I'm used to seeing binary and text mix and if the binary was inside of a (predictable) XML tag that would be just dandy!

Granted on other platforms it would be a problem to edit such files, but should file editing concerns dictate how efficiently data is stored?

Thomas

From jborden at mediaone.net Wed Apr 21 03:41:13 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:11:26 2004
Subject: How to use binary data with XML?
In-Reply-To: <199904210227.TAA09712@0000000.com>
Message-ID: <002601be8b96$f6997540$1b19da18@ne.mediaone.net>

xml wrote:

> Using mime attachments and such is so inefficient... would they really
> opt for storing binary object attributes using text-encoding approaches?

MIME parts can be and often are binary. For example when you view a GIF or JPEG image on the Web (i.e. using HTTP), they are transmitted quite efficiently as a MIME message of Content-Type: image/jpeg and Content-transfer-encoding: binary. It is SMTP protocol that results in transmission with a Content-transfer-encoding: base64 though ESMTP can handle binary data as well.

>
> ...I'm used to seeing binary and text mix and
> if the binary was inside of a (predictable) XML tag that would be
> just dandy!
>

Dandy, but something else than XML... this can be your own private data format. Oh darn that inconvenient spec :-)

Jonathan Borden
http://jabr.ne.mediaone.net

From jborden at mediaone.net Wed Apr 21 04:02:54 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:11:26 2004
Subject: Use of Tags
In-Reply-To: <5BF896CAFE8DD111812400805F1991F708AAF36E@RED-MSG-08>
Message-ID: <002701be8b9a$063ba830$1b19da18@ne.mediaone.net>

Andrew Layman wrote:

> Regarding use of elements versus attributes, Andy Dent wrote "The path that
> Microsoft seem to be following with XML-Data is to use elements ... My
> single biggest problem with this is the reuse of elements within other
> elements - you can't define an element with local 'scope'.
> What happens when Amount is an i2 in one context and a float in another?"
>
> At http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html you'll find
> a description of a style of using XML in which attributes play a major role,
> specifically to avoid the problem you mention with local scope.
>

This is all very reasonable. I think I'm missing something here in regard to what you mean by the term 'local scope'. Would this problem be solved by the use of the dt:dt attribute e.g.

    34
    34

In fact, this is my biggest problem with using attributes to hold property values: how ought these values be themselves typed?

Jonathan Borden
http://jabr.ne.mediaone.net

From paul at prescod.net Wed Apr 21 04:20:00 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:26 2004
Subject: How to use binary data with XML?
References: <199904210227.TAA09712@0000000.com>
Message-ID: <371D2F51.8F797E5E@prescod.net>

xml wrote:
>
> Granted on other platforms it would be a problem to edit such files,
> but should file editing concerns dictate how efficiently data is
> stored?

YES! Completely. It is precisely ease of text editing on common platforms that makes XML so popular. Otherwise XML could be completely binary and much more compact.
-- 
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News

From Matthew.Sergeant at eml.ericsson.se Wed Apr 21 10:49:37 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:11:26 2004
Subject: How to use binary data with XML?
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A17DE@eukbant101.ericsson.se>

> -----Original Message-----
> From: David Brownell [SMTP:david-b@pacbell.net]
>
> John Cowan wrote:
> >
> > > (2) Just like in HTML, binary data is _referenced_ by the XML document,
> > > correct?
> >
> > Typically. Nothing prohibits you from encoding binary data as
> > Base64 and incorporating it into an XML document, but nobody has
> > converters for that yet either.
>
> Actually, I've seen bunches of those; there's no difficulty in
> writing them. Just define a way to tag the binary data as
> whatever you wanted ... maybe something like
>
>     ... Base 64 encoded content ...
>

I used when I implemented this - I seem to recall that was suggested by Tim Bray... (sorry Tim if it wasn't you!). I'll try and dig out the email where he (?) mentioned that this might be one of the things considered for XML 2.

Matt.
From ldodds at ingenta.com Wed Apr 21 11:32:24 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun 7 17:11:26 2004
Subject: DOM - Creating Documents
In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB414@WWMESS3.172.19.125.2>
Message-ID: <001501be8bd9$e9f898a0$ab20268a@pc-lrd.bath.ac.uk>

> Yes it would be nice if there were a standard interface, but in
> the meantime you could try using the SAXON one. You can use it independently
> of the rest of SAXON.

I'm probably going to use OpenXML as that provides the functionality I need. But I still maintain that the creation of a Document should be in the DOM spec - granted that an empty document is not valid XML, but that in itself doesn't stop its addition, it's just something to think about.

Perhaps the Document object must be created from a DocumentFragment (say) or other Node - ensuring that some initial content must be specified, and throw an exception if this content is (in itself) invalid. e.g.

    Document createDocument(Node initialContent);

It just seems like an obvious omission, and I've yet to come across any information as to why it's not provided. I'm really interested to hear what people think (not that I seriously expect any changes to the spec from just my suggestion, but curiosity being what it is...).

L.
From msabin at cromwellmedia.co.uk Wed Apr 21 13:18:58 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:11:26 2004
Subject: DOM - Creating Documents
Message-ID:

Leigh Dodds wrote,
> Why doesn't the Document Object Model have a
> createDocument method to allow the creation of a new
> Document instance?

The reason is that the DOM's interfaces were defined in CORBA IDL, the intention being to ensure language neutrality.

The problem is that, in the absence of the whole CORBA infrastructure, there's no way of getting hold of a concrete instance of a CORBA interface (unless you've already got one that defines a factory method ... but then we're on the first step of an infinite regress).

So, there's no DOM-vendor-independent mechanism for document creation at the mo' ... there _might_ be in Level 2, but there's a lot of tricky issues that need to be resolved before that can happen.

Cheers,

Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)181 410 2230               London, W6 0LJ
msabin@cromwellmedia.co.uk        England
From ldodds at ingenta.com Wed Apr 21 13:41:17 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun 7 17:11:26 2004
Subject: DOM - Creating Documents
In-Reply-To:
Message-ID: <002501be8beb$eae1e200$ab20268a@pc-lrd.bath.ac.uk>

> The problem is that, in the absence of the whole
> CORBA infrastructure, there's no way of getting hold of a
> concrete instance of a CORBA interface (unless you've
> already got one that defines a factory method ... but
> then we're on the first step of an infinite regress).

Erm, so you're saying that

    interface Document : Node {
        Document createDocument() raises (DOMException);
    }

isn't valid CORBA syntax and so that's why it's not been included?

Surely the use of IDL is to provide a clear, language-neutral specification, not to limit that specification in any way or tie it to a CORBA infrastructure. The OpenXML parser, and the SAXON interface mentioned in an earlier post, both manage to create new Document instances, so obviously there is a means of doing it.

Are potential implementation difficulties actually driving the format of the specification (i.e. the interface)? Seems a roundabout way of doing it if so. Perhaps I'm missing something.

L.
From msabin at cromwellmedia.co.uk Wed Apr 21 13:57:35 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:11:26 2004
Subject: DOM - Creating Documents
Message-ID:

Leigh Dodds wrote,
> Miles Sabin wrote,
> > The problem is that, in the absence of the whole
> > CORBA infrastructure, there's no way of getting hold of
> > a concrete instance of a CORBA interface (unless
> > you've already got one that defines a factory method
> > ... but then we're on the first step of an infinite
> > regress).
>
> Erm, so you're saying that
>
> interface Document : Node {
>     Document createDocument() raises (DOMException);
> }
>
> isn't valid CORBA syntax and so that's why it's not been
> included?

No, that's perfectly valid CORBA IDL. But now tell me how you get hold of a concrete instance of Document so you can call that factory method? If you've already got one, then you've already solved the problem. If you haven't, then Document.createDocument() only defers the problem by one step.

Without CORBA's object lifecycle service there's no way of having the equivalent of,

    interface Document : Node {
        Document(); // constructor ... not valid IDL
        // etc.
    }

We could have a DocumentFactory interface that looked like this,

    interface DocumentFactory {
        Document createDocument();
    }

But then how do we get hold of a concrete instance of this factory interface so that we can call the method? We can't.

Cheers,

Miles

--
Miles Sabin Cromwell Media Internet Systems Architect 5/6 Glenthorne Mews +44 (0)181 410 2230 London, W6 0LJ msabin@cromwellmedia.co.uk England
From msabin at cromwellmedia.co.uk Wed Apr 21 15:01:32 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:11:26 2004
Subject: DOM - Creating Documents
Message-ID:

Richard Anderson wrote,
> Miles Sabin wrote,
> > But then how do we get hold of a concrete instance
> > of this factory interface so that we can call the
> > method? We can't.
>
> I think W3C spec is 100% right in not specifying
> this.

I think the W3C spec is 80% right in not specifying it: language neutrality is a major goal. Unfortunately it's not quite as clear cut as all that, and your code fragments illustrate the problem.

> dim doc as DOMDocument
> set doc = new DOMDocument

> DOMDocument *pDoc;
> pDoc = createNewDOMDocument();

> CComPtr spDoc;
> spDoc.CreateInstance(...)

> Any reusable code starts from the point where it is
> passed a DOMDocument pointer or reference.

This code isn't vendor neutral: it relies on a DOM vendor specific (MS I presume) Document creation mechanism. Vendor neutrality is another important goal, and that's not addressed by the Level 1 spec.

Cheers,

Miles

--
Miles Sabin Cromwell Media Internet Systems Architect 5/6 Glenthorne Mews +44 (0)181 410 2230 London, W6 0LJ msabin@cromwellmedia.co.uk England
From paul at prescod.net Wed Apr 21 15:25:55 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:26 2004
Subject: DOM - Creating Documents
References:
Message-ID: <371DCE76.264D5147@prescod.net>

Miles Sabin wrote:
>
> We could have a DocumentFactory interface that looked
> like this,
>
> interface DocumentFactory
> {
>     Document createDocument();
> }
>
> But then how do we get hold of a concrete instance of
> this factory interface so that we can call the method?
> We can't.

The non-standard line of code to get a single, reusable DocumentFactory would be invoked once per program. The code to generate new documents could be in many different places and might be invoked over and over again.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News
From msabin at cromwellmedia.co.uk Wed Apr 21 15:42:00 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:11:26 2004
Subject: DOM - Creating Documents
Message-ID:

Richard Anderson wrote,
> With COM you can write code like this to create a
> Microsoft DOM document:
>
> dim doc as DOMDocument
> set doc = CreateObject("MSXML")
>
> You can also write code to use our parser:
>
> dim doc as DOMDocument
> set doc = CreateObject("VCXML")
>
> If I just put this string in the registry and read it
> at the start of my program, I can then change the XML
> parser vendor without a recompile, provided the same
> interfaces are supported.

Agreed, and what'd be nice would be a language by language generalization of this sort of mechanism, ideally to move the binding to a DOM implementation out of the code and into the environment. This can be done for COM as you've demonstrated, it can also be done for Java, and probably for any language/platform which supports dynamic loading.

The crucial thing is that all COM DOMs use the same mechanism (including some agreement on Win registry keys), all the Java DOMs do likewise (including the specification of a Java system property), and ditto for all the other implementation languages which can support such a mechanism. If that's going to happen at all it'll have to happen under the auspices of the DOM WG.

Note tho', that this involves stepping outside of IDL to give us an entry point to bootstrap into the main DOM interfaces.
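
For the Java case, the mechanism described above might look roughly like the sketch below. Everything in it is an assumption made for illustration: the DocumentFactory interface is the hypothetical one from earlier in the thread, the property name dom.documentFactory is invented, and the default implementation simply delegates to the JDK's javax.xml.parsers classes, which are not part of DOM Level 1.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Entry point that binds to a DOM implementation named in the environment.
class DomBootstrap {
    // Hypothetical property name; a real one would have to be agreed by all vendors.
    static final String PROPERTY = "dom.documentFactory";

    // Load whichever factory class the system property names, falling back to a default.
    static DocumentFactory loadFactory() {
        String className = System.getProperty(PROPERTY, JaxpDocumentFactory.class.getName());
        try {
            return (DocumentFactory) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // Application code only ever sees the DocumentFactory and Document interfaces.
        Document doc = loadFactory().createDocument();
        doc.appendChild(doc.createElement("greeting"));
        System.out.println(doc.getDocumentElement().getTagName());
    }
}

// Hypothetical factory interface, as discussed in the thread.
interface DocumentFactory {
    Document createDocument();
}

// One possible "vendor" implementation, here backed by the JDK's own parser classes.
class JaxpDocumentFactory implements DocumentFactory {
    public Document createDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no usable DOM implementation", e);
        }
    }
}
```

Switching vendors would then mean starting the program with a different -Ddom.documentFactory=... value rather than editing and recompiling, which is the same trade that the registry string makes in the COM example.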
Cheers,

Miles

--
Miles Sabin Cromwell Media Internet Systems Architect 5/6 Glenthorne Mews +44 (0)181 410 2230 London, W6 0LJ msabin@cromwellmedia.co.uk England

From msabin at cromwellmedia.co.uk Wed Apr 21 15:44:26 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:11:26 2004
Subject: DOM - Creating Documents
Message-ID:

Paul Prescod wrote,
> The non-standard line of code to get a single, reusable
> DocumentFactory would be invoked once per program.

Indeed, but one line is one too many. It means that to configure a system for a different DOM implementation you have to edit source code and recompile.

Cheers,

Miles

--
Miles Sabin Cromwell Media Internet Systems Architect 5/6 Glenthorne Mews +44 (0)181 410 2230 London, W6 0LJ msabin@cromwellmedia.co.uk England
From cowan at locke.ccil.org Wed Apr 21 16:04:48 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:26 2004
Subject: DOM - Creating Documents
References: <002501be8beb$eae1e200$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <371DDAD6.B44F092F@locke.ccil.org>

Leigh Dodds wrote:

> Erm, so you're saying that
>
> interface Document : Node {
>     Document createDocument() raises (DOMException);
> }
>
> isn't valid CORBA syntax and so that's why it's not been
> included?

Not at all. But what such a method would do is to allow you to create a new Document, *given an existing Document*. It doesn't help unless you already have a Document to apply the method to. IDL has no concept of "static" or "class" methods.

A plausible alternative would be to put createDocument() into the DOMImplementation interface, but this just pushes off the problem: how does one get an instance of DOMImplementation? This is the "infinite regress" Miles mentioned.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
From cowan at locke.ccil.org Wed Apr 21 16:11:12 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:27 2004
Subject: DOM - Creating Documents
References: <371DCE76.264D5147@prescod.net>
Message-ID: <371DDC56.E4AB46EC@locke.ccil.org>

Paul Prescod wrote:

> The non-standard line of code to get a single, reusable DocumentFactory
> would be invoked once per program. The code to generate new documents
> could be in many different places and might be invoked over and over
> again.

Unless, of course, you are clever enough to encapsulate it so it only has to appear once --- as is standard practice in dealing with nonstandard extensions anyway.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From clovett at microsoft.com Wed Apr 21 16:26:07 1999
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun 7 17:11:27 2004
Subject: MSXML uses with new IE5 VM ... forget the previous message ...
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C0F3627DD@RED-MSG-56>

Éric, why don't you try setting the width and height of the applet to 100% and 50 respectively, and then tell me whether you see a gray box (class not found by VM), a green box (applet loaded ok), or a red box (some problem loading the XML), and what the message inside is. Thanks.

[Chris Lovett]

-----Original Message-----
From: Éric Riblair [mailto:eriblair@mediom.qc.ca]
Sent: Tuesday, April 20, 1999 12:14 PM
To: List XML
Subject: MSXML uses with new IE5 VM ... forget the previous message ...

Please ... Forget the previous message ...

Hi,

I had been using MSXML in my application without problems until I upgraded to IE5 and the latest version of its virtual machine. I have read that MSXML is now integrated in the VM, but when I try to load my files I get the error (... is not an object !!!)

The definition of the objects look like:

How can I resolve the problem ...

Éric Riblair, Agronome ( eriblair@mediom.qc.ca )

From paul at prescod.net Wed Apr 21 16:36:51 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:27 2004
Subject: DOM - Creating Documents
References:
Message-ID: <371DDBCA.6512256F@prescod.net>

Miles Sabin wrote:
>
> Paul Prescod wrote,
> > The non-standard line of code to get a single, reusable
> > DocumentFactory would be invoked once per program.
>
> Indeed, but one line is one too many. It means that to
> configure a system for a different DOM implementation
> you have to edit source code and recompile.

I wasn't very clear in my last message: I meant that for DOM 1.0 it would have been better to require this one line to be nonstandard instead of requiring such a central feature as document creation to be nonstandard.
--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News

From paul at prescod.net Wed Apr 21 17:05:18 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:27 2004
Subject: DOM - Creating Documents
References: <371DCE76.264D5147@prescod.net> <371DDC56.E4AB46EC@locke.ccil.org>
Message-ID: <371DE09D.8760A8C6@prescod.net>

John Cowan wrote:
>
> Unless, of course, you are clever enough to encapsulate it so
> it only has to appear once --- as is standard practice in dealing
> with nonstandard extensions anyway.

Even so, the non-standardness infects all of the code that uses it:

    doc = myCreateDocumentIndirectionDevice( foo );

Not a big deal, but I think that the DOM people still made a minor mistake in the place they chose to stop the "infinite regress." Even if you are calling a wrapper function, it is better to call a wrapper in one place instead of many.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News
From balba at eancol.org Wed Apr 21 17:31:20 1999
From: balba at eancol.org (Bernardo Alba)
Date: Mon Jun 7 17:11:27 2004
Subject: Basic Information about XML
Message-ID:

> I'm new to XML, and I would like to find basic information about XML and
> Electronic Data Interchange.
> Can somebody help me? Where can I search for this kind of information?
>
> Thanks.
>
> VISIT OUR WEBSITE http://www.eancol.org

From andrewl at microsoft.com Wed Apr 21 17:35:32 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:11:27 2004
Subject: Use of Tags
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF37B@RED-MSG-08>

Jonathan Borden asks what "local scope" means. Consider the example

    The Virgin Queen
    Queen Elizabeth

    Queen
    Raleigh

The elements "title" and "subject" have different meanings in the two cases. When the meaning, definition, etc. of a name is determined by its enclosure, that is, when the thing the name denotes is determined by its enclosure, that is "local scope".
From david-b at pacbell.net Wed Apr 21 17:39:46 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun 7 17:11:27 2004
Subject: How to use binary data with XML?
References: <199904210227.TAA09712@0000000.com>
Message-ID: <371DF113.C5246D85@pacbell.net>

xml wrote:
>
> Hmm... so when Sun talks about serializing objects in XML
> (such as in http://www.javasoft.com/xml/ncfocus.html), they must be
> talking about storing binary data as well.

I couldn't find the word "serializing" in that article. However, Sun is certainly investigating standard ways to encode objects in XML. There are two basic approaches: "XML first", where the data format is defined and the problem is how to bind that XML text to some Java object; and "Java first", where the objects are defined first (only for Java), and get archived to XML text using some specialized format.

Clearly, both of those need to address binary data. But that's known to have lots of solutions.

> Given a decent editor, such as the ones in NeXTStep/Mac OS X Server,
> you can copy/paste binary data from XML files trivially, while in
> Windows and standard Unix you're pretty much stuck with editing in
> either text or binary modes.

You can't just insert raw binary data in XML text, as a rule; you must handle situations such as illegal characters (NUL and most control characters, unpaired surrogates, etc) and syntactically significant characters ("&", "<", "]]>") in the data. It needs to be encoded to avoid such situations ... and also to address the fact that not all character encodings support the full repertoire of XML characters the way UTF-8 and UTF-16 do.
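
As a rough illustration of that encoding step, here is a minimal Java sketch that wraps arbitrary octets in base64 before placing them in element content. The element name "blob" and the "encoding" attribute are made up for this example, and java.util.Base64 is a much later JDK class than anything available when this thread was written; treat it as one possible approach rather than a recommended format.

```java
import java.util.Arrays;
import java.util.Base64;

class BinaryInXml {
    // Wrap arbitrary octets in an element whose content is pure base64 text.
    // The base64 alphabet (A-Z, a-z, 0-9, '+', '/', '=') contains no characters
    // that are illegal or syntactically significant in XML character data.
    static String wrap(byte[] data) {
        String encoded = Base64.getEncoder().encodeToString(data);
        return "<blob encoding=\"base64\">" + encoded + "</blob>";
    }

    // Recover the original octets from an element produced by wrap().
    // This is plain string slicing for the sketch, not a real XML parse.
    static byte[] unwrap(String element) {
        int start = element.indexOf('>') + 1;
        int end = element.lastIndexOf("</");
        return Base64.getDecoder().decode(element.substring(start, end));
    }

    public static void main(String[] args) {
        // Bytes that could not appear raw in XML: NUL, a control character,
        // markup characters, the "]]>" sequence and a non-ASCII octet.
        byte[] raw = {0x00, 0x01, '<', '&', ']', ']', '>', (byte) 0xFF};
        String xml = wrap(raw);
        System.out.println(xml);
        System.out.println(Arrays.equals(raw, unwrap(xml)));
    }
}
```

The cost is size: base64 output is about a third larger than the input, which is part of why the choice between inlining data and referencing it externally keeps coming up.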
- Dave

From jborden at mediaone.net Wed Apr 21 19:03:58 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:11:27 2004
Subject: Use of Tags
Message-ID: <098201be8c17$de601d90$0b2e249b@fileroom.Synapse>

Andrew:

>Jonathan Borden asks what "local scope" means. Consider the example
>
>    The Virgin Queen
>    Queen Elizabeth
>
>    Queen
>    Raleigh
>
>The elements "title" and "subject" have different meaning in the two cases.
>When the meaning, definition, etc. of a name is determined by its enclosure,
>that is, when the thing the name denotes is determined by its enclosure,
>that is "local scope".

Ok fine, then my question stands: isn't the dt:dt attribute a perfectly good way to distinguish value types *despite* local scope intentions, e.g.

    The Virgin Queen
    Queen Elizabeth
    12.95

    Queen
    Raleigh
    historical

And if this objection to the use of elements is sufficiently handled in this fashion, are there other pressing reasons why values (particularly table columns) ought to be expressed as attributes rather than elements? To me the most pressing reason to use elements to encode "recordsets" is that in the absence of a schema decl or DTD, the content can be correctly interpreted. For example, when converting to and from SQL, "int" values, though textually represented, are not quoted while character values are, e.g.:

    UPDATE table SET intval = 3, charval = '3'

or

    SELECT * FROM table WHERE intval = 3 OR charval = '3'

Jonathan Borden
http://jabr.ne.mediaone.net
From xml at 0000000.com Wed Apr 21 19:18:14 1999
From: xml at 0000000.com (xml)
Date: Mon Jun 7 17:11:27 2004
Subject: binaries in xml schema?
Message-ID: <199904211851.LAA10790@0000000.com>

Knowing that the data format will be used to store binaries anyway in embedded markets, I have to wonder if there is a formalized way to solve the problems in embedding binary in XML documents. Because it is going to happen... there can't be any question about that.

I understand that a schema mechanism is being added to XML. I don't know anything about it... haven't been able to find anything yet. But, if a schema is available in the file with the XML datasets, several binary datatypes should be able to be called out. This would make the MIME-happy folks happier and help us build xml-compliant consumer electronic blocks that are affordable.

The concept of binary data in XML files has been brought up before countless times and seems to be a hot topic. The XML-server folks don't like it 'coz well... their platform may have half a gigabyte of RAM to play with. The client guys tend to stay quiet 'coz they know better... we're working with platforms that span the range of products from Tamagotchi to Win32. We have to be careful with memory utilization and also need the ability to execute code directly out of a memory pool (in XML).

So... is there any public information about XML-Schema? Any idea if one or more binary formats is called out as a datatype within it?

T
From gtn at eps.inso.com Wed Apr 21 19:19:13 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun 7 17:11:27 2004
Subject: multiple encoding specs (Re: IE5.0 does not conform to RFC2376)
In-Reply-To: <371B84B5.C39EA23C@w3.org>
Message-ID: <000301be8c1a$7281cc60$f8d45dc7@eps.inso.com>

> > a) You have to fix it by parsing a piece of arbitrary syntax, which
> > proxies etc. will most likely not do, for performance reasons.
>
> Now in a different message you were saying that caching the
> results of parsing the encoding declaration was not worth it because the effort
> required to re-parse it each time was minimal. So I don't see how you
> can now have it be a performance hit.

In proxies, the cost/complexity ratio is very different.

> Well in theory yes, but in practice the advantages seem to me to
> outweigh the disadvantages.
>
> If someone cares enough about an XML document that they think
> a changed encoding declaration has destroyed its value (eg, a digitally signed
> transaction encoded in XML) then they don't want any dumb - or even
> smart - proxies merrily changing from UTF-8 to 8859-2 or whatever
> either.

The problem is that you can't assume smart proxies.
From cowan at locke.ccil.org Wed Apr 21 19:40:51 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:27 2004
Subject: Use of Tags
References: <098201be8c17$de601d90$0b2e249b@fileroom.Synapse>
Message-ID: <371E0D6A.C85EBDA0@locke.ccil.org>

Jonathan Borden wrote:

> are there other pressing reasons why values (particularly
> table columns) ought be expressed as attributes rather than elements?

The only place where attributes *must* be used is where naive (i.e. HTML) browsers may be looking at the content, in which case attribute values are typically hidden and #PCDATA content is typically exposed. This is the main rationale for allowing RDF statements to use either attributes or elements.

Attributes are also modestly more compact.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Wed Apr 21 19:46:55 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:27 2004
Subject: binaries in xml schema?
References: <199904211851.LAA10790@0000000.com>
Message-ID: <371E0EED.5AF103B3@locke.ccil.org>

xml wrote:

> Knowing that the data format will be used to store binaries anyway in
> embedded markets, I have to wonder if there is a formalized way to solve
> the problems in embedding binary in XML documents. Because it is going to
> happen... there can't be any question about that.

Binaries will have to be encoded before being embedded, period. Anything else isn't XML. After all, what prevents the binary from containing < or &, never mind any of the illegal characters in the 00-1F range?

> I understand that a schema mechanism is being added to XML. I don't know
> anything about it... haven't been able to find anything yet. But, if a
> schema is available in the file with the XML datasets, several binary datatypes
> should be able to be called out. This would make the MIME-happy folks happier
> and help us build xml-compliant consumer electronic blocks that are affordable.

Sure. With or without schemas, you can tag elements as containing base-64 encodings of any MIME type you want. But there has to be an encoding, not raw binary bits.

> So... is there any public information about XML-Schema? Any idea if one or
> more binary formats is called out as a datatype within it?

You can read the various W3C Notes: XML-Data, DCD, SOX, DDML. None of them tolerate raw binary. Several provide a datatype tag for base-64-encoded octets.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
From xml at 0000000.com Wed Apr 21 21:00:57 1999
From: xml at 0000000.com (xml)
Date: Mon Jun 7 17:11:27 2004
Subject: binaries in xml schema?
Message-ID: <199904212034.NAA10930@0000000.com>

I noticed on the Microsoft site that they're certainly buying into this. There were some XML documents and documentation there for XML-schema types "bin-mime" and "bin-hex" for defining binary data in their xml files.

I can just extend that to "bin-bin" for my purposes I s'pose. I'll just change it later in my codebase when something else gels.

Thanks for all the help and feedback,

T

From xml at 0000000.com Wed Apr 21 21:28:35 1999
From: xml at 0000000.com (xml)
Date: Mon Jun 7 17:11:27 2004
Subject: XML Datatypes Reference
Message-ID: <199904212101.OAA11005@0000000.com>

Two people already asked me for info about the XML Schema types I noticed Microsoft is using. So... here is the link:

http://msdn.microsoft.com/xml/schema/reference/datatypes.asp

The datatypes I noticed relevant to our discussion were:

    bin.base64    MIME-style Base64 encoded binary BLOB
    bin.hex       Hexadecimal digits representing octets.

T
From hchen at netscape.com Wed Apr 21 23:35:42 1999
From: hchen at netscape.com (Hseu-Ping Chen)
Date: Mon Jun 7 17:11:27 2004
Subject: A new XML star seems to get it
Message-ID: <371E4768.4359EA8E@netscape.com>

Hi,

I'm a senior software engineer in Netscape's E-Commerce division. Part of my job is to constantly review new technology and newly emerging E-Commerce sites. Your articles have helped me a lot. Now I think I can give you some feedback, pointing out some good ideas and good technology that you seem to have missed on your radar screen.

Take a close look at the site http://www.upyp.com, which has portal-style user-friendliness, but whose catalog is equipped with the best industrial power I have ever seen: keyword plus parametric search, allowing me to precisely locate my desired product, among hundreds of vendors and thousands of products, in a flash.

Most important, they have solved the headache of business-to-business catalog interoperability, allowing automatic server-to-server catalog exchange, all in XML, and it is working now, unlike most other competitors, whose claims of XML compliance are mostly hype. To my amazement, they solved the problem in such a straightforward way that everybody can easily accept it and hook their catalogs into it. This is an Aha solution to me.

This site is not very popular yet. But certainly it is a star on the horizon, with the right vision, powerful technology, and truly useful services.
- Hseu-Ping Chen
hchen@netscape.com
Senior Software Engineer
E-Commerce Division, Netscape
-----------------------------------------------------------------------
650-937-6152, hchen@netscape.com
-----------------------------------------------------------------------

From dent at oofile.com.au Thu Apr 22 01:28:54 1999
From: dent at oofile.com.au (Andy Dent)
Date: Mon Jun 7 17:11:27 2004
Subject: How to use binary data with XML?
In-Reply-To: <371CEBB6.BFED96C@pacbell.net>
References: <01BED2BE.3B707EC0@EAMON> <371CB752.DA7CA866@mitre.org> <371CD0B0.3BE941A3@locke.ccil.org>
Message-ID:

At 5:03 +0800 21/4/99, David Brownell wrote:
>Some folk prefer such "inlined" data for its improved latency;
>embedding references might lead to better storage structures.

There's also the issue of using XML for a document format (such as we are working on with our report writer) where we MUST produce a single document and if the user embedded images in the original they will have to become base64-encoded elements.

Andy Dent BSc MACS AACM, Software Designer, A.D. Software, Western Australia
OOFILE - Database, Reports, Graphs, GUI for c++ on Mac, Unix & Windows
PP2MFC - PowerPlant->MFC portability
http://www.oofile.com.au/
From jborden at mediaone.net Thu Apr 22 02:17:33 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:11:28 2004
Subject: Use of Tags
Message-ID: <0ab401be8c54$7096dff0$0b2e249b@fileroom.Synapse>

I have been asked to clarify this paragraph:

> And if this objection to the use of elements is sufficiently handled in
>this fashion, are there other pressing reasons why values (particularly
>table columns) ought be expressed as attributes rather than elements? To me
>the most pressing reason to use elements to encode "recordsets" is that in
>the absence of a schema decl or DTD, the content can be correctly
>interpreted. For example when converting to and from SQL, "int" values
>though textually represented are not 'quoted' while character values are
>e.g.: UPDATE table SET intval = 3, charval='3' or
> SELECT FROM table WHERE intval = 3 OR charval = '3'

I have recently developed a system to insert, update and query relational table schemas using XML documents and XSL transforms. This allows storage of XML documents in relational tables which are designed to efficiently model the underlying data hierarchy (that is, a single XML document maps to perhaps many inserts, updates and selects on the underlying relational db; this technology is very similar to data shaping but is entirely based upon XML and XSL transforms). I have developed this for our medical records system; for those of you who are aware of the HL7 V3 object model, you are aware that it is a complex network of relationships.
Patient demographics and other information is modelled as a graph which is serialized as XML for transmission over the web. The client side uses a cluster of transformations to map this graph onto various forms and web entry screens; the server side uses transformations to map this graph onto the underlying relational database model (the HL7 model).

So in doing this, properly and efficiently representing relational data as XML is very important. I would love to use attributes whenever possible for efficiency reasons, yet I appear to require elements for practical reasons. I've read Andrew Layman's article on serializing graphs, and the recommendation that properties/arcs be attributes, yet I'm not sure how to type these values in a way that my XSL processor can make sense of.

Suppose I encode a recordset in the following fashion:

In the absence of a schema which describes the recordset, how do I apply a type to 'a="3"'? Does this mean the string "3" or the number 3? For example,

3 ... 42

The value types are unambiguous.

What I am getting at in regard to using SQL (which is a common task when dealing with recordsets) is this: if you desire to create an XSL transform which converts recordsets into SQL statements, e.g. to INSERT a recordset into a table, you need to know what the datatype is. Both XML and SQL are text based, but SQL has a different syntax for values which are intended to represent character strings vs. those which are intended to represent numbers, e.g.:

INSERT tab (a,d) VALUES(3,'42')

is the correct syntax whereas:

INSERT tab (a,d) VALUES('3','42')

generates an error if column 'a' is typed as an int.

so 3 is generated as

Jonathan Borden
http://jabr.ne.mediaone.net
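The quoting problem described above can be shown in a few lines of Java. This is only a sketch under assumptions: it is not the XSL transform from the message, the class and method names are hypothetical, and it assumes the datatype of each value is already known (here carried by the Java type of the value). It uses the standard `INSERT INTO` spelling.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class SqlInsertSketch {
    // Render a value as a SQL literal: numbers bare, everything else quoted.
    static String literal(Object value) {
        if (value instanceof Number) {
            return value.toString();
        }
        // Double any embedded single quote, the standard SQL escape.
        return "'" + value.toString().replace("'", "''") + "'";
    }

    // Build an INSERT statement from column name -> typed value pairs.
    static String insert(String table, Map<String, Object> row) {
        StringJoiner cols = new StringJoiner(",");
        StringJoiner vals = new StringJoiner(",");
        for (Map.Entry<String, Object> e : row.entrySet()) {
            cols.add(e.getKey());
            vals.add(literal(e.getValue()));
        }
        return "INSERT INTO " + table + " (" + cols + ") VALUES(" + vals + ")";
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("a", 3);     // known to be a number
        row.put("d", "42");  // known to be a character string
        System.out.println(insert("tab", row));
    }
}
```

The point of the sketch is that `literal` cannot be written without type information: given only the text `3`, it has no way to choose between `3` and `'3'`. In production code a parameterized statement would normally replace string building.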
From james at xmltree.com Thu Apr 22 10:08:42 1999
From: james at xmltree.com (james@xmlTree.com)
Date: Mon Jun 7 17:11:28 2004
Subject: No subject
Message-ID: <009701be8c97$3f1fc6b0$0400a8c0@fourleaf.com>

From: Hseu-Ping Chen
To:
Sent: Wednesday, April 21, 1999 10:47 PM
Subject: A new XML star seems to get it

> The most important, they solved the headache of
> business-to-business catalog interoperability, allowing automatic
> server-to-server catalog exchange, all in XML, and it is working
> now! Unlike most other competitors claiming XML compliant but are

Dear UPYP team,

I am very interested in your XML-based ecommerce product, and in particular how it is possible to import and export the catalog using XML. Do you have any plans to provide a public XML interface to UPYP.com?

If you offered a public interface to the contents of UPYP, it would be possible to increase the exposure of the products of vendors using UPYP by allowing affiliated sites of UPYP to build specialised marketplaces in niche areas. For example, a camera enthusiast's site could offer access to the digital camera section of UPYP and channel customers through, as Amazon does with books and its affiliate programs.

An example of this database syndication is at http://www.xmltree.com/resource/detail.cfm/ContainerID/21/ResourceID/121.
In this case, Allaire (which owns a database of Cold Fusion custom tags) has opened up the database using WDDX (an XML language), and now other developers are building their own interface to it (the original database is at http://www.allaire.com/developer/gallery/, and a 3rd party interface to the same data is at http://w3.i-us.com/).

Hope that this stirs up some ideas.

Best regards

James Carlyle
james@xmltree.com
Wavefront Limited
70 Acton Street
London WC1X 9NB
UK
(44) 171 813 0665
www.xmltree.com - directory of XML content on the web

From rmartin at uq.net.au Thu Apr 22 12:21:24 1999
From: rmartin at uq.net.au (Robyn & John)
Date: Mon Jun 7 17:11:28 2004
Subject: XML Performance question
References: <87256758.0067D35F.00@d53mta03h.boulder.ibm.com>
Message-ID: <371EF793.D0186BF5@uq.net.au>

roddey@us.ibm.com wrote:
> Actually, IMHO, performance is critical, and its really hard to make a
> fully conforming parser that is flexible, maintainable, and fast. Partly
> its because some of the kind of arcane XML rules were not written with
> performance in mind particularly. I think that performance matters as much
> on the small end of the spectrum as the large as well.

I agree. We have a pluggable serialisation system for passing Java objects across CORBA interfaces. The guys dumped the XML implementation after discovering deserialisation was 10 times as fast in the Java serialisation version. We tried IBM's latest version, and Sun's XML-TR1.

John
From rbourret at ito.tu-darmstadt.de Thu Apr 22 13:57:36 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:11:28 2004
Subject: DOM - Creating Documents
Message-ID: <01BE8CC7.D3A3CAE0@grappa.ito.tu-darmstadt.de>

John Cowan wrote:
> Unless, of course, you are clever enough to encapsulate it so
> it only has to appear once --- as is standard practice in dealing
> with nonstandard extensions anyway.

OK, let's assume I am that clever -- a topic always open to debate. It looks like I've got the following options for creating DOM documents in Java. Comments/questions:

1) Docuverse requires you to pass a string, presumably the root element type. I assume the reason for this is so that the returned Document will represent a well-formed document. None of the other implementations appear to require this.

2) Is it really not possible to create a new Document with IBM's classes? I can't find a method for doing it.

3) Am I missing any Java DOM packages?

-- Ron Bourret

DOM Document Creation Methods
-----------------------------

Docuverse:
    DOM.createDocument(String type) // returns Document
    DOMFactory.createDocument(DOM dom, String type) // interface. implemented by DefaultFactory, PrototypeFactory, HTMLFactory

IBM (xml4j):
    not possible?

Microsoft/Datachannel:
    com.datachannel.xml.om.Document.Document() // implements Document

OpenXML:
    DOMFactory.createDocument(Class docClass)
    DOMFactory.createXMLDocument()

Oracle:
    XMLDocument() // implements Document
    NodeFactory.createDocument() // returns XMLDocument

Sun:
    XMLDocument() // implements Document
From napoli at wwa.com Thu Apr 22 14:49:25 1999
From: napoli at wwa.com (S. Kritikos)
Date: Mon Jun 7 17:11:28 2004
Subject: DTD for Virtual Communities
Message-ID:

Hi

Since late 1996 I have been maintaining a mailing list with information about virtual communities (MOOs, mailing lists, portals etc). It would be greatly appreciated if you could send me information about any DTDs that you know regarding social studies. I am trying to assess whether developing a DTD for virtual communities is a good idea and potentially useful.

Since I am new to XML and to this list, I apologize ahead of time if this post goes against the spirit of this list or the question has been asked before.

Regards
Sam Kritikos
__________________________________________________________
The Virtual Community Mailing List
http://admin.gnacademy.org:8001/uu-gna/text/vc/gna-vc.html
__________________________________________________________

From north at Synopsys.COM Thu Apr 22 16:41:01 1999
From: north at Synopsys.COM (Simon North)
Date: Mon Jun 7 17:11:28 2004
Subject: ForeHelp and XML
Message-ID: <199904221423.QAA28845@goofy.gr05.synopsys.com>

This is quite possibly off-topic, but I found it interesting enough to pass on.
For those who don't know it, ForeFront make a software package called ForeHelp for creating Windows Help and HTML-based Help. Their latest version is promoting a new implementation called InterHelp that (using Java applets) is equally viewable in Netscape and Internet Explorer.

Check out the demo at: http://www.ff.com/Examples/InterHelp/helpset.HTM and you will see a reasonable HTMLHelp implementation.

Now, check out the base directory address in IE5 http://www.ff.com/Examples/InterHelp/ .... and you will see an XML file. It even has a DTD.

Does anyone know anything more about this application?

Simon.

Simon North - Technical Writer, Synopsys
Synopsys GmbH, Kaiserstr. 100, 52134 Herzogenrath Germany.
+49 2407 955873 -- north@synopsys.com
Voice mail: +1 415 694 4141 55055
"... play the pious innocent, And for an honest attribute cry out"

From donpark at quake.net Thu Apr 22 17:17:55 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:11:28 2004
Subject: DOM - Creating Documents
References: <01BE8CC7.D3A3CAE0@grappa.ito.tu-darmstadt.de>
Message-ID: <003c01be8cd3$69dafd20$8cc3fea9@w21tp>

Ronald,

> 1) Docuverse requires you to pass a string, presumably the root element
> type. I assume the reason for this is so that the returned Document will
> represent a well-formed document. None of the other implementations appear
> to require this.

Actually, you are attributing me with too much cleverness. Docuverse's DOM.createDocument was modeled after the portion of the DOM Level 1 spec that fell off the wagon before release.

> 2) Is it really not possible to create a new Document with IBM's classes?
> I can't find a method for doing it.

Have you tried the constructor? I don't know much about XML4j but DataChannel's so called Most Advanced XML Parser uses constructor to create new Documents.

Best,

Don Park
Docuverse

From Michael.Kay at icl.com Thu Apr 22 17:49:22 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:11:28 2004
Subject: DOM - Creating Documents
Message-ID: <93CB64052F94D211BC5D0010A80013310EB427@WWMESS3.172.19.125.2>

Ronald Bourret asked how do you create a Document using various products. For the record, here are the relevant drivers from SAXON:

Datachannel:

    public Document build (InputSource source)
            throws java.io.IOException, org.xml.sax.SAXException {
        com.datachannel.xml.om.Document doc = new com.datachannel.xml.om.Document();
        try {
            if (null != source.getByteStream()) {
                // byte stream supplied
                doc.loadFromInputStream(source.getByteStream());
            } else if (null != source.getCharacterStream()) {
                // character stream supplied [horrible code and not tested!]
                Reader r = source.getCharacterStream();
                StringBuffer sb = new StringBuffer(10000);
                char[] cbuf = new char[10000];
                int bytes = 0;
                while (true) {
                    bytes = r.read(cbuf);
                    if (bytes<0) break; // end of file
                    sb.append(cbuf, 0, bytes);
                }
                doc.loadXML(sb.toString());
            } else {
                // URL supplied
                doc.load(source.getSystemId());
            }
        } catch (Exception e) {
            throw new SAXException(e);
        }
        return doc;
    }

Docuverse:

    public Document build(InputSource source) {
        DOM dom = new DOM();
        dom.setReader(new SAXONFreedomDriver());
        return dom.openDocument(source);
    }

    // inner class

    /**
     * SAXONFreedomDriver
     * Subclasses DOM-SDK's SAXReader class to use a supplied parser
     * This class is used to interface SAXON with Docuverse
     * and is of no direct concern to applications.
     */
    private class SAXONFreedomDriver extends com.docuverse.dom.util.SAXReader {
        protected Parser createParser (com.docuverse.dom.DOM dom) {
            return givenparser;
        }
    }

IBM:

    public Document build (InputSource source)
            throws java.io.IOException, org.xml.sax.SAXException {
        Document doc = null;
        com.ibm.xml.parser.Parser p;
        try {
            p = new com.ibm.xml.parser.Parser(source.getSystemId());
            if (null != source.getByteStream()) {
                doc = p.readStream(source.getByteStream());
            } else if (null != source.getCharacterStream()) {
                doc = p.readStream(source.getCharacterStream());
            } else {
                doc = p.readStream(
                    p.getInputStream(
                        source.getSystemId(), null, source.getSystemId()));
            }
        } catch (Exception e) {
            p = null;
            throw new SAXException(e);
        }
        return doc;
    }

ORACLE:

    public Document build(InputSource source)
            throws java.io.IOException, org.xml.sax.SAXException {
        oracle.xml.parser.XMLParser p = new oracle.xml.parser.XMLParser();
        p.parse(source);
        return p.getDocument();
    }

SUN:

    public Document build(InputSource source)
            throws java.io.IOException, org.xml.sax.SAXException {
        XmlDocumentBuilder b = new XmlDocumentBuilder();
        b.setDisableNamespaces(true);
        givenparser.setDocumentHandler(b);
        givenparser.parse(source);
        return b.getDocument();
    }

Interesting if nothing else for the sheer variety of different ways of achieving the same thing! Note some of these allow you to select your parser first, others don't.

Mike Kay
From jmg at trivida.com Thu Apr 22 18:12:07 1999
From: jmg at trivida.com (Jeff Greif)
Date: Mon Jun 7 17:11:28 2004
Subject: DOM - Creating Documents
References: <01BE8CC7.D3A3CAE0@grappa.ito.tu-darmstadt.de>
Message-ID: <0d8601be8cda$856c2990$a24630d1@trivida.com>

Snarfed from the very useful guide.html that came with the IBM xml4j (version 1.1.9) distribution (the documentation has changed in later versions, and I haven't looked for the replacement):

Creating a new XML document is a three step process. First, create the TXDocument, which is the concrete implementation of Document. Second, create the various XML objects using the TXDocument's create* methods. Finally, use appendChild to append these appropriately to the TXDocument or sub-Elements.

Make a TXDocument instance.

    TXDocument doc = new TXDocument();

Create something, a DTD or root Element.

    Element root = doc.createElement("ROOT");

Append the newly created Element.

    doc.appendChild(root);

Append something to the root Element you have added.

    root.appendChild(doc.createElement("FOO"));

----- Original Message -----
From: Ronald Bourret
To: 'XML-DEV'
Sent: Thursday, April 22, 1999 4:55 AM
Subject: Re: DOM - Creating Documents

> 2) Is it really not possible to create a new Document with IBM's classes?
> I can't find a method for doing it.
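For comparison, the same three steps can be sketched against an implementation-neutral API. The sketch below assumes JAXP (`javax.xml.parsers.DocumentBuilderFactory`), a later standard Java API that is not mentioned anywhere in this thread, in place of the xml4j `TXDocument` class; the class name `NewDocumentSketch` is made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NewDocumentSketch {
    static Document build() {
        Document doc;
        try {
            // Step 1: obtain an empty Document from whatever DOM
            // implementation the factory is configured to use.
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        // Step 2: create nodes with the Document's create* methods.
        Element root = doc.createElement("ROOT");
        // Step 3: append them to the Document or to sub-Elements.
        doc.appendChild(root);
        root.appendChild(doc.createElement("FOO"));
        return doc;
    }

    public static void main(String[] args) {
        Document doc = build();
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Only the first step differs from the xml4j recipe; the `createElement` and `appendChild` calls are plain DOM Level 1 and look the same in either version.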
From ldodds at ingenta.com Thu Apr 22 18:15:40 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun 7 17:11:28 2004
Subject: DOM - Creating Documents
In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB427@WWMESS3.172.19.125.2>
Message-ID: <001101be8cdb$65b5eae0$ab20268a@pc-lrd.bath.ac.uk>

> Interesting if nothing else for the sheer variety of different ways of
> achieving the same thing! Note some of these allow you to select
> your parser first, others don't.

It certainly demonstrates the neatness of a well-defined interface, as all that implementation specific complexity is hidden behind the small signature of the build() method.

But that said, each of the listed methods presupposes the availability of an existing document, correct? Whereas I was initially concerned about the inability to generate a completely *new* document, but it has been demonstrated to me that this involves some implementation specific bootstrapping.

And from the later discussions I gather that all the W3C *could* specify is some standard registry/property keys/values that would feed into this bootstrapping mechanism.

Would that be a fair assessment of the situation?

Cheers,

L.
From Tim.Shaw at wdr.com Thu Apr 22 18:27:03 1999
From: Tim.Shaw at wdr.com (Tim.Shaw@wdr.com)
Date: Mon Jun 7 17:11:28 2004
Subject: DOM - Creating Documents
In-Reply-To: <003c01be8cd3$69dafd20$8cc3fea9@w21tp>
Message-ID:

I don't think Document (which is an interface) has a constructor. I suggest creating a StringReader, with just your 'stub' wf XML document as the String, and then parsing it to create the Document.

This does, of course, imply you need a parser to create a Document - but you need the DOM implementation classes anyway.

This is the approach I took in my 'ParserFactory' - which wraps the various Parsers for testing purposes. It's just a convenience method really, but that seemed like a convenient place to put it :-)

tim
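The stub-parsing approach suggested above might look roughly like this in Java. It is a sketch under assumptions: it uses the later JAXP API rather than the 'ParserFactory' wrapper mentioned in the message (whose code is not shown in the thread), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StubDocumentSketch {
    // Build a Document by parsing a minimal well-formed stub,
    // e.g. "<ROOT/>" for rootName "ROOT".
    static Document fromStub(String rootName) {
        String stub = "<" + rootName + "/>";
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(stub)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = fromStub("ROOT");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

The trade-off is the one the message names: a parser has to be available even though no real input exists, but in exchange the only implementation-specific step is obtaining the parser.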
From mike at DataChannel.com Thu Apr 22 18:53:05 1999
From: mike at DataChannel.com (Mike Dierken)
Date: Mon Jun 7 17:11:28 2004
Subject: DOM - Creating Documents
Message-ID: <8EAE75D3D142D211A45200A0C99B60239354FD@ZEUS>

When the DataChannel XML Document class loads & parses XML text, it uses one 'input scanner' but allows any 'parser' class to be used. The IXMLInputScanner interface is used to read characters from the source data (input stream, text string, etc.). The IXMLDOMParser interface (based on IXMLTokenizer and IXMLParser) is used to build the DOM representation.

The Document class has methods for specifying which IXMLDOMParser to use. Here is the list:

    "com.datachannel.xml.tokenizer.parser.BasicParser" (default)
    "com.datachannel.xml.tokenizer.parser.DTDValidatingParser" (based on BasicParser)
    "com.datachannel.xml.tokenizer.parser.XMLDOMParser" (based on DTDValidatingParser)

Here are the methods in Document for specifying different parsers:

    public void setParserClassName(String parserClassName)
    public IXMLDOMParser createParser()

You can call the 'setParserClassName()' method on the Document class, or extend the class & override the 'createParser()' method (to initialize the parser with more information, for example).

The internal 'load()' method will create the parser, tell it what the document instance is, tell it what the input scanner is and then start the parser on its way. The parser then inserts things it finds into the specified document instance.
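The general pattern described here (a document class that instantiates its parser from a class name, with a factory method that subclasses can override) can be sketched in plain Java. Everything in the sketch is hypothetical: `DomParser`, `BasicParser`, `StrictParser` and `Doc` are stand-ins invented for illustration and do not reproduce the DataChannel classes or their signatures.

```java
class PluggableParserSketch {
    /** Hypothetical stand-in for a DOM-building parser interface. */
    interface DomParser {
        String parse(String text);
    }

    /** Hypothetical default parser. */
    public static class BasicParser implements DomParser {
        public String parse(String text) { return "basic:" + text; }
    }

    /** Hypothetical stricter parser, selected by class name. */
    public static class StrictParser implements DomParser {
        public String parse(String text) { return "strict:" + text; }
    }

    /** Hypothetical document class with a swappable parser. */
    static class Doc {
        private String parserClassName = BasicParser.class.getName();
        private String content;

        void setParserClassName(String name) { parserClassName = name; }

        // Subclasses may override this to configure the parser further.
        DomParser createParser() {
            try {
                return (DomParser) Class.forName(parserClassName)
                        .getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }

        // Create the parser, run it, and store what it produced.
        void loadXML(String text) { content = createParser().parse(text); }

        String content() { return content; }
    }

    public static void main(String[] args) {
        Doc doc = new Doc();
        doc.loadXML("<a/>");
        System.out.println(doc.content());
        doc.setParserClassName(StrictParser.class.getName());
        doc.loadXML("<a/>");
        System.out.println(doc.content());
    }
}
```

Selecting by class name keeps the choice of parser in configuration, while overriding `createParser()` suits cases where the parser needs constructor arguments or extra setup.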
There is some documentation on the xdev.datachannel.com site: http://xdev.datachannel.com/downloads/xjparser/documentation/#properties

Mike D
DataChannel
From Michael.Kay at icl.com Thu Apr 22 19:04:36 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:11:28 2004
Subject: DOM - Creating Documents
Message-ID: <93CB64052F94D211BC5D0010A80013310EB42B@WWMESS3.172.19.125.2>

> But that said, each of the listed methods presuppose the availability
> of an existing document, correct? Whereas I was initially concerned
> about the inability to generate a completely *new* document...

Yes, I was solving a slightly different problem, that of building a DOM document from source XML. But I'm sure you could use the same technique to build a document from scratch.

Mike Kay

From pawars at netscout.com Thu Apr 22 22:11:47 1999
From: pawars at netscout.com (pawars@netscout.com)
Date: Mon Jun 7 17:11:29 2004
Subject: Object Serialization
Message-ID: <8525675B.006ECE52.00@nsismtp1.netscout.com>

Sorry for newbie questions.

I am considering doing some serious implementation work on serialization of C/C++ objects using XML. I found that WDDX, XML-RPC, WIDL could be used for doing much of my stuff.

Are there any other techniques for doing object serialization?
Are there any standardization efforts in this direction?
Thanks,
-Sitaram

From cowan at locke.ccil.org Fri Apr 23 00:10:59 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:29 2004
Subject: ANNOUNCE: xml encoding detector in C
Message-ID: <371F9E60.D8E76182@locke.ccil.org>

I have written an XML encoding detector function in C. It would be easy
to translate it to Java, but I thought that C would be the most useful
in different contexts. It uses only Standard C facilities.

There is a subroutine called "xmlenc" which accepts a FILE* argument and
returns a (static) string representing the encoding. I believe it
handles all the cases in Appendix F correctly, including the EBCDIC one.

There is also a test-harness main program that can generate some sample
files in EBCDIC and 16-bit Unicode (8-bit ASCII-compatible files are
easy to find). This part can be stripped out, as indicated by the
comments, in order to use the routine in some server program.

No copyright, no warranty; I assert the moral right to be known as the
author. Download from http://www.ccil.org/~cowan/XML/xmlenc.c .

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From dave at userland.com Fri Apr 23 00:30:46 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun 7 17:11:29 2004
Subject: Object Serialization
In-Reply-To: <8525675B.006ECE52.00@nsismtp1.netscout.com>
Message-ID: <4.2.0.32.19990422152225.00e1c2a0@mail.userland.com>

This is not a newbie question.

We have a plan for a new version of the XML-RPC spec that embraces WDDX.
We worked it out with Simeon Simeonov, the architect of this stuff at
Allaire, a few months ago, but we haven't implemented the code yet.

Here's a pointer to the discussion:

http://discuss.userland.com/msgReader$2563

I guess the net-net is that you can use either object serialization
protocol and have a pretty good chance of being compatible with the
other at some point not too far down the road.

Dave

At 01:08 PM 4/22/99 , you wrote:
>Sorry for newbie questions.
>
>I am considering a serious implementation of serialization of
>C/C++ objects using XML. I found that WDDX, XML-RPC, WIDL could be
>used for doing much of my stuff.
>
>Are there any other techniques for doing object serialization?
>Is there any standardization effort in this direction?
>
>Thanks,
>-Sitaram

From roddey at us.ibm.com Fri Apr 23 01:22:53 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:29 2004
Subject: Yet another validity question
Message-ID: <8725675B.00804C6F.00@d53mta03h.boulder.ibm.com>

I just now, for whatever reason, got around to noticing that the James
Clark 'valid' tests imply that you can reference an element in a content
model without its ever having been declared, for instance:

]>

This seems like it would be fine if you were doing WF tests and were
parsing the DTD only because it was there and you needed to get through
it to find entities and such. But when checking for validity this seems
a bit lax.

I didn't see anything in the spec that explicitly says "you must (or
don't have to) declare all elements referenced in content models",
though I certainly could have missed it like so many other things. But
even if it does not say so explicitly, I would think that it would be a
bad thing not to require this.

If A is in the content model of some other element B, then an A can
occur if a B occurs. If an A can occur, you must check the content of A
as well. If A isn't defined, you cannot check its content.
Therefore, the DTD would seem valid until you actually used A or B (or
some other more removed element that eventually used an A.)

It does not seem reasonable to me that it would be deemed desirable that
a DTD would work like this, when all it required to make it knowably
correct (in this sense anyway) is to confirm that all referenced
elements are declared (which I am doing right now.) I also check that
any attribute list maps to some element regardless of whether that
element was actually used or not, which seems to me to fall into the
same ballpark.

What am I missing here?

From richard at cogsci.ed.ac.uk Fri Apr 23 01:57:07 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:11:29 2004
Subject: Yet another validity question
In-Reply-To: roddey@us.ibm.com's message of Thu, 22 Apr 1999 17:21:18 -0600
Message-ID: <2032.199904222356@doyle.cogsci.ed.ac.uk>

> I didn't see anything in the spec that explicitly says "you must (or don't
> have to) declare all elements referenced in content models",

Well you don't generally get rules saying you don't have to do
something :-)

There is indeed no requirement that elements referred to in content
models be declared if they do not appear in the instance. I'm not quite
sure why this is; most likely it is to make it easier to parametrise or
subset DTDs. For example, if you have two DTDs which are identical
except for the presence of one element, you can put that element's
declaration in a conditional section without having to change the
declarations for all the elements in which it is allowed.
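[Editorial sketch] The check being discussed (report element types that
content models mention but that are never declared, as a warning rather
than an error) might look roughly like the following. This is only an
illustration under stated assumptions: the class name, the method name
and the map-based representation of a DTD are hypothetical and do not
come from any parser mentioned in this thread. The check runs after all
declarations have been collected, so an element declared later than the
content model that mentions it is not reported.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class UndeclaredElementCheck {

    /**
     * Hypothetical helper: maps each declared element type to the element
     * names its content model mentions, and returns the names that are
     * mentioned somewhere but declared nowhere.
     */
    public static SortedSet<String> undeclared(Map<String, List<String>> contentModels) {
        SortedSet<String> missing = new TreeSet<>();
        for (List<String> children : contentModels.values()) {
            for (String child : children) {
                // Declared element types are exactly the keys of the map.
                if (!contentModels.containsKey(child)) {
                    missing.add(child);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up DTD: "sidebar" is allowed inside "doc" but never declared.
        Map<String, List<String>> dtd = new LinkedHashMap<>();
        dtd.put("doc", Arrays.asList("title", "para", "sidebar"));
        dtd.put("title", Collections.<String>emptyList());
        dtd.put("para", Collections.<String>emptyList());

        for (String name : undeclared(dtd)) {
            // A warning only: the DTD itself is not in error.
            System.err.println("warning: element type \"" + name
                    + "\" is referenced but not declared");
        }
    }
}
```

Whether to emit such a warning at all would be left to the user of the
parser; the sketch only shows that the bookkeeping is small.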
And of course, just because something isn't invalid doesn't mean a
parser can't warn about it.

-- Richard

From cowan at locke.ccil.org Fri Apr 23 05:01:41 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:29 2004
Subject: Yet another validity question
In-Reply-To: <8725675B.00804C6F.00@d53mta03h.boulder.ibm.com> from "roddey@us.ibm.com" at Apr 22, 99 05:21:18 pm
Message-ID: <199904230306.XAA27247@locke.ccil.org>

roddey@us.ibm.com scripsit:

> I just now, for whatever reason, got around to noticing that the James
> Clark 'valid' tests imply that you can reference an element in a content
> model without its ever having been declared, for instance:
[snippage]
> What am I missing here?

Extensibility. By allowing unknown element types in content models, we
allow documents to define such element types according to their needs,
using either the internal subset or a containing external DTD, and then
they will "just work" as content of the existing elements.

--
John Cowan cowan@ccil.org
e'osai ko sarji la lojban.

From liamquin at interlog.com Fri Apr 23 08:16:12 1999
From: liamquin at interlog.com (Liam R. E. Quin)
Date: Mon Jun 7 17:11:29 2004
Subject: Yet another validity question
In-Reply-To: <2032.199904222356@doyle.cogsci.ed.ac.uk>
Message-ID:

On Fri, 23 Apr 1999, Richard Tobin wrote:
> There is indeed no requirement that elements referred to in content
> models be declared if they do not appear in the instance.

The XML processor is allowed to issue a warning, however -- see
3.2 Element Type Declarations, just before production [45].

    [...] At User Option, an XML Processor may issue a warning when a
    declaration mentions an element type for which no declaration is
    provided, but this is not an error.

This rule was inherited from SGML... One reason it is there is to allow
"elephants", but exclusions were later removed from XML:

in which HOLDER must now be empty but can have a missing end tag.

Another reason is because of large but sloppy DTDs, and another is
because modular DTDs as complex as the TEI can be next to impossible to
write if you have to get rid of all the elements that can't appear.

Finally, it saves the parser from having to do a second pass over the
DTD, since the warning should not be issued if an element is declared
later in the DTD than the first content model in which it appears.

Lee

--
Liam Quin, GroveWare Inc., Toronto; The barefoot programmer
l i a m q u i n at i n t e r l o g dot c o m
SGML/XML/Unix/C/Perl Consultant, will work for food (or socks?)
http://www.interlog.com/~liamquin/

From rbourret at ito.tu-darmstadt.de Fri Apr 23 09:43:19 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:11:29 2004
Subject: DOM - Creating Documents
Message-ID: <01BE8D6D.7AE1C280@grappa.ito.tu-darmstadt.de>

Leigh Dodds wrote:

> But that said, each of the listed methods presuppose the availability
> of an existing document, correct? Whereas I was initially concerned
> about the inability to generate a completely *new* document

The methods Michael Kay listed create documents from a SAX InputSource.
The methods I listed are the ways to create a completely new document,
which is what you originally requested. Because Docuverse's methods
require the root element type, it seems any generic method should accept
this as an argument and return a Document.

(Missing from my list was how to create a new Document with IBM's xml4j.
As Jeff Greif pointed out, you simply use the constructor on
TXDocument.)

> , but
> it has been demonstrated to me that this involves some implementation
> specific bootstrapping. And from the later discussions all I gather
> the W3C *could* specify is some standard registry/property keys/values
> that would feed into this bootstrapping mechanism. Would that
> be a fair assessment of the situation?

Yes.

-- Ron Bourret

From Gerhard.Stegmann at ect-munich.com Fri Apr 23 11:20:50 1999
From: Gerhard.Stegmann at ect-munich.com (Gerhard.Stegmann)
Date: Mon Jun 7 17:11:29 2004
Subject: hello
Message-ID: <01D21F06BF65D011A3ED0040954354125E3A@mail.ect-munich.com>

hello everybody ...

i am brand new to this list, so could anybody please tell me what we
are talking about in general? the digest i got mentioned something
about an e-commerce app using xml ?? is this it ? :-(

thanks a lot ... bye..

Mit freundlichen Gruessen
Gerhard Stegmann
-----------------------------------------------------------
European Computer Telephony GmbH
Tel : +49 89 785805-15
Fax: +49 89 785805-17
http://www.ect-munich.com
-----------------------------------------------------------

From Paul.Langer at softwareag.com Fri Apr 23 12:09:43 1999
From: Paul.Langer at softwareag.com (Paul Langer)
Date: Mon Jun 7 17:11:29 2004
Subject: ANNOUNCE: xml encoding detector in C
Message-ID: <000101be8d71$5e755040$eda1bd9d@pcpl.software-ag.de>

At Friday, April 23, 1999 12:11 AM John Cowan wrote:

> I have written an XML encoding detector function in C.
> [snip]
> I believe it handles all the cases in Appendix F correctly,
> including the EBCDIC one.
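[Editorial sketch] For readers without the C source at hand, the kind of
first-bytes test that Appendix F of the XML 1.0 Recommendation describes
can be sketched in Java as below. This is not a translation of xmlenc.c,
which is not reproduced here; the class name, method names and the
returned labels are hypothetical, and the byte patterns are the ones
recalled from the Appendix F table, so they should be checked against
the spec before being relied on. The sketch reports only the encoding
family; a real detector would go on to read the encoding declaration.

```java
public class EncodingSniffer {

    // Unsigned value of the byte at index i, or -1 if the input is shorter.
    private static int at(byte[] b, int i) {
        return i < b.length ? (b[i] & 0xFF) : -1;
    }

    // True when the input begins with exactly the expected byte values.
    private static boolean starts(byte[] b, int... expected) {
        for (int i = 0; i < expected.length; i++) {
            if (at(b, i) != expected[i]) {
                return false;
            }
        }
        return true;
    }

    /** Classifies an entity by its first (up to four) bytes. */
    public static String sniff(byte[] head) {
        if (starts(head, 0x00, 0x00, 0x00, 0x3C)) return "UCS-4, big-endian";
        if (starts(head, 0x3C, 0x00, 0x00, 0x00)) return "UCS-4, little-endian";
        if (starts(head, 0x00, 0x00, 0x3C, 0x00)) return "UCS-4, unusual octet order";
        if (starts(head, 0x00, 0x3C, 0x00, 0x00)) return "UCS-4, unusual octet order";
        if (starts(head, 0xFE, 0xFF)) return "UTF-16, big-endian (byte order mark)";
        if (starts(head, 0xFF, 0xFE)) return "UTF-16, little-endian (byte order mark)";
        if (starts(head, 0x00, 0x3C, 0x00, 0x3F)) return "UTF-16, big-endian (no byte order mark)";
        if (starts(head, 0x3C, 0x00, 0x3F, 0x00)) return "UTF-16, little-endian (no byte order mark)";
        // "<?xm" in an ASCII-compatible encoding: the declaration must be read.
        if (starts(head, 0x3C, 0x3F, 0x78, 0x6D)) return "ASCII-compatible, read the encoding declaration";
        // "<?xm" in EBCDIC: the flavour is known only from the declaration.
        if (starts(head, 0x4C, 0x6F, 0xA7, 0x94)) return "EBCDIC, read the encoding declaration";
        return "UTF-8 assumed (no declaration recognized)";
    }

    public static void main(String[] args) {
        byte[] ebcdic = { 0x4C, 0x6F, (byte) 0xA7, (byte) 0x94 };
        System.out.println(sniff(ebcdic));
    }
}
```

Note that the EBCDIC case deliberately returns a family rather than a
specific code page name, since the first four bytes alone do not
distinguish one EBCDIC variant from another.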
One remark on the EBCDIC handling:

Your program returns "EBCDIC-CP-US" if it detects EBCDIC without an
explicit encoding declaration (see comment: /* better than nothing */).

I do not think that this behaviour is "better than nothing". The XML
spec says "Parsed entities which are stored in an encoding other than
UTF-8 or UTF-16 must begin with a text declaration containing an
encoding declaration" (Chapter 4.3.3 Character Encoding in Entities,
see http://www.w3.org/TR/REC-xml#charencoding).

And if you want to define a default, what makes "EBCDIC-CP-US" more
desirable than e.g. "ebcdic-cp-is"?

All the best,
Paul

-----------------------------------------------------------
Paul Langer                PL@softwareag.com
Software AG                Tel. +49-6151-92-1912
Uhlandstr. 12              Fax +49-6151-92-1613
D-64297 Darmstadt

From cowan at locke.ccil.org Fri Apr 23 15:45:17 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:29 2004
Subject: Yet another validity question
References:
Message-ID: <37207951.8E7FD3E4@locke.ccil.org>

Liam R. E. Quin wrote:

> One reason it is there is to allow "elephants", but exclusions were
> later removed from XML:
>
> in which HOLDER must now be empty but can have a missing end tag.

I suppose you mean "can have an explicit end tag", no? After all,
ordinary EMPTY elements don't have end tags.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Fri Apr 23 16:12:13 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:29 2004
Subject: ANNOUNCE: xml encoding detector in C
References: <000101be8d71$5e755040$eda1bd9d@pcpl.software-ag.de>
Message-ID: <37207F6E.EFD52F38@locke.ccil.org>

Paul Langer wrote:

> One remark on the EBCDIC handling:
>
> Your program returns "EBCDIC-CP-US" if it detects EBCDIC
> without an explicit encoding declaration (see comment:
> /* better than nothing */).
>
> I do not think that this behaviour is "better than nothing".

In Java I could throw an error, but C doesn't have exception handling,
and I figure a server would rather return something than nothing. The
routine is not meant to handle ill-formed XML, and will return one of
the other defaults ("UTF-8", "UTF-16-BE", "UTF-16-LE") depending on
just what bytes it sees.

> And if you want to define a default, what makes "EBCDIC-CP-US"
> more desirable than e.g. "ebcdic-cp-is"?

Nothing.

Thanks for taking the trouble to look at it.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From Matthew.Sergeant at eml.ericsson.se Fri Apr 23 16:42:13 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:11:29 2004
Subject: ANNOUNCE: xml encoding detector in C
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1800@eukbant101.ericsson.se>

> -----Original Message-----
> From: John Cowan [SMTP:cowan@locke.ccil.org]
>
> Paul Langer wrote:
>
> > One remark on the EBCDIC handling:
> >
> > Your program returns "EBCDIC-CP-US" if it detects EBCDIC
> > without an explicit encoding declaration (see comment:
> > /* better than nothing */).
> >
> > I do not think that this behaviour is "better than nothing".
>
> In Java I could throw an error, but C doesn't have exception
> handling, and I figure a server would rather return something
> than nothing. The routine is not meant to handle ill-formed
> XML, and will return one of the other defaults ("UTF-8",
> "UTF-16-BE", "UTF-16-LE") depending on just what bytes it sees.
>
If this gets turned into an Apache module it would be better to return
DECLINED and let the next mime sniffer module handle it - that's what I
do in my Apache::MimeXML. I don't think it's a good idea to make
assumptions for invalid XML - just return an error code.

Just my 2p

Matt.

From Tim.Shaw at wdr.com Fri Apr 23 18:10:03 1999
From: Tim.Shaw at wdr.com (Tim.Shaw@wdr.com)
Date: Mon Jun 7 17:11:29 2004
Subject: XML and Python?
In-Reply-To: <371D2F51.8F797E5E@prescod.net>
Message-ID:

Fielding a question from a colleague :

The XML One conference has a seminar/track specifically for Python.
Is there a close link here? Is Python particularly suited to XML
handling?

Thanks

tim

This message contains confidential information and is intended only
for the individual named. If you are not the named addressee you
should not disseminate, distribute or copy this e-mail. Please
notify the sender immediately by e-mail if you have received this
e-mail by mistake and delete this e-mail from your system.

E-mail transmission cannot be guaranteed to be secure or error-free
as information could be intercepted, corrupted, lost, destroyed,
arrive late or incomplete, or contain viruses. The sender therefore
does not accept liability for any errors or omissions in the contents
of this message which arise as a result of e-mail transmission. If
verification is required please request a hard-copy version. This
message is provided for informational purposes and should not be
construed as a solicitation or offer to buy or sell any securities or
related financial instruments.

From andyclar at us.ibm.com Fri Apr 23 19:12:45 1999
From: andyclar at us.ibm.com (andyclar@us.ibm.com)
Date: Mon Jun 7 17:11:29 2004
Subject: DOM - Creating Documents
Message-ID: <8725675C.005E65CD.00@d53mta06h.boulder.ibm.com>

Please pardon my tardiness -- I only follow the digest form of this
list.

As other people have already mentioned, it IS possible to build DOM
document objects programmatically using the IBM XML4J parsers,
version 1 and 2.

In version 1, the implementation is com.ibm.xml.parser.TXDocument. This
DOM 1.0 implementation has the benefit of additional non-DOM
functionality: for example, getting a (hash) digest value for any
subtree, namespace support, and content model objects stored under the
DocumentType object. This version of the DOM is supported in the new
version of the parser for compatibility with the older version.

In version 2, the implementation is com.ibm.xml.dom.DocumentImpl. This
is a "vanilla" DOM in the sense that its primary goal was to fulfill
only the basic requirements of the DOM specification. It handles
default attribute values by having ElementDefinition (non-DOM) nodes
under the DocumentType. Additional features have been delayed in order
to see what falls out of the W3C working groups on matters such as how
the document grammar is stored in the Document.

For both versions, there is no factory method for creating the Document
instance. You have to use the constructor. We did add a convenience
method to the version 2 DOM parsers that allows you to set the document
implementation by class name.
When you do this, however, you don't get the performance benefit of lazy
evaluation of document nodes.

--
Andy Clark * IBM, JTC - Silicon Valley * andyclar@us.ibm.com

From Ruth at jpl.nasa.gov Fri Apr 23 19:32:21 1999
From: Ruth at jpl.nasa.gov (Ruth Bergman)
Date: Mon Jun 7 17:11:30 2004
Subject: integration of legacy app in client app
In-Reply-To: <199903281507.HAA02264@bmhughes.com>
Message-ID: <3.0.3.32.19990423103148.00a04290@pop.jpl.nasa.gov>

Hi all,

I hope that your expertise and experience can help us solve an
application integration problem.

Description of the problem:

We have existing legacy applications running under Solaris 2.x. We are
writing new client apps under Windows and browsers. We need to call
services provided by legacy apps from the new client apps. Want to avoid
the CORBA/COM approaches if at all possible.

Here is the scenario we are envisioning. Browser-based HTML/JavaScript
app is used to submit a request which translates into a "function" call
to a remote legacy application written in C and having a fixed C
function call API. The browser interacts with a Microsoft Windows NT 4.0
Internet Information Server COM application in the middle tier, which in
turn performs an RPC to the remote application. The middle tier
application waits for the results and formats them into an HTML result
set which is then returned to the browser application in near real-time.

Question:

Can an XML-based interprocess communication solution help us with this
problem?

Constraints:

We need a low overhead solution (seconds).
We expect the actual legacy application performance to be the
bottleneck. Of course, we need reliable service. And last, but not
least, we need to prototype this solution very quickly before we commit
to it.

Looking forward to your responses. Thanks in advance.

Ruth.

From paul at prescod.net Fri Apr 23 19:46:17 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:30 2004
Subject: XML and Python?
References:
Message-ID: <3720A5C5.93B5E7AB@prescod.net>

Tim.Shaw@wdr.com wrote:
>
> Fielding a question from a colleague :
>
> The XML One conference has a seminar/track specifically for Python.
> Is there a close link here? Is Python particularly suited to XML
> handling?

There was a recent thread on this issue in another mailing list:

http://www.mulberrytech.com/xsl/xsl-list/archive/msg03689.html
http://www.mulberrytech.com/xsl/xsl-list/archive/msg03708.html
http://www.mulberrytech.com/xsl/xsl-list/archive/msg03677.html
http://www.mulberrytech.com/xsl/xsl-list/archive/msg03710.html
http://www.mulberrytech.com/xsl/xsl-list/archive/msg03693.html

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come
equipped with adjustable pedals to fit smaller drivers and sensor
devices that warn the driver when he or she is about to back into a
Toyota or some other object." -- Dallas Morning News

From david at megginson.com Fri Apr 23 20:14:10 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:30 2004
Subject: XML and Python?
In-Reply-To: <3720A5C5.93B5E7AB@prescod.net>
References: <3720A5C5.93B5E7AB@prescod.net>
Message-ID: <14112.46928.8121.733495@localhost.localdomain>

Paul Prescod writes:

> Tim.Shaw@wdr.com wrote:
> >
> > Fielding a question from a colleague :
> >
> > The XML One conference has a seminar/track specifically for Python.
> > Is there a close link here? Is Python particularly suited to XML
> > handling?
>
> There was a recent thread on this issue in another mailing list:

[snip]

Or, to put it more succinctly, Paul Prescod is particularly suited to
XML handling with Python.

Actually, any object-oriented language should do a pretty good job of
handling XML. The initial push among developers was in Java, but now
there's excellent support growing in Python and Perl5 as well.
Personally, I use Perl5 and Java, but I do sneak into JPython once in a
while so that I have an interpreter to poke around in the Java classes.

C++ is still badly under-represented in XML software, but you can
always use the C Expat parser.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From milowski at dnai.com Fri Apr 23 22:07:47 1999
From: milowski at dnai.com (Alex Milowski)
Date: Mon Jun 7 17:11:30 2004
Subject: Streaming XSL Stylesheets - Was: XML::Writer 0.1 available
In-Reply-To: <14108.29970.711433.501336@localhost.localdomain>
Message-ID:

On Tue, 20 Apr 1999, David Megginson wrote:

> Eric Prud'hommeaux writes:
>
> > I'd love to differ with you here. In practice, I can't, but in
> > theory... I have this itch to work out and implement an XSL parser that
> > works as a SAX stream. Given an XslStream that reads the parsed
> > stylesheet from an XslDB and has an output SAX stream $this->{OUTPUT},
> > the notion is something like this:
> >
> > parser reads ""
>
> In DSSSL, such a thing was not possible because there were
> unpredictable dependencies -- for example, you might find this near
> the front of the document:
>
> ...
>
> But you wouldn't know that you had to do something useful with it
> until you found this near the end of the document:
>
> ...
>
> In the general case, then, a stream-based DSSSL processor would
> *still* have to cache the entire document, since it allowed arbitrary
> navigation. I don't know if the same applies to XSL -- I'll have to
> give the spec a closer look.
>

Not quite. You could develop a stream-based DSSSL processor given that
you do the appropriate analysis of the stylesheet upfront and determine
where "caching" would have to be put into place.

XSL has the same problem, except that it is much clearer when things
would have to be "cached" and then "unrolled" once the select patterns
were satisfied.
Further, if you have a DTD (schema) you can determine when a pattern can
apply within the instance and make the stylesheet processor operate much
faster.

The issue for XSL is that it is still a hard problem to solve, and one
might question what the benefit would be over a tree-based solution,
given that development of a fast XSL processor should be possible once
the tree is assumed.

I did some experimentation with stream-based application of patterns and
select statements within templates. It is possible and the result is
extremely promising. What struck me was the complexity that resulted
because of the combinatorics of the problem as a whole as well as the
necessity of a really smart compiler.

R. Alexander Milowski milowski@dnai.com
Remember: Stressed spelled backwards is desserts.

From cowan at locke.ccil.org Fri Apr 23 22:27:25 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:30 2004
Subject: ANNOUNCE: revision of ParserFilter base class
Message-ID: <3720D782.6386F3BB@locke.ccil.org>

My ParserFilter base class, which is intended to make it easy to write
SAX parser filters, has been modified.

1) The string ".useParser", which was formerly appended to the fully
   qualified class name of the filter to produce the system property
   naming the class of the enclosing parser, has been changed to just
   ".parser". This makes it compatible with XAF.

2) There was a deficiency whereby a filter would receive DTD events
   only if its client had registered to receive them as well. This
   restriction was unnecessary and has been removed.
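[Editorial sketch] The property-naming convention in point 1 can be
illustrated as follows. This is a guess at the shape of the lookup, not
code taken from ParserFilter.java: the class and method names here are
hypothetical, and "com.example.SomeSaxParser" is a made-up class name
standing in for whatever parser a deployment would configure.

```java
public class FilterParserLookup {

    /** Property key: the filter's fully qualified class name plus ".parser". */
    public static String propertyKey(Class<?> filterClass) {
        return filterClass.getName() + ".parser";
    }

    /**
     * Returns the parser class name configured for the given filter class,
     * or the supplied fallback when the system property is not set.
     */
    public static String parserClassName(Class<?> filterClass, String fallback) {
        return System.getProperty(propertyKey(filterClass), fallback);
    }

    public static void main(String[] args) {
        // Hypothetical configuration, normally supplied with -D on the command line.
        System.setProperty(propertyKey(FilterParserLookup.class),
                "com.example.SomeSaxParser");
        System.out.println(propertyKey(FilterParserLookup.class)
                + " = " + parserClassName(FilterParserLookup.class, "(unset)"));
    }
}
```

Keying the property on the filter's own class name lets several filters
in one JVM each name a different enclosing parser.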
Download: http://www.ccil.org/~cowan/XML/ParserFilter.java .

MDSAX folks, take heed please.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From Michael.Kay at icl.com Fri Apr 23 23:59:23 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:11:30 2004
Subject: ANNOUNCE: SAXON 4.2 - An XSL Compiler
Message-ID: <93CB64052F94D211BC5D0010A80013310EB431@WWMESS3.172.19.125.2>

SAXON 4.2 is available for download on the usual URL:

http://home.iclweb.com/icl2/mhkay/saxon.html

The most significant new feature is an XSL compiler. This takes an XSL
stylesheet as input and generates a Java application as output. The Java
application (with the help of the SAXON run-time library) performs the
actual processing of source XML documents without further reference to
the original style sheet. The compiled stylesheet can be invoked from the
command line, from a client application via an API, or can be installed
as a servlet and invoked directly from the browser.

A compiled stylesheet runs 2-3 times faster than the interpreter, with a
bigger potential saving in servlet mode because it avoids the need to
reinitialise for each document served.

The compiler is written in XSL, generates Java, and has been used to
compile itself.

SAXON's XSL interpreter remains available and has been upgraded to
conform closely with the Dec 1998 XSL draft (before they moved the
goalposts).
The Java API also remains available and benefits from support for many of
the newly-implemented XSL features, e.g. richer pattern syntax.

SAXON's XSL, while lacking some of the features in the standard, has a
number of extensions designed to widen the range of applicability. These
include:

- multiple output files
- generate any text output file (not just XML or HTML)
- close integration of Java and XSL code
- extensibility (e.g. to access SQL databases)

A new feature which responds to a request many XSL users have made is a
"group-by" operator which allows extra levels to be added to the document
structure, e.g. for subheadings.

Mike Kay

From david at megginson.com Sat Apr 24 12:59:59 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:30 2004
Subject: Streaming XSL Stylesheets - Was: XML::Writer 0.1 available
In-Reply-To:
References: <14108.29970.711433.501336@localhost.localdomain>
Message-ID: <14113.41384.243664.997662@localhost.localdomain>

Alex Milowski writes:

> Not quite. You could develop a stream-based DSSSL processor given that
> you do the appropriate analysis of the stylesheet upfront and determine
> where "caching" would have to be put into place.

In DSSSL, at least, it would probably be unmanageably difficult, at
least in the general case -- the problem is that someone could use
character data (perhaps after much arbitrary manipulation) later in the
document to decide what to select earlier, and vice-versa, using DSSSL's
navigational functions.

For example, in DSSSL, I could specify something like the following:

a. Take all of the character data of this element

b. remove all whitespace

c. reverse the order of the characters

d. remove every character where its Unicode value mod 7 is 0

e. look up the resulting string in a top-level a-list

f. process all elements in a second document that have a 'foo'
   attribute whose value is lexically <= the a-list value, and
   concatenate their character data

g. process the element in the first document with a 'bar' attribute
   containing a number equal to the number of characters extracted
   from the second document in step f.

Hopefully, things aren't quite this bad for XSL. Good luck -- I'll be
very excited if you can solve this.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From mrc at allette.com.au Sun Apr 25 00:41:00 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 17:11:30 2004
Subject: Yet another validity question
References: <37207951.8E7FD3E4@locke.ccil.org>
Message-ID: <3722485D.F72BCBF1@allette.com.au>

John Cowan wrote:

> > One reason it is there is to allow "elephants", but exclusions were
> > later removed from XML:
> >
> > in which HOLDER must now be empty but can have a missing end tag.
>
> I suppose you mean "can have an explicit end tag", no? After all,
> ordinary EMPTY elements don't have end tags.

I suspect that Liam was using "empty" in the sense that it is an element
that cannot contain other elements because its content model doesn't
allow it to, rather than one that has the declared content of EMPTY.
--
Regards,

Marcus Carr                      email: mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www: http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein

From liamquin at interlog.com Sun Apr 25 06:08:05 1999
From: liamquin at interlog.com (Liam R. E. Quin)
Date: Mon Jun 7 17:11:30 2004
Subject: Yet another validity question
In-Reply-To: <3722485D.F72BCBF1@allette.com.au>
Message-ID:

On Sun, 25 Apr 1999, Marcus Carr wrote:

> From: Marcus Carr
> John Cowan wrote:
> > Liam Quin wrote:
> > > One reason it is there is to allow "elephants", but exclusions were
> > > later removed from XML:
> > >
> > > in which HOLDER must now be empty but can have a missing end tag.
> >
> > I suppose you mean "can have an explicit end tag", no? After all,
> > ordinary EMPTY elements don't have end tags.
>
> I suspect that Liam was using "empty" in the sense that it is an element
> that cannot contain other elements because its content model doesn't
> allow it to, rather than one that has the declared content of EMPTY.

That's correct. In SGML (not XML) an EMPTY element cannot have an end
tag. If you are dealing with fully normalised data, it really helps if
they do have end tags -- the XML /> notation was devised for this
purpose.

Lee

--
Liam Quin, GroveWare Inc., Toronto; The barefoot agitator
l i a m q u i n at i n t e r l o g dot c o m
Unix/C/SGML/XML/Perl, will manage programmers for socks and food.
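
The point about end tags in normalised data can be checked with a small sketch. This is not from the thread: it uses the expat binding in Python's standard library, and the element names doc and HOLDER as well as the events() helper are made up for illustration. Under those assumptions, the empty-element form and the explicit start/end pair should yield the same event sequence, so a consumer never needs a special case for content-free elements.

```python
from xml.parsers import expat


def events(xml_text):
    """Return the (kind, name) element events expat reports for xml_text."""
    log = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: log.append(("start", name))
    parser.EndElementHandler = lambda name: log.append(("end", name))
    parser.Parse(xml_text, True)
    return log


# Two spellings of an element with no content (names are hypothetical).
short_form = "<doc><HOLDER/></doc>"
long_form = "<doc><HOLDER></HOLDER></doc>"

for kind, name in events(short_form):
    print(kind, name)
print(events(short_form) == events(long_form))
```

An application working from these events cannot tell which spelling the author used, which is the property that makes the /> notation convenient for normalised data.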
From jjc at jclark.com Sun Apr 25 14:50:45 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:11:30 2004
Subject: New expat test release
Message-ID: <37230E8E.ADD0335F@jclark.com>

A new test release of expat is available from:

ftp://ftp.jclark.com/pub/test/expat.zip

I plan for this to be the last test release before 1.1. This fixes a few
bugs and adds a couple of small features.

XML_GetSpecifiedAttributeCount() allows you to determine whether
attributes were specified or defaulted

XML_SetNotStandaloneHandler() allows you to control what to do with
documents that have a DTD and standalone="no"

XML_GetCurrentByteCount() gives you the number of bytes in the current
event

James

From david at megginson.com Sun Apr 25 16:29:25 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:30 2004
Subject: XML::Writer 0.2, with Namespace support
Message-ID: <199904251428.KAA01459@megginson.com>

I have just uploaded a new version of the XML::Writer module to the
following location:

http://www.megginson.com/Software/XML-Writer-0.2.tar.gz

(It will also be appearing on CPAN as soon as it gets through the tests.)
This version contains some significant new enhancements, including
intelligent namespace support; for example, to create a document with
Namespaces, you can use something like this:

  my $rdfns = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
  my $dcns = "http://www.purl.org/dc#";
  my $davidsurl = "http://home.sprynet.com/sprynet/dmeggins/";

  $writer->startTag([$rdfns, 'RDF']);
  $writer->startTag([$rdfns, 'Description'],
                    'about' => 'http://www.megginson.com');
  $writer->startTag([$dcns, 'Creator'],
                    [$rdfns, 'resource'] => $davidsurl);
  $writer->endTag();
  $writer->endTag();
  $writer->endTag();
  $writer->end();

The module will generate prefixes automatically; however, if you have
certain preferred prefixes (like 'rdf' or 'dc') you can supply them in a
map in the constructor:

  my $rdfns = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
  my $dcns = "http://www.purl.org/dc#";
  my $writer = new XML::Writer(NAMESPACES => 1,
                               PREFIX_MAP => {$rdfns => 'rdf',
                                              $dcns => 'dc'});

There are also new query functions to obtain information about the
current context.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From mrc at allette.com.au Mon Apr 26 06:40:02 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 17:11:30 2004
Subject: Announcement: XML/SGML Asia Pacifc '99 - Call For Papers
Message-ID: <3723EE03.4A2524F6@allette.com.au>

This mail has been cross-posted to a number of XML-related groups - my
apologies to those who receive multiple copies.
Marcus

++++++++++++++++++++++++++++++++++++++++++
XML/SGML Asia Pacific '99 : 18 - 21 October
Hotel Mercure, Broadway, Sydney
------------------------------------------
<> Call for Papers <>
------------------------------------------
xmlasia99@allette.com.au
http://www.allette.com.au/xmlasia99
Tel (61 2) 9262 4777 Fax (61 2) 9262 4774
++++++++++++++++++++++++++++++++++++++++++

The Graphic Communications Association, in conjunction with Allette
Systems, is pleased to announce XML/SGML Asia Pacific '99 - the region's
6th annual conference for the applications, trends and technologies that
support the eXtensible Markup Language (XML) and the Standard Generalised
Markup Language (SGML).

As usual, the conference will include sessions for managers and users
alike but this year we've added an extra stream to specifically address
the interests of software developers - a change made in direct response
to feedback from last year. We will also be holding informal presentation
sessions for individuals and corporate developers to showcase their
implementations.

<> Call for Papers <>

Interested parties are invited to submit proposals for presentation at
the conference. Of interest are papers discussing any XML, SGML or
related technology issues at the managerial, user or technical expert
level.

Submissions should include:

# A presentation title
# The author's name and contact information
# A brief presentation abstract
# The intended audience (managers, users, technical experts)
# The author's biography (approximately 200 words)

All abstracts must be submitted in electronic form. The deadline for
completed proposals is 30 June, 1999. Full papers must be submitted by
1 September, 1999.

Please send submissions to: Craig Kirkwood, email: craig@allette.com.au

<> Call for Exhibitors <>

Software vendors and their products are critical to the continued growth
of XML, SGML and other document management and Web publishing standards.
The table-top exhibition at XML/SGML Asia Pacific '99 is a powerful way to
present new products and product updates to those with a genuine,
professional interest in the industry. It is also worth noting that this
conference attracts attendees from the main Asian markets such as Korea,
Singapore, Malaysia and others.

Exhibition space is strictly limited so please book early to avoid
disappointment.

<> Tutorials <>

One-day tutorials will be held prior to the opening of the conference.
The curriculum for these tutorials will be available from our Web site
later in the year.

<> Exhibits <>

An exhibition area will complement the conference sessions and provide an
opportunity to see the latest products and services from the major
participants in the XML and SGML market. The exhibition will be open from
the afternoon of the second day until the end of the conference.

<> User Group Meetings <>

Preceding the conference opening, several key XML and SGML vendors will
conduct their annual Asia Pacific User Group meetings. Other user groups
are also encouraged to plan meetings concurrently with the conference.
These meetings are free and open to everyone - even those not attending
the conference. Space, however, is limited so early registration is
essential.

<> Preliminary Schedule <>

XML/SGML Asia Pacific '99 will run from Monday, October 18th through
Thursday, October 21st.
Monday 18 October:     Tutorials
Tuesday 19 October:    User Group meetings
                       Opening Keynote addresses
Wednesday 20 October:  Conference sessions
                       Exhibition opens
                       Conference function
Thursday 21 October:   Conference sessions
                       Exhibitions
                       Closing Keynotes

<> Conference fees <>

GCA Member:      AUD$ 895   US$ 645
non-GCA Member:  AUD$ 945   US$ 685

<> Tutorial fees <>

GCA Member:      AUD$ 365   US$ 265
non-GCA Member:  AUD$ 395   US$ 285

<> Combined Conference & Tutorial package <>

GCA Member:      AUD$ 1195  US$ 865
non-GCA Member:  AUD$ 1275  US$ 925

The registration fee includes all conference materials, lunch, the
conference reception and entrance to the exhibition.

<> Venue <>

The XML/SGML Asia Pacific '99 conference will be held at the:

Hotel Mercure Sydney
Broadway, Sydney.
Tel: +61 2 9217 6666
Fax: +61 2 9217 6888

The Hotel Mercure Sydney is a brand new complex located adjacent to
Sydney's Central Station and Chinatown district. The hotel boasts
excellent conference and business facilities, a fitness centre and
swimming pool. Sydney's international airport will also be within easy
access by light rail. Further information can be found at
http://www.mercure.com.

A number of rooms have been reserved at the special conference rate of
AUD$128 (approx US$ 85) per night for a standard room or AUD$153 (approx
US$ 95) for a deluxe room. Accommodation and reservations are the
responsibility of individual delegates. When making your reservation,
ensure you mention you are a delegate of the XML/SGML Asia Pacific '99
Conference or you may be charged the full hotel rate.

The cut-off for accommodation reservations is 12 September, 1999. After
this date availability cannot be guaranteed. Please note: there is a 10%
government tax payable on the room rate.

For further conference travel assistance please contact Penny at Spencer
Travel, Sydney.

<> Registration <>

Register via the web at http://www.allette.com/xmlasia99

For postal and fax registrations or further information, please contact:

Allette Systems (Australia)
Level 10, 91 York Street
Sydney NSW 2000 Australia
Email: xmlasia99@allette.com.au
Tel: (61 2) 9262 4777
Fax: (61 2) 9262 4774

From Matthew.Sergeant at eml.ericsson.se Mon Apr 26 16:19:57 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:11:30 2004
Subject: ANNOUNCE: Apache XML encoding detector update
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1804@eukbant101.ericsson.se>

I've fixed a couple of bugs and added support for EBCDIC and little
endian utf-16, and added a couple of configuration parameters to my
encoding detector for xml files. Thanks to John Cowan's C detector, I
realised I had missed EBCDIC and the other stuff. It's still missing
UCS-4 support - how much is that actually used?

Anyway, it's on CPAN in all the usual places in my directory
authors/id/M/MS/MSERGEANT. I don't think it's mirrored to the rest of the
world yet, since I only just uploaded it.

Also, a note to John Cowan: In reading through your detector I noticed
that it only checks for g=["']...["'], not the full
encoding=["']...["'] - Personally I don't think that's safe, but having
read your C code (and knowing C's poor handling of strings) I can
understand why you did it...

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

From cowan at locke.ccil.org Mon Apr 26 17:07:25 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:30 2004
Subject: ANNOUNCE: Apache XML encoding detector update
References: <5F052F2A01FBD11184F00008C7A4A800022A1804@eukbant101.ericsson.se>
Message-ID: <37248103.E0DF143@locke.ccil.org>

Matthew Sergeant (EML) wrote:

> Also, a note to John Cowan: In reading through your detector I noticed that
> it only checks for g=["']...["'], not the full encoding=["']...["'] -
> Personally I don't think that's safe, but having read your C code (and
> knowing C's poor handling of strings) I can understand why you did it...

Actually, it *is* safe. Once we see "<?xml" [...] can be any whitespace
character, we know beyond doubt that this is an XML declaration/text
declaration. The first "g" appearing in the declaration is necessarily
the last letter of "encoding", unless XML is extended to a version number
that includes the letter "g", which is most unlikely. The only other way
that a "g" can appear is within the name of the charset itself.

[...]

Actually John, I meant for future versions of XML that included for
example:

[...]

Just a silly random thought... I knew it was safe for xml 1.0, just
perhaps not for the future. I'd hate to see old code that breaks down the
line through lack of keeping up to date - that's what gave us the y2k
problem.

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

> -----Original Message-----
> From: John Cowan [SMTP:cowan@locke.ccil.org]
> Sent: Monday, April 26, 1999 4:07 PM
> To: XML Dev
> Subject: Re: ANNOUNCE: Apache XML encoding detector update
>
> Matthew Sergeant (EML) wrote:
>
> > Also, a note to John Cowan: In reading through your detector I noticed
> > that it only checks for g=["']...["'], not the full
> > encoding=["']...["'] - Personally I don't think that's safe, but
> > having read your C code (and knowing C's poor handling of strings) I
> > can understand why you did it...
>
> Actually, it *is* safe. Once we see "<?xml" [...] can be any whitespace
> character, we know beyond doubt that this is an XML declaration/text
> declaration. The first "g" appearing in the declaration is necessarily
> the last letter of "encoding", unless XML is extended to a version
> number that includes the letter "g", which is most unlikely. The only
> other way that a "g" can appear is within the name of the charset
> itself.
>
> --
> John Cowan http://www.ccil.org/~cowan cowan@ccil.org
> You tollerday donsk? N. You tolkatiff scowegian? Nn.
> You spigotty anglease? Nnn. You phonio saxo? Nnnn.
> Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
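
For readers who want to see the two positions side by side, here is a minimal Python sketch of declaration-based encoding detection. It is neither Matt's Perl module nor John's C detector: the function name sniff_xml_encoding and the returned labels are assumptions made for illustration. It follows the general approach of Appendix F of the XML 1.0 spec (byte order marks and leading byte patterns first, then the declaration), matches the full encoding pseudo-attribute rather than a lone "g", and, like the detector announced above, leaves out UCS-4.

```python
import re

# Match the whole pseudo-attribute name "encoding", not just its final "g".
# [^>]*? keeps the search inside the declaration itself.
_ENCODING_RE = re.compile(
    rb"""^<\?xml\s[^>]*?\bencoding\s*=\s*(["'])([A-Za-z][A-Za-z0-9._-]*)\1"""
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML entity from its first bytes."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "UTF-8"                      # UTF-8 byte order mark
    if data.startswith(b"\xfe\xff") or data.startswith(b"\x00<\x00?"):
        return "UTF-16BE"                   # BOM, or "<?" as big-endian UTF-16
    if data.startswith(b"\xff\xfe") or data.startswith(b"<\x00?\x00"):
        return "UTF-16LE"                   # BOM, or "<?" as little-endian UTF-16
    if data.startswith(b"\x4c\x6f\xa7\x94"):
        return "EBCDIC"                     # "<?xm" in EBCDIC; exact code page unknown
    match = _ENCODING_RE.match(data[:200])  # declaration is ASCII-compatible here
    if match:
        return match.group(2).decode("ascii")
    return "UTF-8"                          # XML's default when nothing is declared


print(sniff_xml_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
print(sniff_xml_encoding(b"<?xml version='1.0'?><a/>"))
```

Matching the full name costs one regular expression and keeps working even if a later XML version introduced other pseudo-attributes or values containing a "g", which is the future-proofing concern raised in the reply.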
From david-b at pacbell.net Mon Apr 26 18:59:13 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun 7 17:11:30 2004
Subject: DOM - Creating Documents
References:
Message-ID: <37249B42.C629D27E@pacbell.net>

Miles Sabin wrote:
>
> So, there's no DOM-vendor-independent mechanism for
> document creation at the mo' ...

That's the real issue. Document creation (either empty, or through
connection to an XML processor) is outside the scope of DOM ... but is
fundamental for applications using DOM.

So long as that's an issue, vendors achieve some level of lock-in for DOM
apps. Most of them won't suffer for that, and won't worry about
addressing it until there's customer demand for it.

Another way to look at this: you can't write a DOM test suite until these
issues (2 ways to create a "dom.Document") are resolved. (Notation and
Entity objects can't be created except implicitly by parsing a DTD,
either.)

> there _might_ be in
> Level 2, but there's a lot of tricky issues that need to
> be resolved before that can happen.

They're more politically tricky than technically so. In any system,
bootstrapping calls for a "step out of bounds". There are widely known
solutions. Consider what SAX does for finding parsers, in Java.

DOM Level 2 is the right opportunity to fix this. Everyone should require
the DOM WG to solve this problem, so that application code can be
portable. Next time you get a chance to review a DOM draft, make sure
this feedback is received.

- Dave
From nicolas at isotools.com Mon Apr 26 20:29:25 1999
From: nicolas at isotools.com (Nicolas LIPS)
Date: Mon Jun 7 17:11:31 2004
Subject: C++ MSXML samples
Message-ID: <3724BBBF.256F20AA@isotools.com>

Hello,

Do you know where I can find examples of using the Microsoft XML DOM
interface from C++?

Thanks.

---------------------------------------------------------------
Nicolas LIPS
nicolas@isotools.com
ISOTOOLS
2065 chemin du pont Rout - 13090 Aix-en-Provence - France
Tel: 04.42.95.16.82. Fax: 04.42.95.16.83.
---------------------------------------------------------------

From roddey at us.ibm.com Mon Apr 26 20:31:07 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:31 2004
Subject: No subject
Message-ID: <8725675F.006592A4.00@d53mta03h.boulder.ibm.com>

>Snarfed from the very useful guide.html that came with the IBM xml4j
>(version 1.1.9) distribution (the documentation has changed in later
>versions, and I haven't looked for
>the replacement.):
>
>
>Creating a new XML document is a three step process. First, create the
>TXDocument, which is the concrete implementation of Document. Second, create
>the various XML objects using the TXDocument's create* methods.
>Finally, use appendChild to append these appropriately to the TXDocument
>or sub-Elements.
>Make a TXDocument instance.
>
>TXDocument doc = new TXDocument();
>

Be forewarned that the 'TX' stuff is being deprecated in the version 2.x
architecture. That functionality is being implemented in other ways,
which keep it cleanly out of the core DOM stuff (for better layering of
blessed stuff vs. extended stuff.) There is a 'TX compatibility' layer
that provides most of that functionality in the old way, to help people
move forward. But don't make use of it in any new code based on our 2.x
system.

The TX stuff won't be in the C++ version at all.
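
The three steps quoted above (create the document, create nodes through the document's create* methods, attach them with appendChild) can be sketched without any vendor class. The example below is an assumption-laden illustration, not xml4j code: it uses Python's standard xml.dom.minidom, whose getDOMImplementation() and createDocument() follow the bootstrap that DOM Level 2 eventually specified, and the element names greeting and to are made up.

```python
from xml.dom.minidom import getDOMImplementation

# Step 1: obtain a Document through the DOMImplementation object rather
# than by instantiating a vendor-specific class such as TXDocument.
impl = getDOMImplementation()
doc = impl.createDocument(None, "greeting", None)  # no namespace, no doctype
root = doc.documentElement

# Step 2: create nodes with the document's own factory methods.
child = doc.createElement("to")
text = doc.createTextNode("xml-dev")

# Step 3: attach the nodes with appendChild.
child.appendChild(text)
root.appendChild(child)

xml_text = root.toxml()
print(xml_text)
```

Only the first step touches the implementation; everything after it is plain DOM, which is why a standard bootstrap is enough to make the rest of an application portable.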
From roddey at us.ibm.com Mon Apr 26 20:37:26 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:31 2004
Subject: No subject
Message-ID: <8725675F.00661F78.00@d53mta03h.boulder.ibm.com>

From: John Cowan

>> What am I missing here?
>
>Extensibility.
>
>By allowing unknown element types in content models, we allow
>documents to define such element types according to their needs,
>using either the internal subset or a containing external DTD,
>and then they will "just work" as content of the existing elements.
>

But... within that particular usage context, they *will still* be
declared :-)

From roddey at us.ibm.com Mon Apr 26 20:44:19 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:31 2004
Subject: No subject
Message-ID: <8725675F.0066C9AF.00@d53mta03h.boulder.ibm.com>

>Actually, any object-oriented language should do a pretty good job of
>handling XML. The initial push among developers was in Java, but now
>there's excellent support growing in Python and Perl5 as well.
>Personally, I use Perl5 and Java, but I do sneak into JPython once in
>a while so that I have an interpreter to poke around in the Java
>classes.
>
>C++ is still badly under-represented in XML software, but you can
>always use the C Expat parser.
>

I can't talk too much about this, since I got spanked for it earlier.
However, you might want to look on Alphaworks in, oh, say 24 to 48 hours.
Check the section for new stuff.

From Mark.Casey at echostar.com Mon Apr 26 21:39:43 1999
From: Mark.Casey at echostar.com (Casey, Mark)
Date: Mon Jun 7 17:11:31 2004
Subject: top down vs. bottom up
Message-ID: <8E7905420FB9D211916A00609773FB0E01A7554D@exchange1.echostar.com>

Hello to all, thanks for this great aid to our efforts here.
I enjoy all the conversations; they are very helpful and polite (most of
the time, as it should be).

My colleague and I have different viewpoints on how to write code that
executes string manipulation rules on large alphanumeric input fields.
For instance, we wish to produce a string that is the concatenation of a
variable (input) string with the result of another operation that
concatenates two other strings (result = A + (B + C)). This is a general
example of quite complex rule sets that will be applied to varying inputs
(all strings). The main difference between the two examples is that the
first one surrounds a nested operation with [...] and [...] tags, while
the second does not. Thanks in advance for your time and trouble!

(please ignore the coding, etc rules as I'm mixing C++ and XML for
brevity):

wstring TAG="BBB"
wstring XY="XXXYYY"
wstring AB="AAABBB"
(ignore format of above input variables)

***************Example A. *************************
RESULT = TAG XY AB
OR
RESULT = TAG XY AB
***********END OF EXAMPLE********************

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Mark Casey - Sr Engineer
NagraStar LLC - an advanced technology joint venture of
http://www.NagraVision.com and http://www.Echostar.com
http://www.DishNetwork.com
90 Inverness Circle East, Englewood, CO USA 80112
303-706-5710 voice w/mail
303-706-5719 fax w/paper
casey@nagrastar.com
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"ESCHEW OBSFUCATION!"
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
From simonstl at simonstl.com Mon Apr 26 21:57:38 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:11:31 2004
Subject: New expat test release
In-Reply-To: <37230E8E.ADD0335F@jclark.com>
Message-ID: <199904261957.PAA16518@hesketh.net>

At 07:46 PM 4/25/99 +0700, James Clark wrote:
>XML_SetNotStandaloneHandler() allows you to control what to do with
>documents that have a DTD and standalone="no"

What exactly are we supposed to hand this method and what exactly is it
supposed to do? Is this for passing off non-standalone documents to a
different processor, or is this part of a mechanism we might be able to
use to make expat deal with external resources?

xmlparser.h just has:

>
> void XMLPARSEAPI XML_SetNotStandaloneHandler(XML_Parser
> parser, XML_NotStandaloneHandler handler);

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

From jjc at jclark.com Tue Apr 27 05:34:07 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:11:31 2004
Subject: New expat test release
References: <199904261957.PAA16518@hesketh.net>
Message-ID: <3725280A.B703A865@jclark.com>

"Simon St.Laurent" wrote:
>
> At 07:46 PM 4/25/99 +0700, James Clark wrote:
> >XML_SetNotStandaloneHandler() allows you to control what to do with
> >documents that have a DTD and standalone="no"
>
> What exactly are we supposed to hand this method and what exactly is it
> supposed to do? Is this for passing off non-standalone documents to a
> different processor, or is this part of a mechanism we might be able to use to
> make expat deal with external resources?
>
> xmlparser.h just has:
>
> > void XMLPARSEAPI XML_SetNotStandaloneHandler(XML_Parser parser,
> >                                              XML_NotStandaloneHandler handler);

It also has:

/* This is called if the document is not standalone (it has an
external subset or a reference to a parameter entity, but does not
have standalone="yes"). If this handler returns 0, then processing
will not continue, and the parser will return a
XML_ERROR_NOT_STANDALONE error. */

typedef int (*XML_NotStandaloneHandler)(void *userData);

It doesn't allow expat to handle external DTDs. Typically you would use this to stop processing with an error, to continue processing probably with a warning, or to invoke a different processor.

James
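
The handler described above can be tried out without writing C. The following is a minimal sketch, assuming Python's standard xml.parsers.expat binding, which exposes the same callback as the NotStandaloneHandler attribute. The document string and the parse() helper are made up for illustration; they are not part of expat.

```python
import xml.parsers.expat as expat

# Hypothetical test document: it names an external subset and does not
# declare standalone="yes", so expat treats it as "not standalone".
NON_STANDALONE_DOC = '<?xml version="1.0"?><!DOCTYPE doc SYSTEM "doc.dtd"><doc/>'


def parse(document, allow_non_standalone):
    """Parse `document`; return "ok" or expat's error message."""
    parser = expat.ParserCreate()
    # Returning 1 lets processing continue; returning 0 makes expat stop
    # with its "not standalone" error, as the header comment describes.
    parser.NotStandaloneHandler = lambda: 1 if allow_non_standalone else 0
    try:
        parser.Parse(document, True)
    except expat.ExpatError as err:
        return expat.ErrorString(err.code)
    return "ok"


if __name__ == "__main__":
    print(parse(NON_STANDALONE_DOC, allow_non_standalone=True))
    print(parse(NON_STANDALONE_DOC, allow_non_standalone=False))
```

A real application would more likely log a warning or hand the document to a validating processor inside the handler, rather than return a constant.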

From Matthew.Sergeant at eml.ericsson.se Tue Apr 27 12:43:14 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:11:31 2004
Subject: Apache Charset sniffer
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1814@eukbant101.ericsson.se>

I was reading the XML spec on the way to work today (sometimes traffic jams combined with a PalmV are a real blessing :)) and realised that my Apache XML charset sniffer might be in error. By default it returns application/xml as the MIME type, but it returns it as:

    application/xml; charset=xyz

Which now, according to everything I know about HTTP and now the XML spec, seems wrong. If I'm returning application/xml I shouldn't return a charset at all - that's up to the end application. I should only return a charset if the user wants to return text/xml.

Is that the consensus on this list? Does returning a charset for application/xml do any harm? Does it do any good?

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++
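
For readers who have not seen such a sniffer, here is a rough sketch of the idea in Python. It is not the Apache module discussed above. The function names, the regular expression, and the with_charset switch are all assumptions made for illustration, and it only handles a byte order mark or an ASCII-compatible XML declaration.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the entity (only works for ASCII-compatible encodings).
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_charset(head: bytes) -> str:
    """Guess the charset from the first bytes of an XML entity."""
    if head.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if head.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    match = _ENCODING_DECL.match(head)
    # With no byte order mark and no declaration, XML defaults to UTF-8.
    return match.group(1).decode("ascii").lower() if match else "utf-8"


def content_type(head: bytes, media_type="application/xml", with_charset=True) -> str:
    """Build a Content-Type value, with or without the charset parameter."""
    if not with_charset:
        return media_type
    return f"{media_type}; charset={sniff_charset(head)}"


if __name__ == "__main__":
    sample = b'<?xml version="1.0" encoding="ISO-8859-1"?><doc/>'
    print(content_type(sample))
    print(content_type(sample, with_charset=False))
```

The with_charset flag is there only because the question of whether application/xml should carry a charset is exactly what the thread is debating.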

From ksall at cen.com Tue Apr 27 15:10:28 1999
From: ksall at cen.com (Sall, Ken)
Date: Mon Jun 7 17:11:31 2004
Subject: IE5 and Amaya: Different Entities Results
Message-ID: <0C2275F991F7D2119DC700A0C96F64B6015CCA@CEN1>

Forgive me if this has been covered in one of the many discussions about entities. However, I'm trying to use the GCA Paper XML DTD for Markup Technologies 99 with IE5 (3/18/99) and also Amaya. The DTD

    http://www.mulberrytech.com/MT99/mtxmldtd.html

references several entity files. My test file (below) tries to use entities from 3 sets, Greek, Technical, and Pub; this produces some curious results:

             Greek    Technical    Pub
    IE5      ok       fails        fails
    Amaya    ok       ok           fails

Is it that these browsers use built-in entities and completely ignore external entities?

-Ken Sall
WDVL and Century Computing Division of AppNet, Inc.

----------------------------------------------------------------
Ken Sall
Century Computing Division of AppNet, Inc.

Testing Entities Greek tests: Here is lambda: λ and capital Lambda: Λ and mu: μ and rho: ρ and delta: δ and psi: ψ.
Technical tests: forall: ∀ and isin (set membership): ∈ and rArr (implies): ⇒
Pub tests: dagger: † and star: ☆ and frac25: ⅖ and male: ♂ and female: ♀

From Paul.Langer at softwareag.com Tue Apr 27 16:48:59 1999
From: Paul.Langer at softwareag.com (Paul Langer)
Date: Mon Jun 7 17:11:31 2004
Subject: Apache Charset sniffer
Message-ID: <001401be90b3$2b90d3d0$eda1bd9d@pcpl.software-ag.de>

At Tuesday, April 27, 1999 12:53 PM Matthew Sergeant wrote:
> [snip]
> If I'm returning application/xml I shouldn't return a charset
>at all - that's up to the end application. I should only return a charset if
>the user wants to return text/xml.
>
>Is that the consensus on this list?

No.

Please see "XML Media Types" (RFC 2376, http://www.imc.org/rfc2376) and the related mailing list "ietf-xml-mime" (http://www.imc.org/ietf-xml-mime/).

RFC 2376 says:

"3.2 Application/xml Registration

MIME media type name: application
MIME subtype name: xml
Mandatory parameters: none
Optional parameters: charset

Although listed as an optional parameter, the use of the charset parameter is STRONGLY RECOMMENDED, ..."

I think "application/xml" and "text/xml" should be handled the same wherever possible. On the logical level, XML documents are always text, since they contain nothing but characters. "application/xml" is only necessary since "text/..." implies restrictions on the used encoding (not relevant for 8-bit clean protocols like HTTP).

See the discussion list mentioned above for other views on the purpose of XML media types (dispatching).

All the best,
Paul

-----------------------------------------------------------
Paul Langer       PL@softwareag.com
Software AG       Tel. +49-6151-92-1912
Uhlandstr. 12     Fax +49-6151-92-1613
D-64297 Darmstadt

From DuCharmR at moodys.com Tue Apr 27 18:33:07 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:11:31 2004
Subject: intercepting internal entities?
Message-ID: <84285D7CF8E9D2119B1100805FD40F9F2551E6@MDYNYCMSX1>

Here's my problem: let's say I have a document with &auml; in it somewhere and auml properly declared as an internal entity in the DTD along with the other 589 internal entities shown at http://www.oasis-open.org/cover/xml-ISOents.txt. My application will write to one file headed for a Windows app (in which case I want the &auml; mapped to byte 228), a Mac document (where I want it mapped to byte 138), a DOS one (byte 132), a web page ("&auml;") and a Bloomberg terminal ("a").

In SGML, we would declare auml as an SDATA entity and the conversion tools would let me trap the internal entity event and map it appropriately depending on the output format. Using Java and SAX, I don't see any equivalent event to trap. SAX's EntityResolver interface seems restricted to external entities. I suppose I could scan all characters before outputting them and map the ones that need it, but if there are potentially 590 that need mapping, this seems inefficient.

Is there a better way? Am I missing something?

thanks,

Bob DuCharme www.snee.com/bob
"The elements be kind to thee, and make thy spirits all of comfort!" Anthony and Cleopatra, III ii
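
One possible way to get the five outputs listed above is to let the parser expand the entity to a Unicode character and postpone the per-target mapping until output time. The sketch below is written in Python rather than Java/SAX. It assumes auml is declared as the character reference &#228;, and the render() helper and its target names are hypothetical.

```python
import html.entities
import unicodedata
import xml.etree.ElementTree as ET

# Assumption: auml is declared as a plain character reference, so the
# parser hands the application the Unicode character U+00E4.
DOC = '<!DOCTYPE doc [<!ENTITY auml "&#228;">]><doc>&auml;</doc>'


def render(text: str, target: str) -> bytes:
    """Encode parsed text for one (hypothetical) output target."""
    if target in ("cp1252", "mac_roman", "cp437"):  # Windows, Mac, DOS
        return text.encode(target)
    if target == "html":
        parts = []
        for ch in text:
            name = html.entities.codepoint2name.get(ord(ch))
            if ord(ch) < 128:
                parts.append(ch)
            elif name:
                parts.append(f"&{name};")       # named reference if known
            else:
                parts.append(f"&#{ord(ch)};")   # numeric fallback
        return "".join(parts).encode("ascii")
    if target == "ascii":
        # Decompose accented letters and drop the combining marks.
        return unicodedata.normalize("NFKD", text).encode("ascii", "ignore")
    raise ValueError(f"unknown target: {target}")


if __name__ == "__main__":
    text = ET.fromstring(DOC).text
    for target in ("cp1252", "mac_roman", "cp437", "html", "ascii"):
        print(target, list(render(text, target)))
```

This only covers characters that exist in Unicode and in the target code page; a character missing from a code page would raise an encoding error here rather than being mapped.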

From Tor at medicalpriority.com Tue Apr 27 18:36:51 1999
From: Tor at medicalpriority.com (Tor Langlo)
Date: Mon Jun 7 17:11:31 2004
Subject: How do you combine two XML documents and an XSL stylesheet?
Message-ID: <11476D10B5BFD211AF3A00E0290C8C510B1F42@MPC_3>

I am managing an emergency medical protocol (for 911) and am looking at the possibility of using XML as the native storage format for the protocol. This protocol is available in several languages so we have separated the "structure" of the protocol from the text.

Regards,
Tor

From simonstl at simonstl.com Tue Apr 27 18:59:43 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:11:31 2004
Subject: Two new bits
Message-ID: <199904271659.MAA22536@hesketh.net>

I've just posted two new pieces to my site.

The first is a final draft of "XML, Integration and the Smaller Developer", an article looking at what XML has to offer small developers and systems integrators.

http://www.simonstl.com/articles/xmlsmall.htm

The second is a set of slides I presented at the New York Object Developer's Group (http://www.objdev.org) last Monday.
Called "Java, XML, and a New World of Open Components", it looks at a number of key technologies for using XML in object-oriented development. For right now, it's available in PowerPoint and the badly bastardized HTML PowerPoint exports. When I have time, I hope to clean it up.

http://www.simonstl.com/articles/nycod/index.htm

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com

From roddey at us.ibm.com Tue Apr 27 20:23:23 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:31 2004
Subject: Announcing XML4C2, finally :-)
Message-ID: <87256760.0064CF20.00@d53mta03h.boulder.ibm.com>

Ok, now I can actually talk about it...

The new XML4C2 XML parser is now up on Alphaworks. This is its first public viewing, so bear that in mind; however, it's actually quite far along since it draws on a good bit of previous experience. It is a totally optimized for C++ implementation which just happens to quite closely mimic the public APIs of the Java version (the 2.x version I mean), making it pretty easy to move your experience between the two.

It has SAX and DOM APIs, as well as an internal event API out of the scanner which can be used as well if you really need very lossless information. It handles lots of encodings, using IBM's ICU subsystem. Like the Java version, flexibility and scalability have been favored over ultra-blazing speed. But it's still a quite reasonable performer.
Particularly in the e-bidness area it is far quicker for fast up/parser/down cycles needed by DB stored procedures, and other servery oriented processing of small transaction type documents.

This version only has support for file:// URLs, with either no host or 'localhost'. That will be fixed in an upcoming version. Right now it has pretty much the same license as the Java version and the source code is not available; however, keep an eye out over the next weeks for a potentially important change in that area.

So please check it out and provide comments to the address provided in the docs. We know that there are some problems, but we still want to hear any feedback since it's always possible that we don't know about some particular issue.

It's at:

www.alphaworks.ibm.com

and should be in the "new stuff" area for a while since it's just arrived there.

From david at megginson.com Tue Apr 27 22:21:34 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:31 2004
Subject: intercepting internal entities?
In-Reply-To: <84285D7CF8E9D2119B1100805FD40F9F2551E6@MDYNYCMSX1>
References: <84285D7CF8E9D2119B1100805FD40F9F2551E6@MDYNYCMSX1>
Message-ID: <14117.63479.233396.47281@localhost.localdomain>

DuCharme, Robert writes:

> Here's my problem: let's say I have a document with &auml; in it
> somewhere and auml properly declared as an internal entity in the
> DTD along with the other 589 internal entities shown at
> http://www.oasis-open.org/cover/xml-ISOents.txt.
> My application
> will write to one file headed for a Windows app (in which case I
> want the &auml; mapped to byte 228), a Mac document (where I want
> it mapped to byte 138) a DOS one (byte 132), a web page ("&auml;")
> and a Bloomberg terminal ("a").

This is a problem that came up during my discussion with Rick Jelliffe (at least, I think it was Rick) about the utility of maintaining XML and SGML as two separate standards. Rick rightly pointed out that this is one area where SGML can make life a little harder.

Actually, for this example, XML is sufficient -- since all of the characters are represented in Unicode, simply declare the following:

Your processing software will receive the character in UTF-16 if you're using Java (or Python?) and in UTF-8 if you're using Perl. When you write your output, simply re-encode the file into the appropriate encoding: Java and Perl both have support for this, and Python probably does as well.

The area where SDATA actually does give an advantage is when you're using characters that are not part of Unicode: my favourite example is differentiating six different graphs of the grapheme /a/ in a tenth-century English manuscript. In this case, you have to use PUA Unicode characters.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From l-arcini at uniandes.edu.co Wed Apr 28 01:13:39 1999
From: l-arcini at uniandes.edu.co (Fabio Arciniegas A.)
Date: Mon Jun 7 17:11:31 2004
Subject: ignore/include
Message-ID: <37264615.665C76AA@uniandes.edu.co>

Hi everyone.

While I haven't seen much of the include/ignore keywords in real life, it seems to me that they are a powerful resource, and I was wondering if there is some sort of standard preprocessing that can be applied to them. Let me expand on this.

Preprocessor directives (when using languages like C++) are sometimes regarded as ways of expressing "versions" of the code since they may actually change much of it depending on various factors like previously defined constants (you know, #ifdef ...).

I was wondering... is there some way to actually pre-process the DTD declaration or bind an include/ignore statement to a condition so we can have a similar behavior for DTDs?

Fabio

From lbchua at yahoo.com Wed Apr 28 03:02:29 1999
From: lbchua at yahoo.com (Ben Chua Leong Boon)
Date: Mon Jun 7 17:11:32 2004
Subject: Modifying XML document using XSL
Message-ID: <19990428011002.4248.rocketmail@web307.yahoomail.com>

Hi all,

I am just beginning to explore XSL. Based on what I understand so far, XSL is a specification of how an XML document can be formatted and displayed in a browser.

My question is: using XSL, is it possible to modify an XML document based on some input through the browser? For example, through an XSL and XML document, some text fields are displayed in the browser. When a user fills in some text in a text field and clicks a save button, the entered text is written back to the XML document. Is this possible? Need advice...

Thanks in advance.

Regards,
Ben Chua

From aroldan at homeloan.com Wed Apr 28 15:51:31 1999
From: aroldan at homeloan.com (Roldan, Alex)
Date: Mon Jun 7 17:11:32 2004
Subject: C++ XML Parsers
Message-ID: <1EA62F044EA5D111919500805F6FB03D01E25BF9@hslnt98jax.homeloan.com>

We are currently searching for a C++ Validating XML parser. We are currently evaluating the SP parser developed by James Clark. This parser contains a Generic and Native API. Unfortunately, there is no documentation for the Native API. Has anyone out there experimented using the Native API, and if so, what has been your experience. If anyone has any feedback on SP or suggestion for other C++ parsers( Not too many out there ) please let me know.

Thanks

From sjs at portal.com Wed Apr 28 16:22:42 1999
From: sjs at portal.com (Steve Schow)
Date: Mon Jun 7 17:11:32 2004
Subject: C++ XML Parsers
References: <1EA62F044EA5D111919500805F6FB03D01E25BF9@hslnt98jax.homeloan.com>
Message-ID: <37286AB4.D420D851@portal.com>

What's the difference between SP and expat? james Clark is the author of both. ??

-steve

"Roldan, Alex" wrote:

> We are currently searching for a C++ Validating XML parser. We are
> currently evaluating the SP parser developed by James Clark. This parser
> contains a Generic and Native API. Unfortunately, there is no documentation
> for the Native API. Has anyone out there experimented using the Native API,
> and if so, what has been your experience. If anyone has any feedback on SP
> or suggestion for other C++ parsers( Not too many out there ) please let me
> know.
>
> Thanks

--
-----------------------------
Steve Schow - Portal Software
sjs@portal.com
http://www.bstage.com/
-----------------------------

From sean at westcar.com Wed Apr 28 16:28:25 1999
From: sean at westcar.com (Sean Brown)
Date: Mon Jun 7 17:11:32 2004
Subject: C++ XML Parsers
In-Reply-To: <1EA62F044EA5D111919500805F6FB03D01E25BF9@hslnt98jax.homeloan.com>
Message-ID:

On Wed, 28 Apr 1999, Roldan, Alex wrote:

::or suggestion for other C++ parsers( Not too many out there ) please let me
::know.

Check alphaworks.ibm.com for their new xml4c2 C++ parser. Just came out yesterday...

-S

/*/-------------------------------------------------------------------------/*/
| Sean Brown               | "the highway is made out of lime jello       |
| Westcar Consulting Group |  and my honda is a barbequed oyster!"        |
|                          |                                              |
| mailto:sean@westcar.com  | http://www.westcar.com/sbrown/eCard.html     |
/*/-------------------------------------------------------------------------/*/

From Clark.Cooper at corporate.ge.com Wed Apr 28 16:39:53 1999
From: Clark.Cooper at corporate.ge.com (Cooper, Clark (CORP, Consultant))
Date: Mon Jun 7 17:11:32 2004
Subject: Version 2.23 of Perl module XML::Parser released
Message-ID: <014CB98EB81ED011B3E900805FE2D47A04F74B92@X01SCHCORPGE>

I've uploaded version 2.23 of XML::Parser to CPAN. The specific URL is:

http://www.perl.com/CPAN/modules/by-module/XML/

XML::Parser is a non-validating XML parser based on James Clark's expat library. This version has some bug fixes, performance enhancements, a new version of expat, and a new Expat method.

*** PLEASE NOTE ***
The is_defaulted method introduced in version 2.20 has been inactivated. The functionality it provided is now available in the specified_attr method of XML::Parser::Expat:

  specified_attr()
      When the start handler receives lists of attributes and values, the
      non-defaulted (i.e. explicitly specified) attributes occur in the list
      first. This method returns the number of specified items in the list
      (of the most recent call of the start tag handler.) So if this number
      is equal to the length of the list, there were no defaulted values.
      Otherwise the number points to the index of the first defaulted
      attribute name.

Here's the most recent part of the Changes file:

2.23 Mon Apr 26 21:30:28 EDT 1999
    - Fixed a bug in the ExpatNB class reported by Gabe Beged-Dov. The
      ErrorMessage attribute wasn't being initialized for ExpatNB. This
      should have been done in the Expat constructor.
    - Applied patch provided by Nathan Kurz to fix more perl stack
      manipulation errors in Expat.xs.
    - Applied another patch by Nathan to change perl_call_sv flag from
      G_DISCARD to G_VOID for callbacks, which helps performance.
    - Murata Makoto reported a problem on Win32 platforms that only showed
      up when UTF-16 was being used. The needed call to binmode was added
      to the parsefile methods.
    - Added documentation for release method that was added in release
      2.20 to Expat pod. (Point raised by )
    - Now using Version 19990425 of expat. No local patches.
    - Added specified_attr method and made ineffective the is_defaulted
      method.

Clark Cooper       Logic Technologies,Inc        cccooper@ltionline.com
(518) 388-7451     650 Franklin St., Suite 304   coopercc@netheaven.com
                   Schenectady, NY 12305

From richard at cogsci.ed.ac.uk Wed Apr 28 18:08:50 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:11:32 2004
Subject: Patterns and location paths
Message-ID: <199904281608.RAA18094@stevenson.cogsci.ed.ac.uk>

Looking at the latest XSL draft (http://www.w3.org/TR/1999/WD-xslt-19990421.html) it appears that

    foo//bar[5]

means different things depending on where it appears.

As a location path (see section 6.1), it selects a bar element that is a descendent of a foo child of the current node, and that is the fifth such element in document order.

As a pattern (see section 6.3), it matches a bar element that is a descendent of a foo element, and that is the fifth foo child of its parent.

That is, in one case it means fifth amongst all the matching nodes, and in the other it means fifth amongst its siblings.

Have I misunderstood something?

-- Richard

From paul at prescod.net Wed Apr 28 18:11:14 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:32 2004
Subject: C++ XML Parsers
References: <1EA62F044EA5D111919500805F6FB03D01E25BF9@hslnt98jax.homeloan.com> <37286AB4.D420D851@portal.com>
Message-ID: <37272AB8.FC550F4F@prescod.net>

Steve Schow wrote:
>
> What's the difference between SP and expat? james Clark is the author of both.

SP is validating and supports XML beyond SGML. Expat is non validating and faster.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Microsoft spokesman Ian Hatton admits that the Linux system would have performed better had it been tuned."
"Future press releases on the issue will clearly state that the research was sponsored by Microsoft."
http://www.itweb.co.za/sections/enterprise/1999/9904221410.asp

From david at megginson.com Wed Apr 28 23:36:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:32 2004
Subject: XMLNews e-mail links
Message-ID: <14119.32510.557170.921875@localhost.localdomain>

I'm happy to report that e-mail to XMLNews.org is finally working. If anyone has sent messages and had them bounce, please resend. For further information, please see the contact page:

http://xmlnews.org/contact.html

Thanks for your patience, and all the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From jiangcm at tc590.csc.neu.edu.cn Thu Apr 29 05:02:13 1999
From: jiangcm at tc590.csc.neu.edu.cn (jiangcm)
Date: Mon Jun 7 17:11:32 2004
Subject: EJB and XML/XSL
Message-ID: <9904290102.AA1239572@tc590.csc.neu.edu.cn>

Hi all:

I'm developing an e-commerce project, and I'd like to use EJB on the server side: retrieve the data from the DBMS, generate an XML document, and use XSL to translate the XML into a browser-readable document. I'd also like to use a servlet for client-server interaction.

But it's only a rough idea for now. Does anyone have good suggestions or ideas? I would be sincerely grateful. Thanks in advance.

From paul at prescod.net Thu Apr 29 05:51:07 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:11:32 2004
Subject: Modifying XML document using XSL
References: <19990428011002.4248.rocketmail@web307.yahoomail.com>
Message-ID: <3727D37E.671FC248@prescod.net>

Ben Chua Leong Boon wrote:
>
> Hi all,
>
> I am just beginning to explore on XSL. Based on
> what I understand so far, XSL is a specification on
> how a XML document can be formatted and displayed on a
> browser.
>
> My question is: using XSL, is it possible to
> perform modification to a XML document based on some
> inputs through the browser?

No, the XSL specification has no such feature. You could use JavaScript and the DOM to do it, however.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Microsoft spokesman Ian Hatton admits that the Linux system would have performed better had it been tuned."
"Future press releases on the issue will clearly state that the research was sponsored by Microsoft."
http://www.itweb.co.za/sections/enterprise/1999/9904221410.asp
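
The reply above suggests JavaScript in the browser. The DOM calls involved are the same in any binding, so here is a rough sketch using Python's xml.dom.minidom instead. The field element, its name attribute, and the set_field() helper are invented for illustration; the original question does not say what the document looks like.

```python
from xml.dom import minidom


def set_field(xml_text: str, field_name: str, new_value: str) -> str:
    """Replace the text of <field name="..."> and return the updated XML."""
    doc = minidom.parseString(xml_text)
    for field in doc.getElementsByTagName("field"):
        if field.getAttribute("name") == field_name:
            # Remove whatever the field currently contains ...
            while field.firstChild is not None:
                field.removeChild(field.firstChild)
            # ... and attach the text the user entered.
            field.appendChild(doc.createTextNode(new_value))
    return doc.documentElement.toxml()


if __name__ == "__main__":
    original = '<form><field name="city">Oslo</field></form>'
    print(set_field(original, "city", "Bergen"))
```

Saving the result back to wherever the document lives (a file, a server) is a separate step that the DOM itself does not cover.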

From ricko at allette.com.au Thu Apr 29 10:38:50 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:11:32 2004
Subject: Apache Charset sniffer
Message-ID: <001f01be9213$aa7598a0$14f96d8c@NT.JELLIFFE.COM.AU>

From: Paul Langer

>I think "application/xml" and "text/xml" should be handled the same whereever
>possible.

I disagree: application/xml should provide end-to-end integrity, text/xml should allow point-to-point conversions (newlines, transcoding-to-supersets).

Matthew should follow the RFC on everything it requires. If it doesn't require the use of charset on application/xml, and if Matthew thinks it is bad to do so (as I do), let him omit it.

Rick

From om at lgsi.co.in Thu Apr 29 10:54:01 1999
From: om at lgsi.co.in (Om Band)
Date: Mon Jun 7 17:11:32 2004
Subject: XLinks & URI ???
Message-ID: <005101be921d$a7b338c0$3601a8c0@lgsi.co.in>

Hi all,

I have tried XLinks in XML but it did not work for me. Does IE5 support XLinks? If not, what are the other browsers that support it? (Or does it require any declaration?)

The code I used was......

to buy the goods of your wish.

One other thing: what is this 'URI', and in what sense does it differ from a URL?

THANKS !
Regds....Om
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990429/bba318c3/attachment.htm

From ricko at allette.com.au Thu Apr 29 11:00:51 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:11:32 2004
Subject: ignore/include
Message-ID: <002b01be9216$bd95f5d0$14f96d8c@NT.JELLIFFE.COM.AU>

From: Fabio Arciniegas A.

>I was wondering... is there some way to actually pre-process the DTD
>declaration or bind an include/ignore statement to a condition so we can
>have a similar behavior for DTDs?.

The keywords INCLUDE and IGNORE can be sourced from a parameter entity.
Parameter entities can be sourced from a WWW resource by specifying a
SYSTEM identifier, a URL. The URL could be:

 * a static text/plain file with just the appropriate keyword in it
   (this file could be sent as part of the same multipart MIME
   transmission, or sit on a web server somewhere);
 * a CGI script which generates the keywords (i.e., a version control
   program or whatever; it could even be a URL which queries the process
   which has requested the XML document).

The ISO SGML committee looked at adding more cpp functionality to the
conditional section mechanism. But they were not so enthusiastic, because
three other technologies seemed more promising:

 * adopting more class-based systems,
 * adding some kind of name scoping to DTDs or DTD entities, and
 * using attribute and link names to allow validation of other kinds of
   structures than the simple element tree (either by using attribute
   names or values instead of the element name, or by traversing some
   kinds of links, for example to add stronger typing to IDREFs: this is
   the kind of thing that is done by "architectural validation").

Rick Jelliffe
From Michael.Kay at icl.com Thu Apr 29 12:59:23 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun 7 17:11:32 2004
Subject: Patterns and location paths
Message-ID: <93CB64052F94D211BC5D0010A80013310EB44F@WWMESS3.172.19.125.2>

> Richard Tobin [mailto:richard@cogsci.ed.ac.uk] wrote
> it appears that
>
>   foo//bar[5]
>
> means different things depending on where it appears.
>
> As a location path (see section 6.1), it selects a bar element that is
> a descendent of a foo child of the current node, and that is the fifth
> such element in document order.
>
> As a pattern (see section 6.3), it matches a bar element that is a
> descendent of a foo element, and that is the fifth foo child of its
> parent.
>

Actually, as a pattern it seems to mean different things depending which
explanation you read. Arguably the sentence below production [52] "A pattern
is defined to match a node..." is definitive, in which case by definition
the pattern means the same thing in both contexts. If this is true, however,
the "it is easy to understand" explanation that follows appears to be
incorrect.

Mike Kay
From alank at iol.ie Thu Apr 29 13:44:43 1999
From: alank at iol.ie (Alan Kennedy)
Date: Mon Jun 7 17:11:32 2004
Subject: Dinky little XML and XSL GIFs for web pages?
Message-ID: <3728460B.8DBDF0C4@iol.ie>

Folks,

As you know, there are little buttons you can get for web sites
that say things like

"Made with Cascading Style Sheets"

or

"W3C HTML 4.0 compliant"

and which are links to the definitions of the standards on the
W3C site.

I'm doing a couple of web sites with XML and XSL at the moment,
and I'd like to put on them some dinky little GIFs that say

"This site written in XML"

and

"This site created using XSL".

I think that something like this could be good for XML and XSL
(as long as the web sites are good ;-)

I tried my hand at creating some, but I'm genetically deficient
at graphic design.

Is this a good idea? Or a bad one?

Alan.

From mike at DataChannel.com Thu Apr 29 16:25:35 1999
From: mike at DataChannel.com (Mike Dierken)
Date: Mon Jun 7 17:11:32 2004
Subject: Dinky little XML and XSL GIFs for web pages?
Message-ID: <8EAE75D3D142D211A45200A0C99B602393570C@ZEUS>

How about 'Powered by XML'?
-----Original Message-----
From: Alan Kennedy [mailto:alank@iol.ie]
Sent: Thursday, April 29, 1999 4:44 AM
To: xsl-list@mulberrytech.com; xml-dev@ic.ac.uk
Subject: Dinky little XML and XSL GIFs for web pages?

Folks,

As you know, there are little buttons you can get for web sites
that say things like

"Made with Cascading Style Sheets"

or

"W3C HTML 4.0 compliant"

and which are links to the definitions of the standards on the
W3C site.

I'm doing a couple of web sites with XML and XSL at the moment,
and I'd like to put on them some dinky little GIFs that say

"This site written in XML"

and

"This site created using XSL".

I think that something like this could be good for XML and XSL
(as long as the web sites are good ;-)

I tried my hand at creating some, but I'm genetically deficient
at graphic design.

Is this a good idea? Or a bad one?

Alan.

From lisarein at finetuning.com Thu Apr 29 17:43:36 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun 7 17:11:33 2004
Subject: Dinky little XML and XSL GIFs for web pages?
References: <8EAE75D3D142D211A45200A0C99B602393570C@ZEUS>
Message-ID: <3728822A.407C142@finetuning.com>

i agree there should be an icon.
but it should be TEXT-BASED! :-)

let's keep things clean, shall we?

lisa

Mike Dierken wrote:
>
> How about 'Powered by XML'?
>
> -----Original Message-----
> From: Alan Kennedy [mailto:alank@iol.ie]
> Sent: Thursday, April 29, 1999 4:44 AM
> To: xsl-list@mulberrytech.com; xml-dev@ic.ac.uk
> Subject: Dinky little XML and XSL GIFs for web pages?
>
> Folks,
>
> As you know, there are little buttons you can get for web sites
> that say things like
>
> "Made with Cascading Style Sheets"
>
> or
>
> "W3C HTML 4.0 compliant"
>
> and which are links to the definitions of the standards on the
> W3C site.
>
> I'm doing a couple of web sites with XML and XSL at the moment,
> and I'd like to put on them some dinky little GIFs that say
>
> "This site written in XML"
>
> and
>
> "This site created using XSL".
>
> I think that something like this could be good for XML and XSL
> (as long as the web sites are good ;-)
>
> I tried my hand at creating some, but I'm genetically deficient
> at graphic design.
>
> Is this a good idea? Or a bad one?
>
> Alan.
From cowan at locke.ccil.org Thu Apr 29 18:21:32 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:33 2004
Subject: Apache Charset sniffer
References: <5F052F2A01FBD11184F00008C7A4A800022A1814@eukbant101.ericsson.se>
Message-ID: <372886DF.E083EDCE@locke.ccil.org>

Matthew Sergeant (EML) wrote:

> If I'm returning application/xml I shouldn't return a charset
> at all - that's up to the end application.

You *can* return a charset; if present, it is authoritative.
If absent, the information provided by the XML declaration
is authoritative. Thus saith RFC 2376.

> Does returning a charset for application/xml do any harm?

If it lies, it's bad.

> Does it do any
> good?

Perhaps, but I doubt it.

> Matt.
> --
> http://come.to/fastnet
> Perl on Win32, PerlScript, ASP, Database, XML
> GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
> !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

--
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
    You tollerday donsk? N. You tolkatiff scowegian? Nn.
    You spigotty anglease? Nnn. You phonio saxo? Nnnn.
        Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
From cowan at locke.ccil.org Thu Apr 29 18:22:53 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:11:33 2004
Subject: IE5 and Amaya: Different Entities Results
References: <0C2275F991F7D2119DC700A0C96F64B6015CCA@CEN1>
Message-ID: <37288732.628AE4BF@locke.ccil.org>

Sall, Ken wrote:

> Is it that these browsers use built-in entities and completely
> ignore external entities?

This is certainly true of IE; I can't speak for Amaya, but I'd bet on it.

--
John Cowan    http://www.ccil.org/~cowan    cowan@ccil.org
    You tollerday donsk? N. You tolkatiff scowegian? Nn.
    You spigotty anglease? Nnn. You phonio saxo? Nnnn.
        Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From roddey at us.ibm.com Thu Apr 29 19:52:50 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:11:33 2004
Subject: C++ XML Parsers
Message-ID: <87256762.0062194C.00@d53mta03h.boulder.ibm.com>

>We are currently searching for a C++ Validating XML parser. We are
>currently evaluating the SP parser developed by James Clark. This parser
>contains a Generic and Native API. Unfortunately, there is no documentation
>for the Native API.
>Has anyone out there experimented using the Native API,
>and if so, what has been your experience. If anyone has any feedback on SP
>or suggestion for other C++ parsers( Not too many out there ) please let me
>know.
>

In case someone hasn't already answered this...

I recently bore a child (sorry no video :-) called XML4C2, which is now
available in its first public viewing on Alphaworks.

www.alphworks.ibm.com

It is a fully C++ optimized design that shares the same basic public
interfaces as our XML4J2 Java XML parser. It should hopefully fulfill your
needs, though we'd certainly like to hear from you if it does not in some
way.

From DuCharmR at moodys.com Thu Apr 29 19:59:01 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:11:33 2004
Subject: intercepting internal entities?
Message-ID: <84285D7CF8E9D2119B1100805FD40F9F2551F9@MDYNYCMSX1>

>When you write your output, simply re-encode the file into the
>appropriate encoding

I figured it out in Java, using the list of encodings at
http://www.javasoft.com:80/products/jdk/1.1/docs/guide/intl/encoding.doc.html.
It's so cool, all the mapping work is already done!

Well, almost all. I couldn't find a mapping that let me output to 7-bit
ASCII (e.g. map auml, agrave, aacute, and acirc all to "a") but I can see
from the AnselInputStreamReader class that comes with Mike Kay's GedML
package (which demos his SAXON library) how to derive my own StreamReader
with my own mapping table. Look out umlauts.
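
The kind of mapping described above (auml, agrave, aacute and acirc all to "a") can be sketched in a few lines. This is only an illustration, written in Python rather than as the Java StreamReader subclass the message has in mind; the names `strip_accents` and `to_ascii` and the optional `overrides` table are assumptions invented for the example. It relies on Unicode canonical decomposition, which splits an accented letter into a base letter plus combining marks, and then drops the marks.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks so that accented letters fall back to their base letter."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_ascii(text, overrides=None):
    """Return a 7-bit ASCII rendering of text.

    overrides is a hypothetical per-project table (character -> replacement
    string) that is applied before the generic accent stripping.
    """
    if overrides:
        text = "".join(overrides.get(ch, ch) for ch in text)
    # Anything still outside ASCII after stripping is replaced with '?'.
    return strip_accents(text).encode("ascii", "replace").decode("ascii")


print(to_ascii("äàáâ"))
print(to_ascii("Nestlé"))
print(to_ascii("Straße", {"ß": "ss"}))
```

Characters with no decomposition, such as the sharp s, are not handled by the stripping step, which is why the sketch leaves room for an explicit table.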

Thanks David, and Mike,

Bob DuCharme          www.snee.com/bob
"The elements be kind to thee, and make thy spirits all of comfort!"
Anthony and Cleopatra, III ii

From gary.struthers at ia-us.com Thu Apr 29 20:35:37 1999
From: gary.struthers at ia-us.com (Gary Struthers)
Date: Mon Jun 7 17:11:33 2004
Subject: RPC to XML tools & patterns
Message-ID: <3728A6CB.29C3AC42@ia-us.com>

We are looking for an efficient, scalable way to load data from legacy
archives accessed by RPC into a JFC applet. Generating XML from the RPC
results is attractive. Are there already tools for this or must I write
an XML generator from scratch? Are there design patterns?

Thanks
Gary Struthers

From SMUENCH at us.oracle.com Thu Apr 29 21:03:55 1999
From: SMUENCH at us.oracle.com (Steve Muench)
Date: Mon Jun 7 17:11:33 2004
Subject: RPC to XML tools & patterns
Message-ID: <199904291902.MAA10711@mailsun2.us.oracle.com>

Gary,

there are things out there like:

  xml-rpc   http://www.xmlrpc.com
  wddx      http://www.wddx.org
  xp        http://www.thinlink.com/xp

which may be of interest to you.
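
For the XML-RPC option, a minimal sketch of turning an RPC result into XML and back. It assumes Python and its standard `xmlrpc.client` module, which the thread itself does not mention, and the record fields are invented for illustration.

```python
import xmlrpc.client

# Hypothetical result returned by a legacy RPC call.
record = {"case_id": 1042, "title": "Archive entry", "open": True}

# Serialize the result as an XML-RPC <methodResponse> document.
xml_text = xmlrpc.client.dumps((record,), methodresponse=True)
print(xml_text)

# The receiving side can decode the same XML back into native values.
params, method_name = xmlrpc.client.loads(xml_text)
```

The appeal of a fixed vocabulary like this is that the generator does not have to be written from scratch; the cost is that the XML describes generic structs and values rather than domain-specific elements.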
____________________________________________________________
Steve Muench, Consulting Product Manager & XML Evangelist
Java Business Objects Dev't Team - http://www.oracle.com/xml
-------------- next part --------------
An embedded message was scrubbed...
From: Gary Struthers
Subject: RPC to XML tools & patterns
Date: 29 Apr 99 11:36:59
Size: 2788
Url: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990429/58818cb3/attachment.eml

From jgarrett at navix.net Thu Apr 29 21:04:48 1999
From: jgarrett at navix.net (Jim Garrett)
Date: Mon Jun 7 17:11:34 2004
Subject: Dinky Buttons - attached gif proposals
Message-ID: <000601be926e$4c107f40$58c8c8c8@jgp400>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: x0015.GIF
Type: image/gif
Size: 73203 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990429/49f89d98/x0015.gif
-------------- next part --------------
A non-text attachment was scrubbed...
Name: x0016.GIF
Type: image/gif
Size: 3244 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990429/49f89d98/x0016.gif

From david at megginson.com Thu Apr 29 21:15:59 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:11:34 2004
Subject: intercepting internal entities?
In-Reply-To: <84285D7CF8E9D2119B1100805FD40F9F2551F9@MDYNYCMSX1>
References: <84285D7CF8E9D2119B1100805FD40F9F2551F9@MDYNYCMSX1>
Message-ID: <14120.44410.24149.948978@localhost.localdomain>

DuCharme, Robert writes:

 > Well, almost all. I couldn't find a mapping that let me output to
 > 7-bit ASCII (e.g. map auml, agrave, aacute, and acirc all to "a")
 > but I can see from the AnselInputStreamReader class that comes with
 > Mike Kay's GedML package (which demos his SAXON library) how to
 > derive my own StreamReader with my own mapping table. Look out
 > umlauts.

Actually, for German, a-umlaut should map to 'ae', not 'a' (and
o-umlaut to 'oe', and u-umlaut to 'ue'). Likewise, 'ß' should map to
'ss'. So, if you want to recode Goethe's line

  ... daß beide Männer recht haben möchten ...

into US-ASCII, you would need to put out

  ... dass beide Maenner recht haben moechten ...

The alternative, "das beide Manner recht haben mochten" looks very
wrong to me, though I'm not a native (or even a very good) German
speaker, and perhaps tastes have changed. It's interesting to note,
though, that the '¨' in old or middle high German was (I think)
originally just a tiny 'e' written above the vowel by the scribe.

This stuff is always harder than you expect. I don't know the rules
in Finnish for example, but I'm willing to bet that they handle
ASCII-fication completely differently. And, of course, once you start
transliterating non-Roman characters, the real fun begins.

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

From sean at westcar.com Thu Apr 29 22:04:32 1999
From: sean at westcar.com (Sean Brown)
Date: Mon Jun 7 17:11:34 2004
Subject: Dinky little XML and XSL GIFs for web pages?
In-Reply-To: <3728460B.8DBDF0C4@iol.ie>
Message-ID:

Hi all..

I created 2 xml logos in the spirit of this thread for you to consider.
They are both very lightweight and are exactly the same proportion as the
other w3c buttons referenced in the original post.

The images are located at:

http://www.javanet.com/~sbrown/xml/

There is a feedback form and mini-poll there as well.

-Sean

On Thu, 29 Apr 1999, Alan Kennedy wrote:

::As you know, there are little buttons you can get for web sites
::that say things like

/*/-------------------------------------------------------------------------/*/
| Sean Brown               | "the highway is made out of lime jello       |
| Westcar Consulting Group |  and my honda is a barbequed oyster!"        |
|                          |                                              |
| mailto:sean@westcar.com  | http://www.westcar.com/sbrown/eCard.html     |
/*/-------------------------------------------------------------------------/*/

From DuCharmR at moodys.com Thu Apr 29 22:14:51 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:11:34 2004
Subject: intercepting internal entities?
Message-ID: <84285D7CF8E9D2119B1100805FD40F9F2551FE@MDYNYCMSX1>

We're dealing with English-language documents in which occasional company
names and city names have foreign accents, but two names could come from
two different languages, so we can't generalize about mapping rules. For
example, one document could mention Lowenbräu and the next could mention
Nestlés, which we're better off mapping to "Lowenbrau" (as opposed to
"Lowenbraeu") and "Nestles" respectively for display on Bloomberg
terminals, etc.

tschuess,

Bob

> ----------
> From: David Megginson[SMTP:david@megginson.com]
> Sent: Thursday, April 29, 1999 3:15 PM
> To: 'xml-dev@ic.ac.uk'
> Subject: RE: intercepting internal entities?
>
> DuCharme, Robert writes:
>
> > Well, almost all. I couldn't find a mapping that let me output to
> > 7-bit ASCII (e.g.
map auml, agrave, aacute, and acirc all to "a")
> > but I can see from the AnselInputStreamReader class that comes with
> > Mike Kay's GedML package (which demos his SAXON library) how to
> > derive my own StreamReader with my own mapping table. Look out
> > umlauts.
>
> Actually, for German, a-umlaut should map to 'ae', not 'a' (and
> o-umlaut to 'oe', and u-umlaut to 'ue'). Likewise, 'ß' should map to
> 'ss'. So, if you want to recode Goethe's line
>
> ... daß beide Männer recht haben möchten ...
>
> into US-ASCII, you would need to put out
>
> ... dass beide Maenner recht haben moechten ...
>
> The alternative, "das beide Manner recht haben mochten" looks very
> wrong to me, though I'm not a native (or even a very good) German
> speaker, and perhaps tastes have changed. It's interesting to note,
> though, that the '¨' in old or middle high German was (I think)
> originally just a tiny 'e' written above the vowel by the scribe.
>
> This stuff is always harder than you expect. I don't know the rules
> in Finnish for example, but I'm willing to bet that they handle
> ASCII-fication completely differently. And, of course, once you start
> transliterating non-Roman characters, the real fun begins.
>
>
> All the best,
>
>
> David
>

From sean at westcar.com Thu Apr 29 22:33:04 1999
From: sean at westcar.com (Sean Brown)
Date: Mon Jun 7 17:11:34 2004
Subject: Dinky little XML and XSL GIFs for web pages?
In-Reply-To:
Message-ID:

Based on feedback, I made a new set without the exclamation/bang:

http://www.javanet.com/~sbrown/xml/

/*/-------------------------------------------------------------------------/*/
| Sean Brown               | "the highway is made out of lime jello       |
| Westcar Consulting Group |  and my honda is a barbequed oyster!"        |
|                          |                                              |
| mailto:sean@westcar.com  | http://www.westcar.com/sbrown/eCard.html     |
/*/-------------------------------------------------------------------------/*/

From eisen at pobox.com Thu Apr 29 23:23:57 1999
From: eisen at pobox.com (jeisen)
Date: Mon Jun 7 17:11:34 2004
Subject: Dinky little XML and XSL GIFs for web pages?
References:
Message-ID: <3728CF23.B18FF04A@pobox.com>

Sean Brown wrote:

> Hi all..
>
> I created 2 xml logos in the spirit of this thread for you to consider.
> They are both very lightweight and are exactly the same proportion as the
> other w3c buttons referenced in the original post.
>
> The images are located at:
>
> http://www.javanet.com/~sbrown/xml/
>
> There is a feedback form and mini-poll there as well.
>
> -Sean
>

This is nice eye candy, but I have a strong suggestion for all. Please DO
NOT create any materials whether graphics, documents, books, articles,
software or otherwise that furthers the confusion about XML declarations.
Section 2.8 of the XML spec is clear that the declaration is lower case,
i.e.

While the graphics are not XML declarations, they do imply that one should
use an uppercase XML as the root element of a document.
This may confuse new users who would be tempted to do this:

bar

Am I anal retentive? Yes I am. And so is XML when it comes to syntax. This
very thing also causes problems for new users of XML who are used to
forgiving HTML parsers.

Jonathan.

From smo at jst.com.au Fri Apr 30 04:20:44 1999
From: smo at jst.com.au (Steve Oldmeadow)
Date: Mon Jun 7 17:11:34 2004
Subject: IBM C++ parser/C++ Builder
Message-ID: <002901be92af$657d5fc0$0201a8c0@pikachu>

IBM have a C++ Builder application which presents a tree view of an XML
document that is not included with the XML4C distribution. It is provided
'as-is' and not supported. You can get it by e-mailing xml4c@us.ibm.com

The background on this is that I noticed a screen shot of this application
in the documentation and thought it may have been developed using C++
Builder because its icon was the default one used by C++ Builder. I posted
a message on the XML4C discussion forum asking if such a thing existed and
a couple of days later I get it e-mailed to me!!!!

Is anyone else totally impressed by IBM's commitment to XML and apparent
lack of greed?

Steve Oldmeadow
Justice Systems Technologies
From jjc at jclark.com Fri Apr 30 05:10:21 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:11:34 2004
Subject: Patterns and location paths
References: <93CB64052F94D211BC5D0010A80013310EB44F@WWMESS3.172.19.125.2>
Message-ID: <3729166F.C1BDAE33@jclark.com>

Kay Michael wrote:
>
> > Richard Tobin [mailto:richard@cogsci.ed.ac.uk] wrote
> > it appears that
> >
> >   foo//bar[5]
> >
> > means different things depending on where it appears.
> >
> > As a location path (see section 6.1), it selects a bar element that is
> > a descendent of a foo child of the current node, and that is the fifth
> > such element in document order.
> >
> > As a pattern (see section 6.3), it matches a bar element that is a
> > descendent of a foo element, and that is the fifth foo child of its
> > parent.
> >
> Actually, as a pattern it seems to mean different things depending which
> explanation you read. Arguably the sentence below production [52] "A pattern
> is defined to match a node..." is definitive, in which case by definition
> the pattern means the same thing in both contexts. If this is true, however,
> the "it is easy to understand" explanation that follows appears to be
> incorrect.

If there is a case where the definition and the following explanation
don't coincide, then that's a bug in the spec. I don't know of any such
cases. If you find one, please send a precise description to
xsl-editors@w3.org.

James
From jjc at jclark.com Fri Apr 30 05:10:49 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:11:34 2004
Subject: Patterns and location paths
References: <199904281608.RAA18094@stevenson.cogsci.ed.ac.uk>
Message-ID: <37291DAB.7C05102@jclark.com>

Richard Tobin wrote:

> Looking at the latest XSL draft
> (http://www.w3.org/TR/1999/WD-xslt-19990421.html)
> it appears that
>
>   foo//bar[5]
>
> means different things depending on where it appears.
>
> As a location path (see section 6.1), it selects a bar element that is
> a descendent of a foo child of the current node, and that is the fifth
> such element in document order.

That's not correct: foo//bar[5] selects any bar element that is a
descendant of a foo child of the current node and that is the fifth bar
child of its parent.

What exactly in the draft led you to think otherwise?

James

From alank at iol.ie Fri Apr 30 06:49:36 1999
From: alank at iol.ie (Alan Kennedy)
Date: Mon Jun 7 17:11:34 2004
Subject: XML/XSL Icons.
Message-ID: <3729322B.582840DD@iol.ie>

Since several people replied with suggested graphics for use as
Logos/Icons for XML and XSL, I thought it would be a good idea to
gather them all onto one page.
That page is at

http://www.iol.ie/%7ealank/xml/icons.htm

If there is enough interest, I will conduct a form-based poll to
determine the best one, at some unspecified date in the future.

Meantime, if anyone else has any icons to propose, feel free to send
them to me.

Alan.

From sean at westcar.com Fri Apr 30 07:14:43 1999
From: sean at westcar.com (Sean Brown)
Date: Mon Jun 7 17:11:34 2004
Subject: XML/XSL Icons.
Message-ID: <002a01be92c8$4f3a2e00$708c5ed1@javanet.com>

Alan,

http://www.javanet.com/~sbrown/xml/ is generating poll statistics. I
emailed Garrett privately to ask his permission to post his image....

I think we've got some duplication of effort. There obviously should be
one definitive place. I'm more than happy to post anyone's images or
host the voting programs...

The polling program is a perl-regex xml program I cobbled together this
afternoon.

In any case, I'll be sure to let you know of anything new.

-Sean
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Fri Apr 30 11:22:03 1999 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:11:34 2004 Subject: Patterns and location paths In-Reply-To: James Clark's message of Fri, 30 Apr 1999 10:04:11 +0700 Message-ID: <199904300921.KAA15946@stevenson.cogsci.ed.ac.uk> > That's not correct: foo//bar[5] selects any bar element that is a > descendant of a foo child of the current node and that is the fifth bar > child of its parent. Good, that's what I initially expected. > What exactly in the draft led you to think otherwise? My interpretation is as follows: "foo" selects the foo children of the current node For each of these, "//" selects its descendants "bar" filters these to select the bar descendants The predicate "[5]" is evaluated "with the complete list of nodes to be filtered as the context node list" (6.1.3) - ie, all the bar descendants of the current foo node. "5" is equivalent to "position()=5", and "position()" returns the position of the node in the context node list (6.2.2). So it selects the fifth of all the bar descendants. This is not what I said in my previous message - I was wrongly taking the context list to be all the bar descendants of all the foo children, rather than evaluating //bar[5] separately for each foo. But it's still not what you say above. And an example in section 6.1 seems to confirm my (revised) interpretation: /from-descendants(figure[position()=42]) selects the forty-second figure element in the document It doesn't say "selects any figure element that is the 42nd child of its parent". Am I still confused? 
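The two readings of foo//bar[n] discussed in this thread can be compared on a small document. What follows is only a minimal sketch in Python using the standard library's xml.etree.ElementTree. The sample document, the id values and both function names are made up for illustration, and neither function is an XSLT implementation. The first function follows the spec's expansion of // into from-descendants-or-self(node())/from-children(bar), so position is counted among the bar children of each parent; the second counts position among all bar descendants of a foo.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: one foo with bar elements under two parents.
DOC = (
    "<root><foo>"
    "<sec><bar id='a1'/><bar id='a2'/></sec>"
    "<sec><bar id='b1'/><bar id='b2'/><bar id='b3'/></sec>"
    "</foo></root>"
)


def nth_bar_child_of_parent(foo, n):
    """Reading 1: for foo and each of its descendants, take its n-th bar child."""
    selected = []
    for node in foo.iter():  # foo itself, then its descendants in document order
        bars = [child for child in node if child.tag == "bar"]
        if len(bars) >= n:
            selected.append(bars[n - 1])
    return selected


def nth_bar_descendant(foo, n):
    """Reading 2: take the n-th of all bar descendants of foo, in document order."""
    bars = list(foo.iter("bar"))
    return bars[n - 1:n]  # empty list when there are fewer than n


root = ET.fromstring(DOC)
for foo in root.findall("foo"):
    print([b.get("id") for b in nth_bar_child_of_parent(foo, 2)])
    print([b.get("id") for b in nth_bar_descendant(foo, 2)])
```

With n = 2 on this document, the first reading selects a2 and b2 (one per parent that has at least two bar children), while the second selects only a2.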
-- Richard

From richard at cogsci.ed.ac.uk Fri Apr 30 12:16:46 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:11:34 2004
Subject: Patterns and location paths
In-Reply-To: Richard Tobin's message of Fri, 30 Apr 1999 10:21:44 +0100 (BST)
Message-ID: <199904301016.LAA17939@stevenson.cogsci.ed.ac.uk>

Mike Kay has pointed out my error. I had thought that //bar expanded to

  from-descendants(bar)

but in fact it expands to

  from-descendants-or-self(node())/from-children(bar)

and this is indeed explicitly stated in the spec.

-- Richard

From eriblair at mediom.qc.ca Fri Apr 30 13:41:33 1999
From: eriblair at mediom.qc.ca (Éric Riblair)
Date: Mon Jun 7 17:11:35 2004
Subject: MSXML and IE5 ...
Message-ID: <004501be92fe$9c06ea80$1f9ccb84@grr.ulaval.ca>

Hi,

I use the MS applet way to parse the XML file, like this: ...

With some modifications it works on my machine (with IE5) ... but if I
try to view it on another computer with IE5 or an older version !!! ???

I remember that with the older way (e.g. the XML parser in Java V1.9 ...
outside the VM of IE ...) the files worked well.

Does anybody have an answer ...
Regards,

Éric Riblair, Agronome (eriblair@mediom.qc.ca)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990430/6054e7b6/attachment.htm

From eriblair at mediom.qc.ca Fri Apr 30 13:54:51 1999
From: eriblair at mediom.qc.ca (Éric Riblair)
Date: Mon Jun 7 17:11:35 2004
Subject: MSXML vs IE5 ...
Message-ID: <005f01be9300$76d75860$1f9ccb84@grr.ulaval.ca>

Hi,

I use the MS applet way to parse the XML file, like this: ...

With some modifications it works on my machine (with IE5) ... but if I
try to view it on another computer with IE5 or an older version, the
files don't work ???

I remember that with the older way (e.g. the XML parser in Java V1.9 ...
outside the VM of IE ...) the files worked well.

Does anybody have an answer ...

Regards,

Éric Riblair, Agronome (eriblair@mediom.qc.ca)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990430/a3376d09/attachment.htm

From mike at DataChannel.com Fri Apr 30 18:36:30 1999
From: mike at DataChannel.com (Mike Dierken)
Date: Mon Jun 7 17:11:35 2004
Subject: intercepting internal entities?
Message-ID: <8EAE75D3D142D211A45200A0C99B6023935759@ZEUS>

You may have already got an answer, but here is an article about
dynamically modifying entity references. It might be useful to you with
this or another entity reference problem.

http://xdev.datachannel.com/press/lounge.html

Mike
DataChannel

-----Original Message-----
From: DuCharme, Robert [mailto:DuCharmR@moodys.com]
Sent: Tuesday, April 27, 1999 9:38 AM
To: xml-dev@ic.ac.uk
Subject: intercepting internal entities?

Here's my problem: let's say I have a document with &auml; in it
somewhere and auml properly declared as an internal entity in the DTD
along with the other 589 internal entities shown at
http://www.oasis-open.org/cover/xml-ISOents.txt.
My application will write to one file headed for a Windows app (in which
case I want the &auml; mapped to byte 228), a Mac document (where I want
it mapped to byte 138), a DOS one (byte 132), a web page ("&auml;") and
a Bloomberg terminal ("a").

In SGML, we would declare auml as an SDATA entity and the conversion
tools would let me trap the internal entity event and map it
appropriately depending on the output format. Using Java and SAX, I
don't see any equivalent event to trap. SAX's EntityResolver interface
seems restricted to external entities. I suppose I could scan all
characters before outputting them and map the ones that need it, but if
there are potentially 590 that need mapping, this seems inefficient.

Is there a better way? Am I missing something?

thanks,

Bob DuCharme www.snee.com/bob
"The elements be kind to thee, and make thy spirits all of comfort!"
Anthony and Cleopatra, III ii

From DuCharmR at moodys.com Fri Apr 30 19:07:17 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:11:35 2004
Subject: xml-data DTD?
Message-ID: <84285D7CF8E9D2119B1100805FD40F9F255205@MDYNYCMSX1>

Does anyone know where I can find a working XML DTD for an xml-data
schema? I tried fixing the various problems in the one in Appendix B of
the W3C Note on XML-data (changing backwards single quotes to regular
ones, "

"The elements be kind to thee, and make thy spirits all of comfort!"
Anthony and Cleopatra, III ii
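On the earlier question in this digest about mapping auml differently per output format: once a parser has expanded the internal entity to the character itself, one possible approach is to defer the mapping to the output stage and let a character encoding do the per-target byte lookup. The sketch below is an assumption-laden illustration in Python rather than the Java/SAX setup from the question. The target names, the convert function and the choice of cp1252, mac_roman and cp437 as the Windows, Mac and DOS encodings are all assumptions, and the accent-stripping rule for the plain-ASCII target is only a guess at what a Bloomberg terminal would need.

```python
import unicodedata
from html.entities import codepoint2name

# Assumed encodings for the byte-oriented targets named in the question.
BYTE_TARGETS = {"windows": "cp1252", "mac": "mac_roman", "dos": "cp437"}


def to_web(text):
    """Replace non-ASCII characters that have an HTML entity name with &name;."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch)) if ord(ch) > 127 else None
        out.append(f"&{name};" if name else ch)
    return "".join(out)


def to_plain_ascii(text):
    """Drop combining marks after decomposition, so an accented letter loses its accent."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def convert(text, target):
    """Hypothetical dispatcher: bytes for byte targets, str for the others."""
    if target in BYTE_TARGETS:
        return text.encode(BYTE_TARGETS[target])
    if target == "web":
        return to_web(text)
    if target == "ascii":
        return to_plain_ascii(text)
    raise ValueError(f"unknown target: {target}")


for target in ("windows", "mac", "dos", "web", "ascii"):
    print(target, convert("\u00e4", target))
```

For the single character a-umlaut, the three byte targets yield bytes 228, 138 and 132 respectively, matching the values given in the question. Markup characters such as < and & are deliberately left alone by to_web in this sketch, so real web output would still need escaping.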