From andrewl at microsoft.com Thu Jan 1 00:14:45 1998
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 16:59:42 2004
Subject: JavaScript XML Parser
Message-ID: <7BB61B44F197D011892800805FD4F792024441E1@red-03-msg.dns.microsoft.com>

You might also be interested in looking at http://www.insidedhtml.com/xml/poetry/page2.htm.

--Andrew Layman
AndrewL@microsoft.com

> -----Original Message-----
> From: mike@datachannel.com [SMTP:mike@datachannel.com]
> Sent: Wednesday, December 31, 1997 2:41 PM
> To: 'Jeremie Miller'; xml-dev@ic.ac.uk
> Subject: JavaScript XML Parser
>
> Jeremie,
> Beautiful.
>
> The only two suggestions I have are:
> 1 Create DOM methods on the JavaScript objects. This way authors can use
> any parser without changing their scripts.
> 2 Don't go overboard with DTD/parameter entities/etc. handling. Full blown
> parsers already exist, and they can be instantiated as an and
> called from JavaScript. What is needed is a nice lightweight way to
> programmatically read simple XML.
>
> If you want to really rock the world, do these two things:
> 1 Write an XSL processor in JavaScript.
> 2 Figure out how to read from multiple URLs from JavaScript, without
> blowing away the current page.
>
> Mike D
> DataChannel

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From peter at ursus.demon.co.uk Thu Jan 1 10:33:31 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:42 2004
Subject: JavaScript XML Parser
In-Reply-To: <011501bd1634$991a25a0$2c01a8c0@jeremie.dbqglass.com>
Message-ID: <3.0.1.16.19980101113234.4dc706fa@pop3.demon.co.uk>

At 15:39 31/12/97 -0600, Jeremie Miller wrote:
>Well, I finally decided to take the plunge and learn XML. As a learning
>project, I decided to write a simple XML parser in JavaScript(ECMAScript).

Excellent. I think this is the first tool in ECMAScript reported on this list. ECMAScript is likely to play an important role in XML, since it is specifically mentioned in the current **draft** for XSL (the stylesheet specification). Experience will be very valuable. Is it too optimistic to think that it may be possible to build up a library of XML-E routines?

>As it exists now, it doesn't handle many of the more advanced parts of the
>spec(PI's, CDATA, etc...) and is only trying to be a read-only well-formed
>parser.

There is a real requirement for just such an application. Many of this year's XML documents will *not* have CDATA, NOTATION, PIs, DTDs, and entities. A question that we have to face soon is whether every parser/tool has to cater for everything in the spec. If not, how can such a parser gracefully handle a document beyond its capabilities? One way might be to have a heavier-weight tool lurking in the background, to be called only when help was needed.

>
>I feel its to a point where I can let others play with it and need some good
>feedback on it.
>But remember, I have not even read the XML recommendation
>more than a light glance-through, so the parser it fairly limited yet. The
>point of it is to take XML fragments and expose them as a parsed object-tree
>to other javascripts for manipulation/display. Its ~5k, its fast, and it
>works with any ECMAScript compliant browser(I hope).

These are excellent objectives. IMO it is critical that the XML community makes it easy for you and others to take this approach. In support of this, most of the "real" examples of XML use a subset of the functionality, and it would be interesting to see what percentage you can manage. If much XML becomes de facto DTD-less SGML, then there is a lot you don't have to worry about.

>
>It will get updated often as I have time to read the spec and to learn more
>about DTD's.
>Go play at:
>http://www.jeremie.com/xparse/
>
>I really need some constructive feedback about what it needs to do, the API,
>possible uses for it, etc...

It could be extremely valuable to include ECMAScript in the API discussions. We are currently addressing how to generate a "simple" API for XML. We have an offer of a language-independent IDL - can ECMAScript APIs be generated from such an API? (I expect the answer is yes.)

>
>Thanks! This list has been a very educational tool so far! As I learn
>more, you'll probably be seeing more of me :)

Excellent. IMO any *communal* resources that you can catalyse will be very valuable. Up till now I have been wary of JavaScript, but as ECMAScript becomes more robust and portable, communal XML-E resources will be very valuable. Possibly more valuable even than Java...

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From ak117 at freenet.carleton.ca Thu Jan 1 13:47:35 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:42 2004
Subject: Problems with whitespace and msxml
In-Reply-To: <199712312252.RAA55936@mail1y-int.prodigy.net>
References: <199712312252.RAA55936@mail1y-int.prodigy.net>
Message-ID: <199801011343.IAA00285@unready.microstar.com>

Alexander Hinds writes:

[on xml:space]

> Moreover, no matter what I set it to, I always get back whitespace
> in my tree, even without a mixed content model (for example, for
> element book, it's first sib is always whitespace).

> My question, basically is: how do I eliminate whitespace from my
> tree entirely? Or failing that how do I get the current value of
> xml-space in my ElementImpl subclass? It appears that nameXMLSPACE
> is private, not protected (why?) so a subclass can't really search
> it. But even when I change the visibility, it's always null
> anyway.

I have not used msxml recently, so I do not know what it does, but the PR is very clear that the 'xml:space' attribute is strictly informative (from 2.10, "White Space Handling"):

  An XML processor must always pass all characters in a document that are not markup through to the application. A validating XML processor must distinguish white space in element content from other non-markup characters and signal to the application that white space in element content is not significant.

  A special attribute named "xml:space" may be inserted in documents to signal an intention that the element to which this attribute applies requires all white space to be treated as significant by applications.

In other words, the value of xml:space should _not_ affect the information that msxml returns to your application; instead, it is up to your application to read the value, if present, and to take appropriate action. Msxml should return all whitespace, no matter what.

I have heard rumours that xml:space may some day be removed from the core XML spec and put into a separate "XML Conventions" spec -- that would be a very good idea.

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From peter at ursus.demon.co.uk Thu Jan 1 14:48:20 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:42 2004
Subject: Problems with whitespace and msxml
In-Reply-To: <199801011343.IAA00285@unready.microstar.com>
References: <199712312252.RAA55936@mail1y-int.prodigy.net> <199712312252.RAA55936@mail1y-int.prodigy.net>
Message-ID: <3.0.1.16.19980101154535.265fc1d4@pop3.demon.co.uk>

Whitespace has been (and I suspect will continue to be) a frequent topic on XML-DEV :-) It can be a confusing topic and long-term members of XML-DEV are sympathetic and helpful when it is raised.

(A) There is no simple one-groks-all solution to the problem. If there were, we should be using it :-)

(B) A lot of material about whitespace has been written on this list, including 5 paragraphs from David Durand.
You will find references to some of the discussion on XML-DEV jewels: (http://ala.vsms.nottingham.ac.uk/vsms/xml/jewels.html)

At 08:43 01/01/98 -0500, David Megginson wrote:
>Alexander Hinds writes:
>
>[on xml:space]
>
> > Moreover, no matter what I set it to, I always get back whitespace
> > in my tree, even without a mixed content model (for example, for
> > element book, it's first sib is always whitespace).
> > My question, basically is: how do I eliminate whitespace from my
> > tree entirely? Or failing that how do I get the current value of
    ^^^^^^^^^^^^^
By not including it in your document :-)

> > xml-space in my ElementImpl subclass? It appears that nameXMLSPACE

I have not managed to get msxml working yet, but assuming that you can retrieve attributes values, xml:space is a potential attribute for any element. The rules for its inheritance from root are given in the spec.

> > is private, not protected (why?) so a subclass can't really search
> > it. But even when I change the visibility, it's always null
> > anyway.
>
>I have not used msxml recently, so I do not know what it does, but the
>PR is very clear that the 'xml:space' attribute is strictly
>informative (from 2.10, "White Space Handling"):
>
>  An XML processor must always pass all characters in a document that
>  are not markup through to the application. A validating XML processor

I find the phrase "validating XML processor" a confusing one because it refers to a piece of software. Validation requires:

- enough information in the document to *allow* it to be validated (e.g. enough ELEMENT and ATTLISTs to cover all elements found in the document.)
- a decision that the document *should* be validated. This may come from:
  - the author (implicit in the inclusion of a DTD and some PIs)
  - the client software (e.g. it makes decisions as to when to validate)
  - the human user ("press the validate button").
- software sufficiently powerful to map the content of an element on to its contentSpec.

IOW the identification of ignorable whitespace (which is *mandatory* for a validating parser) depends on an unclear combination of the above.

>  must distinguish white space in element content from other non-markup
   ^^^^
It can only do this if the document allows it to...

>  characters and signal to the application that white space in element
>  content is not significant.
>
>  A special attribute named "xml:space" may be inserted in documents to
>  signal an intention that the element to which this attribute applies
>  requires all white space to be treated as significant by applications.
>
>In other words, the value of xml:space should _not_ affect the
>information that msxml returns to your application; instead, it is up
>to your application to read the value, if present, and to take
>appropriate action. Msxml should return all whitespace, no matter
>what.

And - assuming it calls itself a validating parser - *must* identify which of that whitespace is significant and signal that to the application.

>
>I have heard rumours that xml:space may some day be removed from the
>core XML spec and put into a separate "XML Conventions" spec -- that
>would be a very good idea.

We should be careful not to act on rumours on XML-DEV. There is a carefully controlled process which requires discipline from those wishing to use XML. Some of the deliberations are confidential (e.g. XML-SIG - and as a member of that I cannot confirm or deny any speculations about what is discussed there). XML relies on the community adhering to the spec as closely as they can - this in itself is not easy.

OTOH I have publicly made it clear that I think that conventions are going to be essential for the implementation of XML systems (and whitespace would be a strong candidate). This is why I have raised the idea of XDEV (an informal set of conventions aired on the list) and shall continue to pursue this.
IFF the XML process formally wishes to set up a conventions WG or similar I shall be very happy, but until they announce something like that we cannot and should not assume it.

P.

>
>All the best,
>
>David
>
>--
>David Megginson                 ak117@freenet.carleton.ca
>Microstar Software Ltd.         dmeggins@microstar.com
>      http://home.sprynet.com/sprynet/dmeggins/
>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From ak117 at freenet.carleton.ca Thu Jan 1 15:01:23 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:42 2004
Subject: SAX and whitespace (was Re: Problems with whitespace and msxml)
In-Reply-To: <3.0.1.16.19980101154535.265fc1d4@pop3.demon.co.uk>
References: <199712312252.RAA55936@mail1y-int.prodigy.net> <199801011343.IAA00285@unready.microstar.com> <3.0.1.16.19980101154535.265fc1d4@pop3.demon.co.uk>
Message-ID: <199801011457.JAA00793@unready.microstar.com>

> > An XML processor must always pass all characters in a document
> > that are not markup through to the application.
> > A validating
> > XML processor must distinguish white space in element content
> > from other non-markup

What the PR means to say here is that a DTD-driven XML parser has to treat whitespace in element content differently than whitespace in mixed content -- this, of course, has nothing to do with xml:space. If there is no DTD, then all element types are assumed to allow mixed content, so a DTD-driven XML parser ("validating XML processor") would report all whitespace as significant.

What should SAX do with ignorable whitespace?

1) Report it as a distinct event, like Ælfred does?
2) Treat it as regular character data?
3) Ignore it (as in regular SGML)?

(1) seems to be what the PR requires. Either (2) or (3) could cause strange results.

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From peter at ursus.demon.co.uk Thu Jan 1 17:08:43 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:42 2004
Subject: SAX and whitespace (was Re: Problems with whitespace and msxml)
In-Reply-To: <199801011457.JAA00793@unready.microstar.com>
References: <3.0.1.16.19980101154535.265fc1d4@pop3.demon.co.uk> <199712312252.RAA55936@mail1y-int.prodigy.net> <199801011343.IAA00285@unready.microstar.com> <3.0.1.16.19980101154535.265fc1d4@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980101180739.1e972b32@pop3.demon.co.uk>

[I think this discussion is another good reason why SAX is urgently needed]

At 09:57 01/01/98 -0500, David Megginson wrote:
> > > An XML processor must always pass all characters
> > > in a document
> > > that are not markup through to the application. A validating
> > > XML processor must distinguish white space in element content
> > > from other non-markup
>
>What the PR means to say here is that a DTD-driven XML parser has to
>treat whitespace in element content differently than whitespace in
>mixed content -- this, of course, has nothing to do with xml:space.
>If there is no DTD, then all element types are assumed to allow mixed
>content, so a DTD-driven XML parser ("validating XML processor") would
>report all whitespace as significant.

I would agree with this interpretation and prefer the phrase "DTD-driven XML parser (?processor?)". I interpret this to mean: "a processor which uses any DTD information given in the document, and which uses it to do as much validation as it and the document are capable of."

However, having read the spec more carefully, I am having great difficulty in deciding *where* it allows whitespace in element content. Take the document:

...

My reading of the spec suggests that this is an *invalid* document. Please show me where I have gone wrong...

FOO has declared element content [3.2.1]. "... elements of that type must contain only child elements ***(no character data)*** [my asterisks]..."

for BAR: [3.2] An element is valid if there is a declaration matching elementdecl where the Name matches the element type and ...
1. the declaration matches EMPTY and the element has ***no content***

the context of content is
[39] STag content ETag
and its definition is:
[43] (element | CharData | Reference | CDSect | PI | Comment)*

Again there is no place for whitespace. Therefore I cannot see where (apart from [2.10] which raises the whitespace question) whitespace can be defined as 'non-significant'. IOW whitespace ***in the content of an element*** is only formally allowed as CharData in mixed content, and in mixed content it must be significant.

I am *sure* I've missed something here as the WG has debated this for ages, but I can't see where.

>
>What should SAX do with ignorable whitespace?

Assuming that ignorable WS is found only in element content...

>
>1) Report it as a distinct event, like Ælfred does?
>2) Treat it as regular character data?
>3) Ignore it (as in regular SGML)?
>
>(1) seems to be what the PR requires. Either (2) or (3) could cause
>strange results.

(3) is forbidden - it has to be passed through. I think it has to be (2) and (1) simultaneously. IOW in an event mode you must report that whitespace (space, 3 tabs, one newline, 10 spaces) occurs "now"; in tree mode you report "I have made you an element/node consisting of PCDATA, all whitespace - it's up to you to keep/destroy it..."

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From digitome at iol.ie Thu Jan 1 20:13:23 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:59:42 2004
Subject: The word "valid" in XML
Message-ID: <199801012012.UAA08251@mail.iol.ie>

It is probably way too late to do anything about it but the word "valid" bugs me in XML. Principally because of what happens when you invert it.

A synonym for "valid" is "correct". But there is a big difference between "invalid XML" and "incorrect XML" due to the loading of the common word "valid".

As far as I know "valid" has no SGML genesis. Does anyone know where it came from?

"well formed" and "fully formed" or some such would make talking (and writing!)
about XML a lot easier:-)

Sean Mc Grath
sean at digitome dot com

From ahinds at poboxes.com Thu Jan 1 21:22:05 1998
From: ahinds at poboxes.com (Alexander Hinds)
Date: Mon Jun 7 16:59:42 2004
Subject: Problems with whitespace and msxml
In-Reply-To: <199801011343.IAA00285@unready.microstar.com>
Message-ID: <199801012116.QAA14498@mail1y-int.prodigy.net>

>
> In other words, the value of xml:space should _not_ affect the
> information that msxml returns to your application; instead, it is up
> to your application to read the value, if present, and to take
> appropriate action. Msxml should return all whitespace, no matter
> what.
>
> I have heard rumours that xml:space may some day be removed from the
> core XML spec and put into a separate "XML Conventions" spec -- that
> would be a very good idea.

David-

Thanks. Oddly enough, though, according to MS' docs:

--
Section 2.10 says that xml-space can be specified on any element controlling whether white space is preserved or normalized. The default is to normalize white space (which means unify all white space characters down to a single space). To preserve whitespace set xml-space to preserve, and this is inherited down the hierarchy. To switch back to the default, set xml-space to default
--

Well, no matter what I do it doesn't seem to do anything with the xml-space attribute. Moreover, it doesn't seem to actually set the attribute for any of my elements. For example, getAttribute(...) always returns null for xml-space.

From ak117 at freenet.carleton.ca Thu Jan 1 22:18:08 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:42 2004
Subject: The word "valid" in XML
In-Reply-To: <199801012012.UAA08251@mail.iol.ie>
References: <199801012012.UAA08251@mail.iol.ie>
Message-ID: <199801012213.RAA00839@unready.microstar.com>

Sean Mc Grath writes:

> As far as I know "valid" has no SGML genesis. Does anyone know
> where it came from?

An SGML parser can be "validating" or "non-validating" -- non-validating parsers are not required to report any errors, but they must still read the DATA and produce the same output as validating parsers.

XML does not allow conforming, non-validating parsers in the SGML sense (any parser without full error reporting is non-conforming), but the PR has used the term to mean something like "a DTD-driven parser with additional error-reporting requirements."

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From peter at ursus.demon.co.uk Thu Jan 1 22:21:27 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:42 2004
Subject: Problems with whitespace and msxml
In-Reply-To: <199801012116.QAA14498@mail1y-int.prodigy.net>
References: <199801011343.IAA00285@unready.microstar.com>
Message-ID: <3.0.1.16.19980101231455.376fe974@pop3.demon.co.uk>

At 13:18 01/01/98 -0800, Alexander Hinds wrote:
[...]
>Thanks. Oddly enough, though according to MS' docs:
>
>--
>Section 2.10 says that xml-space can be specified on any element controlling
>whether white space is preserved or normalized. The default is to normalize
>white space (which means unify all white space characters down to a single
>space). To preserve whitespace set xml-space to preserve, and this is
>inherited down the hierarchy. To switch back to the default, set xml-space
>to default

This is a grey area, and one where I feel the spec gives little guidance. The spec requires a **processor** (many of us see this as synonymous with *parser*) to behave in the way that DavidM has described earlier. There is nothing in the spec describing any whitespace normalisation for the content of elements [1]. If, therefore, msxml is acting wholly as a "processor" (a la spec) it would appear *not* to be an XML-compliant processor from what you have quoted above. If it is a combined processor/application, then it should not be used as a "parser" or "processor" unless it is possible to intercept the information at the level of "parser API".

I have been vociferous in wanting to develop conventions for this area, and this highlights the need for SAX and for conventions.
There will clearly be a demand for an "HTML-like" normalisation of whitespace, but there is no public move towards defining such a convention. The difficulties that we are having here will be amplified when there are dozens of parsers/applications with no agreed output.

P.

[1] *Attribute values* may be normalised if they are known not to be CDATA [3.3.3], but there is no extension to content of elements.

>
>--
>
>Well, no matter what I do it doesn't seem to do anything with the xml-space
>attribute. Moreover, it doesn't seem to actually set the attribute for any
>of my elements. For example, getAttribute(...) always returns null for
>xml-space.
>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From mecom-gmbh at mixx.de Fri Jan 2 12:20:27 1998
From: mecom-gmbh at mixx.de (james anderson)
Date: Mon Jun 7 16:59:42 2004
Subject: The word "valid" in XML
References: <199801012012.UAA08251@mail.iol.ie>
Message-ID: <34ACDB79.C852C580@mixx.de>

did anyone consider using "verifying" to describe the behaviour of a processor which determines conformance to a dtd?
it more accurately describes a process which has the goal of proving or ascertaining the status of a document, where the status includes "complying to the constraints expressed in the respective dtd" (or "valid", if one wishes), "not complying ..." (or "invalid", but it will be a source of confusion), and "unverified" (where "unvalidated" leaves much to be desired).

among its advantages, the various negations are less likely to be misunderstood as implying or connoting "incorrect XML". the connotation has the advantage, that "verity" implies some "external reality" against which the document's "validity" is "verified", whereas "validity" on its own implies a judgement on principle, and thus leads to the confusion with "invalid XML".

in any event, for this reader, that's an aspect of the proposal which i had to sort out in order to figure out what it actually meant...

Sean Mc Grath wrote:

> It is probably way too late to do anything about it but the word
> "valid" bugs me in XML. Principally because of what happens when you
> invert it.
>
> A synonym for "valid" is "correct". But there is a big difference
> between "invalid XML" and "incorrect XML" due to the loading of the
> common word "valid".
>

From Patrice.Bonhomme at loria.fr Fri Jan 2 16:22:08 1998
From: Patrice.Bonhomme at loria.fr (Patrice Bonhomme)
Date: Mon Jun 7 16:59:42 2004
Subject: XLL: NEXT/PREVIOUS vs. PSIBLING/FSIBLING
Message-ID: <199801021621.RAA19181@chimay.loria.fr>

Hi,

What is the difference between :
  NEXT and FSIBLING
  PREVIOUS and PSIBLING

Pat.

From elm at arbortext.com Fri Jan 2 17:01:20 1998
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 16:59:42 2004
Subject: XLL: NEXT/PREVIOUS vs. PSIBLING/FSIBLING
Message-ID: <3.0.32.19980102120301.00a569b0@village.doctools.com>

At 11:21 AM 1/2/98 -0500, Patrice Bonhomme wrote:
>
>Hi,
>
>What is the difference between :
> NEXT and FSIBLING
> PREVIOUS and PSIBLING

Hi-- We're working on improving the explanations of these keywords; expect to see new drafts of XLL soon. In the meantime...

PRECEDING/FOLLOWING (not PREVIOUS/NEXT) refers to the elements (or pseudo-elements) that start before/after the current one (the "location source"), anywhere in a document. One way to understand this is to imagine all the start-tags strung out on a line. Go left to get the PRECEDING elements of the location source, and go right to get the FOLLOWING elements.

PSIBLING/FSIBLING refers to the elements (or pseudo-elements) before/after the location source that share the *same parent* as the location source. So PSIBLING identifies a subset of all PRECEDING elements, and FSIBLING identifies a subset of all FOLLOWING elements.

I gave an XML/XLL tutorial at the SGML/XML '97 conference, and have put my PowerPoint slides up on our web site. (They're in .ppt 97 form.) You can find them at if you'd like to see the explanation I gave for how the Xpointer addressing keywords work.

Eve
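
The start-tags-on-a-line picture above can be tried out with a short sketch. This is only an illustration of the description in this message, not the XLL draft's formal definition: it assumes "before/after" means start-tag order, and it uses Python's standard xml.etree.ElementTree. The function names and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

def preceding(root, source):
    """Elements whose start-tag comes before the source's start-tag."""
    order = list(root.iter())  # iter() walks elements in start-tag order
    return order[:order.index(source)]

def following(root, source):
    """Elements whose start-tag comes after the source's start-tag."""
    order = list(root.iter())
    return order[order.index(source) + 1:]

def siblings(root, source):
    """Return (psibling, fsibling): same-parent elements before/after source."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    parent = parent_of.get(source)
    if parent is None:  # the root element has no siblings
        return [], []
    children = list(parent)
    i = children.index(source)
    return children[:i], children[i + 1:]

# Hypothetical sample document; <d> plays the role of the location source.
root = ET.fromstring("<a><b><c/></b><d/><e><f/></e></a>")
d = root.find("d")
psib, fsib = siblings(root, d)

print([e.tag for e in preceding(root, d)])  # ['a', 'b', 'c']
print([e.tag for e in following(root, d)])  # ['e', 'f']
print([e.tag for e in psib])                # ['b']
print([e.tag for e in fsib])                # ['e']
```

Under the start-tag assumption the parent <a> counts as PRECEDING, because its start-tag is to the left of <d>; PSIBLING and FSIBLING come out as subsets of PRECEDING and FOLLOWING, as described.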

From peter at ursus.demon.co.uk Fri Jan 2 17:04:31 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:43 2004
Subject: XLL: NEXT/PREVIOUS vs. PSIBLING/FSIBLING
In-Reply-To: <199801021621.RAA19181@chimay.loria.fr>
Message-ID: <3.0.1.16.19980102175913.08873a90@pop3.demon.co.uk>

At 17:21 02/01/98 +0100, Patrice Bonhomme wrote:
>
>Hi,
>
>What is the difference between :
> NEXT and FSIBLING
> PREVIOUS and PSIBLING

NEXT and PREVIOUS were keywords which became obsolete in the revision of 970731, and were effectively replaced by FSIBLING and PSIBLING. I *hope* I'm right on this, and haven't missed a later revision, because I have implemented it this way in JUMBO.

I don't know when the next public XLL revision is due - it's nearly 6 months old now. I am particularly keen to know the case of these and other keywords (e.g. attribute names and values), because changing them is a lot of effort :-) The repeal of case-insensitivity took place after 970731.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From clovett at microsoft.com Fri Jan 2 18:12:34 1998
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun 7 16:59:43 2004
Subject: Problems with whitespace and msxml
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C099FDD@red-msg-56.dns.microsoft.com>

Ok, sorry it took so long to respond. There are a number of problems I see in your example:

1) The xml-space values are now lowercase, so you should have:

    xml-space (default|preserve) "default"

Or, you can also use namespaces instead and have "xml:space". On the ATTLIST this says that it is valid to specify xml-space if you want to.

2) Next you must also specify xml-space="preserve" on the instance where you want whitespace preserved. So for example:

    ...

As for WHITESPACE nodes :- the other confusing issue here is that you are talking about WHITESPACE nodes in the tree. This is a whole different subject. These are ignorable whitespace nodes as defined in the XML DOM spec (see DOM Working Group on www.w3.org for details). These nodes are always returned in the tree and are "ignorable". The main reason for their existence is so that the result of Document.save() looks very similar to the original document that was loaded.

The only thing really affected by "preserve" is a PCDATA node. When in the scope of an element with the xml-space="preserve" attribute, the PCDATA node will not normalize whitespace.

Hope this helps.
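The effect of "preserve" on a PCDATA node, as described above, might be sketched like this in Java. XmlSpaceSketch and its methods are hypothetical names, not msxml calls, and the exact normalization msxml applies is not stated in this thread; the sketch assumes it means trimming the ends and collapsing each run of whitespace to one space.

```java
class XmlSpaceSketch {
    // Resolves the xml-space value in scope. The array holds the attribute value of each
    // enclosing element from outermost to innermost, with null where it is not specified.
    // The nearest specified value wins; with none specified the default applies.
    static boolean preserveInScope(String[] xmlSpaceFromOutermost) {
        for (int i = xmlSpaceFromOutermost.length - 1; i >= 0; i--) {
            String value = xmlSpaceFromOutermost[i];
            if (value != null) return value.equals("preserve");
        }
        return false;
    }

    // Assumed meaning of "normalize": trim the ends, collapse whitespace runs to one space.
    static String pcdata(String raw, boolean preserve) {
        if (preserve) return raw;
        return raw.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String raw = "  two\n   words ";
        System.out.println("[" + pcdata(raw, preserveInScope(new String[] { null, null })) + "]");
        System.out.println("[" + pcdata(raw, preserveInScope(new String[] { "preserve", null })) + "]");
    }
}
```

Note that this touches only text content. The ignorable whitespace nodes discussed above are a separate matter and are returned in the tree either way.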
> -----Original Message----- > From: Alexander Hinds [SMTP:ahinds@poboxes.com] > Sent: Wednesday, December 31, 1997 2:55 PM > To: xml-dev@ic.ac.uk > Subject: Problems with whitespace and msxml > > Forgive me if this has been discussed before, but I download the latest > msxml.tar.gz from Microsoft's web site (release notes dated Dec 4) and am > having a devil of a time with getting it to do the right thing with > whitespace. > > For one thing, despite what the docs say, it seems to insist on: > > xml-space (DEFAULT | FIXED) 'DEFAULT' > > > > instead of "default | preserve". > > Moreover, no matter what I set it to, I always get back whitespace in my > tree, even without a mixed content model (for example, for element book, > it's first sib is always whitespace). > > My question, basically is: how do I eliminate whitespace from my tree > entirely? Or failing that how do I get the current value of xml-space in > my > ElementImpl subclass? It appears that nameXMLSPACE is private, not > protected (why?) so a subclass can't really search it. But even when I > change the visibility, it's always null anyway. > > Any help or suggestions would be most appreciated. Thanks in advance. > > ---book DTD--- > > > > > > > > > > > name CDATA #REQUIRED > author CDATA #REQUIRED > xml-space (DEFAULT | FIXED) 'DEFAULT' > > > > name CDATA #REQUIRED > > > > ]> > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Jan 2 18:13:22 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:43 2004 Subject: SAX and whitespace (was Re: Problems with whitespace and msxml) Message-ID: <3.0.32.19980102101301.00970e60@pop.intergate.bc.ca> At 09:57 AM 01/01/98 -0500, David Megginson wrote: >What the PR means to say here is that a DTD-driven XML parser has to >treat whitespace in element content differently than whitespace in >mixed content -- this, of course, has nothing to do with xml:space. >If there is no DTD, then all element types are assumed to allow mixed >content, so a DTD-driven XML parser ("validating XML processor") would >report all whitespace as significant. > >What should SAX do with ignorable whitespace? SAX must pass all whitespace in the document, without exception, through to the application. This is the only remotely conformant behavior. The mark-insignificant-WS stuff only is possible in the case you're validating, and users of SAX surely do not expect a validating-processor underneath. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Fri Jan 2 19:40:15 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:59:43 2004 Subject: Validity question Message-ID: <199801021940.TAA27701@mail.iol.ie> Is this document VALID xml? 
!DOCTYPE foo [ !ELEMENT foo (bar+)> ]> There is a validity constraint in section 3.2 to the effect that it is not an error to have an element type mentioned in a content model that is not declared anywhere. But is it an error if the document proceeds to use the undeclared element type? msxml thinks it is valid. nsgmls does not. Opinions? Sean Mc Grath sean at digitome dot com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Fri Jan 2 19:43:05 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:59:43 2004 Subject: Validity question (repost: typos corrected!) Message-ID: <199801021942.TAA27866@mail.iol.ie> Is this document VALID xml? ]> There is a validity constraint in section 3.2 to the effect that it is not an error to have an element type mentioned in a content model that is not declared anywhere. But is it an error if the document proceeds to use the undeclared element type? msxml thinks it is valid. nsgmls does not. Opinions? Sean Mc Grath sean at digitome dot com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) Sean Mc Grath sean at digitome dot com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at ora.com Fri Jan 2 19:48:04 1998 From: crism at ora.com (Chris Maden) Date: Mon Jun 7 16:59:43 2004 Subject: Validity question In-Reply-To: <199801021940.TAA27701@mail.iol.ie> (message from Sean Mc Grath on Fri, 2 Jan 1998 19:40:07 GMT) Message-ID: <199801021952.OAA24513@geode.ora.com> [Sean McGrath] > Is this document VALID xml? > > > !DOCTYPE foo [ > !ELEMENT foo (bar+)> > ]> > > > > > There is a validity constraint in section 3.2 to the effect that it > is not an error to have an element type mentioned in a content model > that is not declared anywhere. But is it an error if the document > proceeds to use the undeclared element type? > > msxml thinks it is valid. nsgmls does not. > > Opinions? Opinions, nothing. Fact: PR-xml-971208 has, after production [46], "VC: Element Valid. An element is valid if there is a declaration matching elementdecl ([45]) where the Name matches the element type, and one of the following holds:..." Since there is no elementdecl whose name matches "bar", the element is invalid. (Personally, I think this VC belongs in 3.1, not 3.2, and have said so to the editors.) Tip 1: If a document is not valid SGML (post-WebSGML), it's probably not valid XML. Hunt around in the spec. If it is, the XML spec probably needs fixing. Tip 2: If nsgmls says that a document is not valid SGML, it probably is not. -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. 
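The first half of the "Element Valid" constraint quoted above (a matching declaration must exist) can be sketched in a few lines of Java. ElementValidSketch is a hypothetical helper, not part of nsgmls or msxml; it does not check content models, and because the instance markup did not survive in this archive it assumes the instance is a foo element containing a bar element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ElementValidSketch {
    // Returns the element types used in the instance that have no element declaration.
    // Any entry here makes the document invalid, whatever the content models allow.
    static List<String> undeclared(Set<String> declared, List<String> usedInInstance) {
        List<String> missing = new ArrayList<>();
        for (String name : usedInInstance) {
            if (!declared.contains(name) && !missing.contains(name)) missing.add(name);
        }
        return missing;
    }

    public static void main(String[] args) {
        // The DTD declares only foo; bar is merely mentioned in foo's content model.
        Set<String> declared = new HashSet<>(Arrays.asList("foo"));
        // Assumed instance: <foo> containing <bar>.
        List<String> used = Arrays.asList("foo", "bar");
        System.out.println("undeclared element types: " + undeclared(declared, used));
    }
}
```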
From ak117 at freenet.carleton.ca Fri Jan 2 20:46:48 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:43 2004
Subject: SAX and whitespace (was Re: Problems with whitespace and msxml)
In-Reply-To: <3.0.32.19980102101301.00970e60@pop.intergate.bc.ca>
References: <3.0.32.19980102101301.00970e60@pop.intergate.bc.ca>
Message-ID: <199801022042.PAA14862@unready.microstar.com>

Tim Bray writes:

> SAX must pass all whitespace in the document, without exception,
> through to the application. This is the only remotely conformant
> behavior. The mark-insignificant-WS stuff only is possible in
> the case you're validating, and users of SAX surely do not expect
> a validating-processor underneath. -Tim

I would like to avoid making any assumptions about what sort of parser is underneath. Ælfred, for example, does not validate, but it does read the content models and does distinguish ignorable whitespace. In fact, unless I'm mistaken, the four major Java-based XML parsers -- NXP, Lark, MSXML, and Ælfred -- _all_ use a DTD if present, and are all capable of distinguishing ignorable whitespace.

With that in mind, would it be conformant behaviour for SAX to report ignorable whitespace as regular character data when the parser underneath is using the DTD (and, say, supplying defaulted attribute values)?

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chris at surewould.com Fri Jan 2 21:08:33 1998 From: chris at surewould.com (Chris Lalos) Date: Mon Jun 7 16:59:43 2004 Subject: Newbie: Developing DB Markup language Message-ID: <34AD5795.1298@surewould.com> I'm very new to XML. I'm interested in writing a little mark-up language for DB-aware Web pages (I know, D-minus for originality). I'd be doing the processing via a Java servlet. I'm wondering if XML is a good idea for this or not. Any thoughts? By the way, the grammar and spelling on this list are excellent, as I suppose one should expect. My compliments. - F.W. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sat Jan 3 01:34:39 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:43 2004 Subject: SAX and whitespace (was Re: Problems with whitespace and msxml) Message-ID: <3.0.32.19980102173418.009abae0@pop.intergate.bc.ca> At 03:42 PM 02/01/98 -0500, David Megginson wrote: >With that in mind, would it be conformant behaviour for SAX to report >ignorable whitespace as regular character data when the parser >underneath is using the DTD (and, say, supplying defaulted attribute >values)? Lark, BTW, does *not* catch ignorable white space unless it is validating. 
Since it is perfectly OK to build SAX with such a processor, *if* we want to build ignorable white-space notification into SAX, it has to be out-of-band; i.e. white space is passed in the same way as all other content; with perhaps another boolean argument to the text() method (that what it's called now?) that if true, means this is ignorable white space. But I would oppose doing this in SAX; let's keep it simple for now. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Jan 3 02:48:48 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:43 2004 Subject: SAX and whitespace (was Re: Problems with whitespace and msxml) In-Reply-To: <3.0.32.19980102173418.009abae0@pop.intergate.bc.ca> References: <3.0.32.19980102173418.009abae0@pop.intergate.bc.ca> Message-ID: <199801030243.VAA00777@unready.microstar.com> Tim Bray writes: > Lark, BTW, does *not* catch ignorable white space unless it is > validating. Since it is perfectly OK to build SAX with such a > processor, *if* we want to build ignorable white-space notification > into SAX, it has to be out-of-band; i.e. white space is passed in > the same way as all other content; with perhaps another boolean argument > to the text() method (that what it's called now?) that if true, means > this is ignorable white space. Thank you for the reply, Tim. I would like to make certain, however, that I understand the behaviour that you're recommending. If a DTD-driven parser finds ignorable whitespace, and if we decide that SAX should not provide ignorable whitespace notification, then which of the following is the correct action? 
1) the parser should not report the whitespace; or

2) the parser should report the whitespace as regular character data.

From my reading of the PR, and from my understanding of your comments, you are recommending (2); in other words, given the following document:

  ]>
  <foo>
  <bar>one bar</bar>
  <bar>two bars</bar>
  </foo>

A DTD-driven parser would report something like the following events through SAX:

- start document
- start element: "foo"
- character data: "\n"
- start element: "bar"
- character data: "one bar"
- end element: "bar"
- character data: "\n"
- start element: "bar"
- character data: "two bars"
- end element: "bar"
- character data: "\n"
- end element: "foo"
- end document

In full SGML, you'd get something a little simpler, because the whitespace in element content would be discarded:

- start document
- start element: "foo"
- start element: "bar"
- character data: "one bar"
- end element: "bar"
- start element: "bar"
- character data: "two bars"
- end element: "bar"
- end element: "foo"
- end document

> But I would oppose doing this in SAX; let's keep it simple for now. -T.

Sounds reasonable.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From jeremie at netins.net Sat Jan 3 03:19:39 1998
From: jeremie at netins.net (Jeremie Miller)
Date: Mon Jun 7 16:59:43 2004
Subject: Update: JavaScript XML Parser
Message-ID: <01bd17f6$44e9ab40$2801a8c0@jeremie.dbqglass.com>

I just finished updating the parser to fix a few bugs and add some more features. It now parses CDATA, PI's, and Comments.
As suggested, I'll be reading the DOM spec and adding some compatibility to the API. I also looked at the XSL proposal and it looks like I can easily do some of the simpler parts of it.

The only drawback to using JavaScript has been the inability to directly access URLs; all of the input has to be pasted in via a textarea.

Thanks for the feedback so far, hopefully this can be put to some good use!

http://www.jeremie.com/xparse/

Also, I have LOTS of questions about the XML spec, but I am going to bite my tongue until I get a better grasp on it...

Jeremie Miller
jer@jeremie.com
http://www.jeremie.com/

From peter at ursus.demon.co.uk Sat Jan 3 09:37:20 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:43 2004
Subject: SAX and whitespace (was Re: Problems with whitespace and msxml)
In-Reply-To: <199801030243.VAA00777@unready.microstar.com>
References: <3.0.32.19980102173418.009abae0@pop.intergate.bc.ca> <3.0.32.19980102173418.009abae0@pop.intergate.bc.ca>
Message-ID: <3.0.1.16.19980103103124.3c1741ec@pop3.demon.co.uk>

At 21:43 02/01/98 -0500, David Megginson wrote:
>Tim Bray writes:
[...]
> > But I would oppose doing this in SAX; let's keep it simple for now. -T.

I strongly support the word 'simple'. I also re-emphasise the importance of SAX. I have now created a (rather ad hoc) installation of three parsers (AElfred, Lark, NXP) under JUMBO. The installation has been in alphabetical order :-), so most *recent* experience has been gained with AElfred. To a certain extent I'm waiting for the next releases of the other two in case they have significant changes.
A lot of the work has been general (how does a GUI interface to a parser) but the following may be relevant to SAX: - is it possible for an author to extend SAX with additional non-SAX calls? [In the same way as a C library may have the standard calls and some additional manufacturer-specific ones]. Thus if we agree that doNotation functionality is not part of SAX, could a core SAX interface be extended by a parser writer without breaking the SAX bit? [I assume yes, but I don't know about interface design]. - if it *is* possible to extend it in this way, can we make sure we get the core as simple as possible so we all agree on it? Parser writers can then add additional non-standard functionality [carefully documented, of course :-)] - there is enormous value for hackers like me to be able to find a simple core functionality and get that working rapidly. Then the additional features can be gradually brought in. - what is the position on error handling? In GUI applications like JUMBO it's important to let the user know what is happening, so I have trapped the AElfred errors. [I have used a subclass of Error rather than Exception, because I think that avoids me having to edit and recompile AElfred.] The information that AElfred reports (URL for entity, line number, textual description of error (expected/found)) is very useful and I have built a little GUI that brings up the document with the errors highlighted (dumbly, since TextArea doesn't allow me colours). [Note that errorhandling is a much debated concern in XML and I am NOT suggesting that SAX addresses this comprehensively or we shall be here for months :-)]. But SAX must report the error, and I think the entityUrl and the line number and a (non-standard) message are simple and sufficient at this stage. Whether a parser stops at the first error is outside our scope at present :-) I'm looking forward to it - more volunteers (we are NOT limited to Java, if you can work from GavinN's IDL) would be very welcome. P. 
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Jan 3 12:37:51 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:44 2004 Subject: SAX and whitespace (was Re: Problems with whitespace and msxml) In-Reply-To: <3.0.1.16.19980103103124.3c1741ec@pop3.demon.co.uk> References: <3.0.32.19980102173418.009abae0@pop.intergate.bc.ca> <199801030243.VAA00777@unready.microstar.com> <3.0.1.16.19980103103124.3c1741ec@pop3.demon.co.uk> Message-ID: <199801031233.HAA00346@unready.microstar.com> Peter Murray-Rust writes: > - is it possible for an author to extend SAX with additional non-SAX > calls? [In the same way as a C library may have the standard calls and some > additional manufacturer-specific ones]. Thus if we agree that doNotation > functionality is not part of SAX, could a core SAX interface be extended by > a parser writer without breaking the SAX bit? [I assume yes, but I don't > know about interface design]. If you're discussing the parser class, then it is free to implement any functionality beyond the interface; if you are discussing the application (call-back) class, then there are two choices: 1) it may implement another, tool-specific interface as well as SAX; or 2) it may implement an interface that extends the SAX interface (say, by adding lexical events like comments or by adding DTD events). 
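Choice (2) might look roughly like this in Java. Everything here is a hypothetical stand-in: CoreEvents plays the role of whatever the agreed SAX callback interface turns out to be, CommentEvents is an imagined tool-specific extension, and the two "parsers" simply fire canned events rather than parse anything.

```java
// Hypothetical stand-in for the core callback interface (not the actual SAX definition).
interface CoreEvents {
    void startElement(String name);
    void endElement(String name);
    void charData(char[] ch, int length);
}

// Hypothetical tool-specific extension that adds one lexical event.
interface CommentEvents extends CoreEvents {
    void comment(String text);
}

// A parser that knows only the core interface; it never calls comment().
class PlainParser {
    static void parse(CoreEvents app) {
        char[] text = "hi".toCharArray();
        app.startElement("foo");
        app.charData(text, text.length);
        app.endElement("foo");
    }
}

// A parser that knows the extension and uses it when the application supports it.
class CommentAwareParser {
    static void parse(CoreEvents app) {
        char[] text = "hi".toCharArray();
        app.startElement("foo");
        if (app instanceof CommentEvents) ((CommentEvents) app).comment("note");
        app.charData(text, text.length);
        app.endElement("foo");
    }
}

// An application class that implements the extended interface and records what it is told.
class Recorder implements CommentEvents {
    final StringBuilder log = new StringBuilder();

    public void startElement(String name) { log.append("start:").append(name).append(' '); }
    public void endElement(String name) { log.append("end:").append(name).append(' '); }
    public void charData(char[] ch, int length) { log.append("chars:").append(ch, 0, length).append(' '); }
    public void comment(String text) { log.append("comment:").append(text).append(' '); }
}

class ExtensionDemo {
    public static void main(String[] args) {
        Recorder withPlain = new Recorder();
        PlainParser.parse(withPlain);
        System.out.println(withPlain.log);

        Recorder withAware = new Recorder();
        CommentAwareParser.parse(withAware);
        System.out.println(withAware.log);
    }
}
```

The same Recorder works with both parsers; only the comment-aware one ever reaches comment().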
In either case, you could still use your application class with other XML parsers, but the additional methods would never be called.

> - if it *is* possible to extend it in this way, can we make sure we get
> the core as simple as possible so we all agree on it? Parser writers can
> then add additional non-standard functionality [carefully documented, of
> course :-)]
> - there is enormous value for hackers like me to be able to find a simple
> core functionality and get that working rapidly. Then the additional
> features can be gradually brought in.

True, but it's never quite so simple, because every parser writer would implement the additional functionality in a different way. We have to make certain that we have covered at least the core features, and that means that we have to agree on what the core features are (I'll follow up with a separate message).

> - what is the position on error handling? In GUI applications like JUMBO
> it's important to let the user know what is happening, so I have trapped
> the AElfred errors.

I think that we may want to distinguish fatal errors from warnings (although Ælfred doesn't currently do so). Normally, a warning would print a message to a log or to STDERR, while an error would throw a java.lang.Error or a java.lang.Exception.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Jan 3 12:56:05 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: towards a solution Message-ID: <199801031251.HAA00419@unready.microstar.com> We have had an interesting discussion about SAX ("simple API for XML"? I cannot remember) the past few weeks, and now it's time to get specific. I will be posting a series of separate messages, each containing a single SAX design question, and will look for a consensus on each one. Here are the design topics that I will be posting on over the weekend: 1) Document start and end. 2) External entity start and end. 3) External entity resolution. 4) Error reporting. 5) Whitespace handling. 6) Processing instructions. 7) Comments. 8) Doctype declaration. 9) Parser interface. 10) Naming and packaging. ASSUMPTIONS ----------- Before we start, I am assuming that we will all accept the following three events without further discussion, since no one objected last time: startElement (String name, java.util.Dictionary attributes) endElement (String name) charData (char ch[], int length) I am also assuming that we will provide not only a callback interface, but also an (optional) base class with stub methods that implementors can override as needed; that means that novice users will not have to implement all of SAX, even if we do end up with nine or ten methods. 
For example, if you're only interested in character data, you can try something like this:

  public class MyApplication extends XmlAppBase {

    // Print each run of character data as it arrives.
    public void charData (char ch[], int length)
    {
      System.out.println(new String(ch, 0, length));
    }

  }

There is no need for the programmer to worry about startElement() or endElement(), since they are already implemented as empty stubs in XmlAppBase:

  import java.util.Dictionary;

  // Base class with empty stubs; subclasses override only what they need.
  public class XmlAppBase implements XmlApplication {
    public void startElement (String name, Dictionary attributes) {}
    public void endElement (String name) {}
    public void charData (char ch[], int length) {}
    // [...]
  }

_Please_ keep this model in mind when you are commenting on the relative simplicity or complexity of SAX for users (as opposed to parser programmers) -- extra functionality is cheap, since it can be hidden away like this.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sat Jan 3 15:09:13 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:44 2004 Subject: SAX and whitespace (was Re: Problems with whitespace and msxml) In-Reply-To: <199801031233.HAA00346@unready.microstar.com> References: <3.0.1.16.19980103103124.3c1741ec@pop3.demon.co.uk> <3.0.32.19980102173418.009abae0@pop.intergate.bc.ca> <199801030243.VAA00777@unready.microstar.com> <3.0.1.16.19980103103124.3c1741ec@pop3.demon.co.uk> Message-ID: <3.0.1.16.19980103152321.3fdff9b0@pop3.demon.co.uk> At 07:33 03/01/98 -0500, David Megginson wrote: >Peter Murray-Rust writes: > > > - is it possible for an author to extend SAX with additional non-SAX > > calls? [In the same way as a C library may have the standard calls and [...] > >2) it may implement an interface that extends the SAX interface (say, > by adding lexical events like comments or by adding DTD events). > >In either case, you could still use your application class with other >XML parsers, but the additional methods would never be called. I assume this is what AElfred does at present - if so, I'm very happy with it. The way I have written things is that the Parser (AElfred) makes calls to JUMBO - these calls are not yet implemented for Lark/NXP. [...] simple > > core functionality and get that working rapidly. Then the additional > > features can be gradually brought in. > >True, but it's never quite so simple, because every parser writer >would implement the additional functionality in a different way. We I am quite happy - at this stage - to have to do something different for each Parser for the *non-core* features. 
If 50% of an interface is covered by SAX, the time taken to implement the rest is a lot less than if starting from scratch.

>have to make certain that we have covered at least the core features,
>and that means that we have to agree on what the core features are
>(I'll follow up with a separate message).

Yup - I'll reply.

>
> > - what is the position on error handling? In GUI applications like JUMBO
> > it's important to let the user know what is happening, so I have trapped
> > the AElfred errors.
>
>I think that we may want to distinguish fatal errors from warnings
>(although Ælfred doesn't currently do so). Normally, a warning would
>print a message to a log or to STDERR, while an error would throw a
>java.lang.Error or a java.lang.Exception.

I'm not terribly keen on STDERR since this can remain hidden in a Java console or a DOS window - and I trap all the AElfred calls I can get hold of. (I have written a general JumboError subclass of Error, where all the error information can be packed - different for each parser - and sent to JUMBO. Seems to work quite nicely). There are also errors which are not trappable in this way - for instance FileNotFoundException is not thrown through doError().

I believe there should only be one class of error message (rather like one class of Event in java.awt) with members like Object arg into which you can pack anything you like.

IMO error handling is going to evolve over the next year and some sort of symbiosis will be found between the parser writers (who "may" do some things and "must" do others) and the applications which can take benign or Draconian action on receiving those errors. What probably isn't much fun is if it's not easy to find out *what* the parser has done.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
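A single error class of the kind suggested above might be sketched as follows in Java. ParseProblem is a hypothetical name, not JumboError and not anything defined by SAX or AElfred; the fields follow the items mentioned in this thread (entity URL, line number, a free-form message, a fatal/warning distinction, and an Object arg for parser-specific extras), and the URL in the demo is made up.

```java
// Hypothetical single error class; the field choice is an assumption drawn from this thread.
class ParseProblem extends Error {
    final String entityUrl;   // entity in which the problem was found
    final int lineNumber;     // line within that entity
    final boolean fatal;      // true for an error, false for a warning
    final Object arg;         // anything parser-specific the reporter wants to attach

    ParseProblem(String message, String entityUrl, int lineNumber, boolean fatal, Object arg) {
        super(message);
        this.entityUrl = entityUrl;
        this.lineNumber = lineNumber;
        this.fatal = fatal;
        this.arg = arg;
    }

    // One-line summary suitable for a GUI message area or a log.
    String describe() {
        return (fatal ? "error" : "warning") + " at " + entityUrl + ":" + lineNumber + ": " + getMessage();
    }
}

class ProblemDemo {
    public static void main(String[] args) {
        try {
            // Made-up values standing in for what a parser would report.
            throw new ParseProblem("expected '>'", "file:doc.xml", 12, true, null);
        } catch (ParseProblem p) {
            // The application decides what to do; here it just shows the summary.
            System.out.println(p.describe());
        }
    }
}
```

Because the object carries its own location and severity, the application can show it in a GUI instead of relying on STDERR, and can choose for itself whether to stop on a fatal problem.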
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sat Jan 3 15:09:55 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: towards a solution In-Reply-To: <199801031251.HAA00419@unready.microstar.com> Message-ID: <3.0.1.16.19980103160221.2cefbaca@pop3.demon.co.uk> At 07:51 03/01/98 -0500, David Megginson wrote: >We have had an interesting discussion about SAX ("simple API for XML"? >I cannot remember) the past few weeks, and now it's time to get >specific. I will be posting a series of separate messages, each >containing a single SAX design question, and will look for a consensus >on each one. Here are the design topics that I will be posting on >over the weekend: David - this is brilliant. It is exactly the way that the XML-WG/SIG works - and that works very well. For the benefit of those who haven't read the public XML-SIG archives (used to be called WG, confusingly) it is one of the best decision-making processes I have come across anywhere, since it combines precision, democracy, adherence to deliverables, etc. Works like this: There is an editorial group (originally Editorial Review Board (ERB), now XML-WG), composed of W3C member representatives and a few invited experts. Currently 16 - see the PR. It has deliverables set by the W3C processes. They invite a wider group (XML-SIG) - about 100 I think - which helps them like this. The WG propose a draft spec (or part of a spec) and have done this for XML, XLL and XSL. Sometimes they will ask for general comments, other times they will ask quite specific questions - exactly as DavidM is doing. 
The SIG members may then respond in whatever way they feel is appropriate - sometimes discussing details, sometimes arguing about strategy. On occasion the chair (Jon Bosak) may decide that the discussion is out of scope (off-topic). The standard of discipline is very high, and members invariably respect this. The WG "meets" (phone conference) every week or so and makes formal decisions. Votes can be taken and recorded. These are reported back to the SIG as appropriate - sometimes continued discussion is requested. The WG does a *great* deal of work - various members at SGML 97 Europe recounted that XML had taken over their life. I believe that different specialities are assigned to different individuals on occasion. I have written this at length because I would like to see if XML-DEV could "borrow" some of this process. [There are of course many differences - the membership of XML-DEV is self-selecting (but still of very great technical quality and list discipline). Of course there is a lot of overlap with the SIG and WG.] In the current situation I'd suggest that - if they can - DavidM and TimB (no more) form a mini-WG and act as DavidM has suggested. If this is too difficult (and of course *I* can't pay their phone bills), then DavidM should do this unilaterally. The miniWG should then solicit comments on the topics. Anyone should feel free to respond, but it should be with the aim of trying to help the miniWG make decisions. [Offers of help may be very valuable here.] At the end of the determined period (measured in days, not longer) the miniWG will present the proposal. At that stage I would expect the proposal to be largely finalised apart from some detailed corrections or clarifications. *** When the miniWG submit their request for comments, could they use a clear and simple subject for each topic, and could respondents use precisely that***. e.g. SAX: DOCSTART/END. Also, let us stick closely to the goal of simplicity. > >1) Document start and end.
>2) External entity start and end. >3) External entity resolution. >4) Error reporting. >5) Whitespace handling. >6) Processing instructions. >7) Comments. >8) Doctype declaration. >9) Parser interface. >10) Naming and packaging. Looks fine to me. I'm not saying I'm *agreeing* with all these topics - but these are reasonable ones to ask questions about. Example: "Should SAX consider whitespace?" - I would guess some people would answer "No". If the miniWG takes the sense of the community, and marries it with the chance of achieving something, then some decisions may indeed end up as NO. If it makes sense to take these topics in some order, it may be useful to do so (the WG often does not issue everything at once.) For example, Naming and packaging might be asynchronous from whitespace. > > >ASSUMPTIONS >----------- > >Before we start, I am assuming that we will all accept the following >three events without further discussion, since no one objected last >time: > > startElement (String name, java.util.Dictionary attributes) > endElement (String name) > charData (char ch[], int length) Agreed. > >I am also assuming that we will provide not only a callback interface, >but also an (optional) base class with stub methods that implementors >can override as needed; that means that novice users will not have to >implement all of SAX, even if we do end up with nine or ten methods. >For example, if you're only interested in character data, you can try >something like this: > > public class MyApplication extends XmlAppBase { > > public void charData (char ch[], int length) > { > System.out.println(new String(ch, 0, length)); > } > > } > >There is no need for the programmer to worry about startElement() or >endElement(), since they are already implemented as empty stubs in >XmlAppBase: I have found this very useful with AElfred. Does this have implications for other languages (e.g. tcl is not OO - at least not when I used to use it.)? [...]
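The base-class idea quoted above can be made concrete with a small, self-contained sketch. It is only an illustration under stated assumptions: the interface name XmlApplication, the class TextCollector and the helper fireSampleEvents() are hypothetical and appear nowhere in the proposal; only the three event signatures and the name XmlAppBase come from the quoted message. No real parser is involved; the events are fired by hand.

```java
import java.util.Dictionary;
import java.util.Hashtable;

class SaxBaseClassSketch {

    /** The three core events from the proposal; the interface name is an assumption. */
    interface XmlApplication {
        void startElement(String name, Dictionary<String, String> attributes);
        void endElement(String name);
        void charData(char[] ch, int length);
    }

    /** Optional base class: every handler is an empty stub. */
    static class XmlAppBase implements XmlApplication {
        public void startElement(String name, Dictionary<String, String> attributes) { }
        public void endElement(String name) { }
        public void charData(char[] ch, int length) { }
    }

    /** Hypothetical application that only cares about character data. */
    static class TextCollector extends XmlAppBase {
        final StringBuilder text = new StringBuilder();

        @Override
        public void charData(char[] ch, int length) {
            text.append(ch, 0, length);
        }
    }

    /** Stand-in for a parser: fires the events for <greeting>Hello</greeting> by hand. */
    static void fireSampleEvents(XmlApplication app) {
        char[] chars = "Hello".toCharArray();
        app.startElement("greeting", new Hashtable<String, String>());
        app.charData(chars, chars.length);
        app.endElement("greeting");
    }

    public static void main(String[] args) {
        TextCollector app = new TextCollector();
        fireSampleEvents(app);
        System.out.println(app.text);
    }
}
```

TextCollector overrides one method and inherits the other two stubs, which is the point being made: adding further events to the interface later would not force any change on an application like this one.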
>_Please_ keep this model in mind when you are commenting on the >relative simplicity or complexity of SAX for users (as opposed to >parser programmers) -- extra functionality is cheap, since it can be >hidden away like this. > Agreed - the ability to "bring in " new functionality as one's application evolves is very useful. Having to do everything at the start can be tough - there is not only more code to implement, but unnecessary concepts have to be learned. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Jan 3 15:10:13 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: External Entity Start and End (question 2 of 10) Message-ID: <199801031504.KAA00633@unready.microstar.com> [SAX is a proposal for a simple, event-based XML API, using callbacks. This is one in a series of ten design questions that we need to answer to implement the API.] Should SAX generate events for the start and end of an external entity? public void startEntity (String ename, String systemID); public void endEntity (String ename); CON --- - these will make SAX slightly larger; - these belong to the physical structure of the XML document, while SAX concentrates on logical structure; - these will be difficult concepts for non-XML people to understand; - a SAX application should always see a single document, regardless of the entity structure. 
PRO --- - if an application is derived from XmlAppBase, it can simply ignore these events if it doesn't need them; - many XML-related proposals use URLs or URIs in attribute values and in processing instructions; if relative URIs need to be resolved against the location of the external entity rather than the location of the document entity, then the user will have to have a way to determine the location of the current external entity. MY RECOMMENDATION ----------------- Qualified yes. It may be that without these events, SAX is incapable of supporting XLL, XSL, or architectural forms; if this turns out to be the case, then SAX will not be useful for a large range of XML applications. The key is whether, when I have something like link to foo an application is supposed to resolve "foo.xml" against the URI of the current entity, or the URI of the document entity. OTHER CONSIDERATIONS -------------------- Are public IDs important enough to be included? public void startExternalEntity (String ename, String publicID, String systemID) We could simplify things further for most users if the XmlAppBase class implemented final versions of these handlers and maintained its own entity stack, providing an additional getCurrentEntity() query method (not part of the interface). All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Jan 3 15:10:28 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: Document Start and End (question 1 of 10) Message-ID: <199801031447.JAA00628@unready.microstar.com> [SAX is a proposal for a simple, event-based XML API, using callbacks. This is one in a series of ten design questions that we need to answer to implement the API.] In addition to the core events, should SAX have additional callbacks for the start and end of a document? public startDocument () public endDocument () CON --- - two additional methods will make the API slightly larger; - the start of an XML document can be inferred from the first event (element or possibly PI or comment), and any start/end handling could be done outside of the callback interface (say, before and after running the parse); however, the end cannot be easily inferred within an event handler. PRO --- - avoids an "if/then" test in handlers to see if they are the first events; - provides an easy and obvious place for initialization and cleanup, if startDocument() is always the first event reported and endDocument() is always the last; - these are very easy to implement in a separate SAX front end, without modifying any of the core parsers. MY RECOMMENDATION ----------------- Yes. These events will make code using SAX much simpler and cleaner, and they come at a very low cost. Furthermore, if a SAX-based application extends the XmlAppBase base class, then it can simply ignore these if they are not needed. OTHER CONSIDERATIONS -------------------- Should either of the events take arguments? 
For example, the startDocument event handler could take Strings giving the public ID (if any) and URI of the document, and the endDocument event handler could take integers giving the number of errors and warnings: public void startDocument (String publicID, String systemID); public void endDocument (int errors, int warnings); The latter, however, is very easy to track, and the former can be supplied to the constructor when the SAX event-handler is created, so both are redundant (if slightly convenient). All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Jan 3 15:27:06 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: External Entity Resolution (question 3 of 10) Message-ID: <199801031522.KAA00737@unready.microstar.com> [SAX is a proposal for a simple, event-based XML API, using callbacks. This is one in a series of ten design questions that we need to answer to implement the API.] Should SAX provide a handler for resolving external entities? public String resolveEntity (String ename, String publicID, String systemID); The handler would receive the system identifier, optionally accompanied by an entity name and public identifier if available, and would return the system identifier (URI) that the parser should use for the entity, or null if the entity should be skipped. 
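As a rough illustration of how such a handler might be used, here is a minimal sketch. Everything except the resolveEntity() signature and its "return null to skip" convention is an assumption made for the example: the class names, the public identifier, the URIs and the blocked-prefix rule are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class EntityResolutionSketch {

    /** Simplest possible handler: hand the system ID back unchanged. */
    static class DefaultResolver {
        public String resolveEntity(String ename, String publicID, String systemID) {
            return systemID;
        }
    }

    /** Hypothetical override: redirect known public IDs to local copies, skip a blocked prefix. */
    static class LocalCatalogResolver extends DefaultResolver {
        private final Map<String, String> localCopies = new HashMap<>();
        private final String blockedPrefix;

        LocalCatalogResolver(String blockedPrefix) {
            this.blockedPrefix = blockedPrefix;
        }

        void addLocalCopy(String publicID, String localURI) {
            localCopies.put(publicID, localURI);
        }

        @Override
        public String resolveEntity(String ename, String publicID, String systemID) {
            if (publicID != null && localCopies.containsKey(publicID)) {
                return localCopies.get(publicID);   // redirect to the local copy
            }
            if (systemID != null && systemID.startsWith(blockedPrefix)) {
                return null;                        // null asks the parser to skip the entity
            }
            return super.resolveEntity(ename, publicID, systemID);
        }
    }

    public static void main(String[] args) {
        LocalCatalogResolver r = new LocalCatalogResolver("http://outside.example/");
        r.addLocalCopy("-//EXAMPLE//DTD Memo//EN", "file:/dtds/memo.dtd");

        System.out.println(r.resolveEntity("memo", "-//EXAMPLE//DTD Memo//EN",
                                           "http://outside.example/memo.dtd"));
        System.out.println(r.resolveEntity("logo", null, "http://outside.example/logo.xml"));
        System.out.println(r.resolveEntity("chap1", null, "chap1.xml"));
    }
}
```

The lookup table is consulted before the blocked-prefix check, so an entity with a known public ID is still served from the local copy even when its system ID points somewhere the application would otherwise refuse to fetch.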
CON --- - makes the API slightly larger; - requires modification to existing parsers -- cannot be supported simply with a new front-end; - deals with physical rather than logical structure; - a local proxy server could implement much of the same functionality; - could be confusing for non-specialists. PRO --- - could be implemented in XmlAppBase, so that most users could simply ignore it (the default implementation would always return the systemID argument unmodified); - allows redirection of URIs to local copies, if available (and to other protocols); - allows resolution of public identifiers using a table lookup; - allows a user to skip external entities if they are not necessary or desirable (if, for instance, they are outside of the company LAN, or if they require payment for use). MY RECOMMENDATION ----------------- Undecided: no recommendation. If there is a default implementation hidden in XmlAppBase, then most users can simply ignore this method and rely on the default behaviour. This is a very simple and very powerful tool, especially for people with limited or no outside Internet access, or for people working in secure environments. Much of this could also be done with proxy servers, of course, but that is a solution more for companies than for individuals. The problem is that most parsers don't support this functionality right now, so we could not simply implement a new SAX front-end on top of the parser's existing API. On the other hand, we could make support for this optional, and add an entityResolutionSupported() boolean call to the XmlParser interface (see question 9, to be posted later). All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Jan 3 15:34:47 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: towards a solution In-Reply-To: <3.0.1.16.19980103160221.2cefbaca@pop3.demon.co.uk> References: <199801031251.HAA00419@unready.microstar.com> <3.0.1.16.19980103160221.2cefbaca@pop3.demon.co.uk> Message-ID: <199801031530.KAA00775@unready.microstar.com> Peter Murray-Rust writes: > I have written this at length because I would like to see if XML-DEV could > "borrow" some of this process. [There are of course many differences - the > membership of XML-DEV is self-selecting (but still of very great technical > quality and list discipline). Of course there is a lot of overlap with the > SIG and WG.]. In the current situation I'd suggest that - if they can - > DavidM and TimB (no more) form a mini-WG and act as DavidM has suggested. > If this is too difficult (and of course *I* can't pay their phone bills), > then DavidM should do this unilaterally. I cannot speak for Tim, but I would guess that he's probably busy enough with the main WG that he wouldn't want to spend any more time on the phone. I think that SAX is a small enough topic that we should be able to manage it through our existing channels of communication; of course, the opinions of the parser writers (Tim, Norbert, Chris, me, and any others) will carry proportionally heavier weight, as will the opinions of potential SAX implementors (such as Peter), and all opinions count. I will collect opinions, then will construct a trial SAX and try to get acceptance. > I have found this very useful with AElfred.
Does this have implications for > other languages (e.g. tcl is not OO - at least not when I used to use it.)? > > [...] There is an OO add-on to Tcl, called something like iTcl (it's been a while); in any case, I'd like to keep SAX as an object-oriented API. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Sat Jan 3 16:22:55 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: Document Start and End (question 1 of 10) Message-ID: <199801031622.QAA20950@mail.iol.ie> At 09:47 03/01/98 -0500, you wrote: >[SAX is a proposal for a simple, event-based XML API, using >callbacks. This is one in a series of ten design questions that we >need to answer to implement the API.] > >In addition to the core events, should SAX have additional callbacks >for the start and end of a document? > > public startDocument () > public endDocument () > I vote yes. The initialisation/cleanup argument in favour is, I think, compelling. >Should either of the events take arguments? For example, the >startDocument event handler could take Strings giving the public ID >(if any) and URI of the document, Sounds good. > and the endDocument event handler > could take integers giving the number of errors and warnings: What about fatal errors? After the first of these, unprocessed data (intermingled character data and markup) may be communicated to the application for debugging purposes. What channel does it go through? 
Sean Mc Grath sean at digitome dot com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Sat Jan 3 16:23:12 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: External Entity Start and End (question 2 of 10) Message-ID: <199801031622.QAA20968@mail.iol.ie> > >Should SAX generate events for the start and end of an external >entity? > > public void startEntity (String ename, String systemID); > public void endEntity (String ename); > > [...] > >We could simplify things further for most users if the XmlAppBase >class implemented final versions of these handlers and maintained its >own entity stack, providing an additional getCurrentEntity() query >method (not part of the interface). This sounds good to me. Entity info available for those who want it but not explicitly announced in the callbacks. Sean Mc Grath sean at digitome dot com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sat Jan 3 16:51:09 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: Document Start and End (question 1 of 10) In-Reply-To: <199801031447.JAA00628@unready.microstar.com> Message-ID: <3.0.1.16.19980103172640.3fdf1320@pop3.demon.co.uk> Yes. 
I have publicly posted this to the list to show how simple it is to be brief. Normally the information within the META tags would not be sent :-). Since most of us only want to read new information, I shall send the rest of my replies PRIVATELY to David. Please do the same unless you have a NEW point to make. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From peter at ursus.demon.co.uk Sat Jan 3 16:54:40 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: External Entity Start and End (question 2 of 10) In-Reply-To: <199801031504.KAA00633@unready.microstar.com> Message-ID: <3.0.1.16.19980103174508.6a4f549c@pop3.demon.co.uk> >Should SAX generate events for the start and end of an external >entity? > Yes. I use AElfred.resolveEntity() to make a table of entities actually used. I haven't yet done anything with it, but I think it could be very useful. > public void startEntity (String ename, String systemID); > public void endEntity (String ename); I don't mind whether you use start/end or resolve. > >Are public IDs important enough to be included? No. Personally I think PubIDs will only be used by 0.1% of the XML community and will generate far more problems than they solve. There has been a lot of debate about these already with opinions split both ways. PublicIDs are useful when someone keeps a register of equivalences and/or when an SGML catalog is used.
I don't think either will be common in XML and the work involved (including the documentation) is not worth it for SAX. I have posted this publicly because I think I have something additional to contribute. Some people will disagree with me (about PubIDs). If so, make a BRIEF case as to why PubIDs **are essential for SAX**, but do NOT repeat the argument if someone else has made it. REMEMBER DavidM has 10 subjects to read and cannot go into the intricacies of every one. Do NOT let the PubID/SysID discussion get out of hand :-) P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sat Jan 3 16:56:07 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: towards a solution In-Reply-To: <199801031530.KAA00775@unready.microstar.com> References: <3.0.1.16.19980103160221.2cefbaca@pop3.demon.co.uk> <199801031251.HAA00419@unready.microstar.com> <3.0.1.16.19980103160221.2cefbaca@pop3.demon.co.uk> Message-ID: <3.0.1.16.19980103173227.0d8f734a@pop3.demon.co.uk> At 10:30 03/01/98 -0500, David Megginson wrote: [...] >I cannot speak for Tim, but I would guess that he's probably busy >enough with the main WG that he wouldn't want to spend any more time >on the phone. 
I think that SAX is a small enough topic that we should >be able to manage it through our existing channels of communication; >of course, the opinions of the parser writers (Tim, Norbert, Chris, >me, and any others) will carry proportionally heavier weight, as will >the opinions of potential SAX implementors (such as Peter), and all >opinions count. I will collect opinions, then will construct a trial >SAX and try to get acceptance. I am completely happy with this. [...] > >There is an OO add-on to Tcl, called something like iTcl (it's been a [I looked at iTcl some time ago (2-3 years?) - the problem was that, like many tcl extensions, you had to link it in at compile time. Got slightly too complicated, so I moved to Java.] >while); in any case, I'd like to keep SAX as an object-oriented API. OK. I concur with this. ************************** RESPONDING - IMPORTANT! ************************** When responding, keep your replies AS SMALL AS POSSIBLE. Do NOT quote material unless it's essential to identify what you are replying to. Do NOT change the subject (or the subject line). Typical answer (hypothetical numbering) might be: 1.1 Yes 1.2 Yes, but only if foo is replaced by bar. 1.3 No. I feel this is unworkable. I am a developer/author/salesperson and this will [...have the following consequences - BRIEF...]. 1.4 Yes, with detailed modifications. I have prepared an example showing this in: "http://some/where/foobar.html" Anyone mindlessly writing "I agree" and then quoting the whole message including duplicate or triplicate XML-DEV sigs, will get thrown an xml.dev.NotVeryConsideratePostingException which *may* be declared 'private' originally, but later occurrences may make this 'public'. ***************************************************************************** If you have nothing NEW to say (other than your vote) I strongly suggest you DO NOT POST TO THE LIST, but only to DavidM. (We could get several hundred postings on this - 10 subjects, lots of interest.)
If someone has said publicly "I think X is a good idea", David (and the rest of us) will note it and take it seriously. Include X in your private mail, but don't just re-iterate the same idea to the list. ***************************************************************************** P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From davido at pragmaticainc.com Sat Jan 3 16:57:04 1998 From: davido at pragmaticainc.com (David Ornstein) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: do we want a base class (was Re: SAX: towards a solution) In-Reply-To: <199801031251.HAA00419@unready.microstar.com> Message-ID: <16552194814198@pragmaticainc.com> David Megginson wrote [04:51 AM 1/3/98 ]: >We have had an interesting discussion about SAX ("simple API for XML"? >I cannot remember) the past few weeks, and now it's time to get >specific. Excellent. It's good to see someone grab this and take it forward. >[clip] > >ASSUMPTIONS [clip] >I am also assuming that we will provide not only a callback interface, >but also an (optional) base class with stub methods that implementors >can override as needed; that means that novice users will not have to >implement all of SAX, even if we do end up with nine or ten methods. This worries me. My interest is in implementations of SAX-clients in C++. Will I have, as part of somebody's SAX implementation that I'm using, this (optional) base class available to me too?
How about people working in other languages (somebody mentioned tcl, for example)? I'd assume not. Clearly one can say that this base class *is* optional and its absence won't prevent anyone from using a SAX-implementing parser. The trouble with this argument is that the availability of the base class -- and code in that class which is more than just empty stubs -- is coming into play in the pro/con discussions about the design issues. I'll cite a few examples: From question 1: >Furthermore, if a SAX-based application >extends the XmlAppBase base class, then it can simply ignore these if >they are not needed. From question 2: >if an application is derived from XmlAppBase, it can simply ignore > these events if it doesn't need them; and: >We could simplify things further for most users if the XmlAppBase >class ***implemented final versions of these handlers and maintained its >own entity stack*** [emphasis mine - davido], providing an additional getCurrentEntity() query >method (not part of the interface). And from question 3: >could be implemented in XmlAppBase, so that most users could simply > ignore it (the default implementation would always return the > systemID argument unmodified); So, my concern is that meaningful use of SAX implementations would come to depend, in practice, on the presence of this base class. And that's trouble for non-Java SAX clients. So, my vote is that we assume that the base class is *not* present. Clearly this is a question of goals for SAX. As someone who hopes to use SAX-based parsers (and who may implement one), I can say that I may be discouraged from using/implementing them if it's hard outside of Java. On the other hand, it may be that the collective goals of *those who are actually doing the implementation work today* are satisfied by this design. And I'm a great believer in yielding to those who are actually doing the work. :>) What say y'all? ================================ David Ornstein Pragmatica, Inc.
http://www.pragmaticainc.com From ak117 at freenet.carleton.ca Sat Jan 3 17:37:19 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: do we want a base class (was Re: SAX: towards a solution) In-Reply-To: <16552194814198@pragmaticainc.com> References: <199801031251.HAA00419@unready.microstar.com> <16552194814198@pragmaticainc.com> Message-ID: <199801031732.MAA00275@unready.microstar.com> David Ornstein writes: > >I am also assuming that we will provide not only a callback interface, > >but also an (optional) base class with stub methods that implementors > >can override as needed; that means that novice users will not have to > >implement all of SAX, even if we do end up with nine or ten methods. > > This worries me. My interest is in implementations of SAX-clients in C++. > Will I have, as part of somebody's SAX implementation that I'm using, this > (optional) base class available to me too? How about people working in > other languages (somebody mentioned tcl, for example)? I'd assume not. Thank you for your feedback. Right now, I am proposing SAX as two core interfaces (one for the parser and one for the user event handlers), together with an optional base class. Some OO languages do not support interfaces, in which case the interfaces themselves will have to be implemented as abstract base classes. I'm afraid that I do not understand why it would be difficult to implement the XmlAppBase base class in, say, C++, Perl5, or iTcl as well as Java.
I am certainly not depending on any Java-specific behaviour in it (there is no dynamic type checking or class loading). My goal is to design SAX to work in any OO language. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Jan 3 17:55:28 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:44 2004 Subject: SAX: Error Reporting (question 4 of 10) Message-ID: <199801031750.MAA00346@unready.microstar.com> Should SAX treat errors as events? If so, should it distinguish fatal errors from warnings or non-fatal errors? public void warning (String message, String systemID, int line); public void fatalError (String message, String systemID, int line); There is an important distinction here: warnings are strictly informative, and may be useful for debugging, i.e. WARNING: assuming UTF-8 encoding. Normally, these would go to STDERR or to a log, and people would view them only if they were trying to track down a problem. Fatal errors, on the other hand, should ordinarily stop processing and fling themselves in the user's face: FATAL ERROR: tag mismatch: end tag "foo" for element "bar" Normally, these would throw an error or exception, and the programmer might want to display the message in a dialog box. CON --- - these methods will make SAX slightly larger; - there is no XML requirement to report non-fatal errors at all; - Java already has throwable errors and exceptions, which provide a more elegant method for error reporting. 
PRO
---
- some OO languages do not have throwable errors and exceptions; this approach should be simple enough to work with all of them;
- exceptions are not always a good idea, since it might not be possible to restart the parser when one is caught;
- it is useful to be able to warn the user about possible problems without causing a fatal error (hence the warning method);
- there can be default implementations in XmlAppBase, so users can ignore these if they wish.

MY RECOMMENDATION
-----------------

Yes to both.

We can have the following default implementations in XmlAppBase:

  public void warning (String message, String systemID, int line)
  {
    System.err.println("WARNING (" + systemID + ',' + line + "): " + message);
  }

  public void fatalError (String message, String systemID, int line)
  {
    throw new Error (systemID + ',' + line + ": " + message);
  }

These would be appropriate for most simple applications, but could easily be overridden for more complicated ones derived from XmlAppBase.

FURTHER CONSIDERATIONS
----------------------

If we decide to implement startEntity() and endEntity(), then the systemID argument to these methods will be redundant (the current URI will always be known); in that case, should we still leave the systemID argument in for convenience?

Do we need a 'column' argument as well as a 'line' argument, or is 'line' enough? I don't know if all parsers track the current column, but we could define a behaviour for those that do not (such as reporting the column as -1).

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From ak117 at freenet.carleton.ca Sat Jan 3 18:07:01 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:44 2004
Subject: SAX: Whitespace Handling (question 5 of 10)
Message-ID: <199801031802.NAA00401@unready.microstar.com>

[SAX is a proposal for a simple, event-based XML API, using callbacks. This is one in a series of ten design questions that we need to answer to implement the API.]

Should SAX allow DTD-driven parsers to distinguish ignorable whitespace from other character data?

  public void ignorableWhitespace (char ch[], int length);

(We have already had some discussion on this topic.)

CON
---
- this method would make SAX slightly larger;
- parsers that use the DTD will return different results than parsers that do not (though it would be trivial to map the two);
- the concept of ignorable whitespace can be confusing for non-specialists.

PRO
---
- the PR requires "validating" parsers to flag ignorable whitespace for the application;
- there would be no need to implement anything here for most applications;
- whitespace in element content is almost never significant for formatting or database applications (if it were significant, then the element type would have mixed content).

MY RECOMMENDATION
-----------------

Qualified no.

As someone who has worked with SGML for many years, I would rather not see the ignorable whitespace at all; however, the PR requires parsers to report all whitespace.
Tim Bray's recent comments on this list imply that a validating parser using SAX could report ignorable whitespace as regular character data and still be conforming; if I have inferred correctly, then I am willing to omit this callback.

OTHER CONSIDERATIONS
--------------------

It would also be possible to implement this in the charData callback itself:

  public void charData (char ch[], int length, boolean isIgnorable);

However, given that charData will probably be the most heavily-implemented handler, and that very few applications will care about ignorable whitespace, I would prefer not to complicate things unnecessarily. If we need to distinguish it to be conforming, then ignorable whitespace should probably be shuffled off to its own callback, to make it easier to ignore.

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Sat Jan 3 18:16:41 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:44 2004
Subject: SAX: Processing Instructions (question 6 of 10)
Message-ID: <199801031812.NAA00440@unready.microstar.com>

[SAX is a proposal for a simple, event-based XML API, using callbacks. This is one in a series of ten design questions that we need to answer to implement the API.]

Should SAX implement a callback for processing instructions?

  public void processingInstruction (String name, String data);

(In XML, all processing instructions begin with a name, followed optionally by whitespace and then other characters).
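
The name/data split described in the parenthetical can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class PiSplitSketch, its Pi holder, and the target name "my-target" are hypothetical, and the input is taken to be the text already extracted from between the processing-instruction delimiters.

```java
public class PiSplitSketch {
    /** Hypothetical holder for the two parts the proposed callback would receive. */
    static final class Pi {
        final String name;
        final String data;
        Pi(String name, String data) {
            this.name = name;
            this.data = data;
        }
    }

    /** XML whitespace characters: space, tab, carriage return, line feed. */
    static boolean isXmlSpace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    /**
     * Splits the content of a processing instruction into its leading name
     * and the (possibly empty) data that follows the separating whitespace.
     */
    static Pi split(String content) {
        int i = 0;
        // The name runs up to the first whitespace character.
        while (i < content.length() && !isXmlSpace(content.charAt(i))) {
            i++;
        }
        String name = content.substring(0, i);
        // Skip the whitespace that separates the name from the data.
        while (i < content.length() && isXmlSpace(content.charAt(i))) {
            i++;
        }
        return new Pi(name, content.substring(i));
    }

    public static void main(String[] args) {
        Pi pi = split("my-target some data");
        System.out.println(pi.name + " | " + pi.data);
    }
}
```

A parser following the proposal would then pass the two parts on as processingInstruction(pi.name, pi.data); a PI with no data simply yields an empty data string in this sketch.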
CON
---
- this method would make SAX slightly larger;
- processing instructions are difficult for non-specialists to understand -- the HTML world never managed to figure them out, and used specialised comments instead;
- processing instructions are not part of a document's logical element structure, and providing a callback event might encourage people to abuse them.

PRO
---
- processing instructions are required for most proposed XML-related standards, including XSL and architectural forms;
- novice users can extend the XmlAppBase class and simply ignore processing instructions altogether.

MY RECOMMENDATION
-----------------

Yes.

Omitting PI's would cripple SAX beyond any but the most trivial uses. PI's provide the only way to make new types of declarations in XML, and nearly every XML-related standard requires at least one special enabling declaration (to locate a stylesheet, declare a namespace, etc.).

OTHER CONSIDERATIONS
--------------------

XML processing instructions consist of two parts: a name, and (optional) data. It makes sense to report the two of these separately, since the name can be an internalized string and may be the name of a declared notation.

Should the data (if present) be a string, or should it be an array of characters on the model of charData?

  public void processingInstruction (String name, char data[], int length)

On a different point, I am assuming that SAX-based parsers will not report the XML declaration or text declaration using this event, even though they resemble PIs.

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From ak117 at freenet.carleton.ca Sat Jan 3 18:25:29 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:44 2004
Subject: SAX: Comments (question 7 of 10)
Message-ID: <199801031821.NAA00477@unready.microstar.com>

[SAX is a proposal for a simple, event-based XML API, using callbacks. This is one in a series of ten design questions that we need to answer to implement the API.]

Should SAX include an event for comments?

  public void comment (char ch[], int length);

CON
---
- comments are purely lexical, and do not form an integral part of a document's information set for processing (as opposed to authoring or archiving);
- including comments in SAX might encourage comment abuses like those common in HTML (e.g. enclosing 800 lines of JavaScript in a comment).

PRO
---
- the DOM includes comments in the core level-one implementation;
- HyTime uses comments for lexical constraints.

MY RECOMMENDATION
-----------------

No.

SAX is not designed for authoring tools or repositories that need to preserve the lexical as well as the logical structure of a document, and there is no compelling reason to report comments here except for DOM building (and we can always leave them out of a DOM). The conventional comments in HyTime are not required, and personally, I believe that they should not have been there in the first place.
FURTHER CONSIDERATIONS
----------------------

Another lexical feature that I am not discussing here is CDATA sections; I assume that, when the parser is reporting character data, it does not matter how the parser obtained those characters (in a CDATA section, or in regular #PCDATA with the delimiters escaped using references). I am happy, of course, to listen to other opinions on this subject.

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From davido at pragmaticainc.com Sat Jan 3 18:28:23 1998
From: davido at pragmaticainc.com (David Ornstein)
Date: Mon Jun 7 16:59:44 2004
Subject: SAX: do we want a base class (was Re: SAX: towards a solution)
In-Reply-To: <199801031732.MAA00275@unready.microstar.com>
References: <16552194814198@pragmaticainc.com> <199801031251.HAA00419@unready.microstar.com> <16552194814198@pragmaticainc.com>
Message-ID: <18271237114262@pragmaticainc.com>

You wrote [09:32 AM 1/3/98 ]:
>David Ornstein writes:
>
> > >I am also assuming that we will provide not only a callback interface,
> > >but also an (optional) base class with stub methods that implementors
> > >can override as needed; that means that novice users will not have to
> > >implement all of SAX, even if we do end up with nine or ten methods.
> >
> > This worries me. My interest is in implementations of SAX-clients in C++.
> > Will I have, as part of somebody's SAX implementation that I'm using, this
> > (optional) base class available to me too?
> > How about people working in
> > other languages (somebody mentioned tcl, for example)? I'd assume not.
>
>Thank you for your feedback. Right now, I am proposing SAX as two
>core interfaces (one for the parser and one for the user event
>handlers),

Good. Sounds right.

>together with an optional base class. Some OO languages do
>not support interfaces, in which case the interfaces themselves will
>have to be implemented as abstract base classes.

Clearly. In C++ I use interfaces all the time and think they are essential to building good systems.

>I'm afraid that I do not understand why it would be difficult to
>implement the XmlAppBase base class in, say, C++, Perl5, or iTcl as
>well as Java? I am certainly not depending on any Java-specific
>behaviour in it (there is no dynamic type checking or class loading).
>My goal is to design SAX to work in any OO language.

Good. I think that's the right goal (though I suppose some others could argue for C).

My point is not that implementing the base class would be hard in the other languages. It probably wouldn't be (depending, of course, on how fat it gets).

My point is about a relationship: as the usefulness of the base class climbs towards necessity, the probability of people using SAX-implementing parsers *that don't come with the base class supplied* declines. This is only important iff the design of the API is influenced by the assumption of the presence of the base class. Some of the "design issue" posts seemed to me to be heading in that direction. If we divide the world into SAX implementors and SAX clients, I think that the base class is a useful thing for *clients* to build and use; it's how I'd do it. As such, I think it probably doesn't belong on the implementor side of the line.

David
================================
David Ornstein
Pragmatica, Inc.
http://www.pragmaticainc.com
From ak117 at freenet.carleton.ca Sat Jan 3 18:44:28 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:44 2004
Subject: SAX: do we want a base class
In-Reply-To: <18271237114262@pragmaticainc.com>
References: <16552194814198@pragmaticainc.com> <199801031251.HAA00419@unready.microstar.com> <199801031732.MAA00275@unready.microstar.com> <18271237114262@pragmaticainc.com>
Message-ID: <199801031839.NAA00562@unready.microstar.com>

David Ornstein writes:

> My point is about a relationship: as the usefulness of the base
> class climbs towards necessity, the probability of people using
> SAX-implementing parsers *that don't come with the base class
> supplied* declines. This is only important iff the design of the
> API is influenced by the assumption of the presence of the base
> class. Some of the "design issue" posts seemed to me to be heading
> in that direction. If we divide the world into SAX implementors
> and SAX clients, I think that the base class is a useful thing for
> *clients* to build and use; it's how I'd do it. As such, I think
> it probably doesn't belong on the implementor side of the line.

I should clarify: the SAX interfaces and the XmlAppBase base class will be written only once for each programming language, and they will live in their own package (in languages that use packages). XmlAppBase will depend only on the interfaces for information -- in other words, there will be no such thing as a SAX-implementing parser without the base class (parser writers need not be concerned with the base class at all).
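
The relationship between the handler interface and the stub base class can be sketched as follows. The names XmlApplication and XmlAppBase come from this thread; everything else is an assumption made for illustration: the method set is abbreviated (the attributes argument of startElement is omitted), ElementCounter is a hypothetical client, and fireEvents merely stands in for a parser.

```java
public class BaseClassSketch {
    /** Event-handler interface; the method set here is abbreviated. */
    interface XmlApplication {
        void startElement(String name);
        void endElement(String name);
        void charData(char[] ch, int length);
    }

    /** Base class with do-nothing stubs, depending only on the interface. */
    static class XmlAppBase implements XmlApplication {
        public void startElement(String name) {}
        public void endElement(String name) {}
        public void charData(char[] ch, int length) {}
    }

    /** Hypothetical client that overrides only the one event it needs. */
    static class ElementCounter extends XmlAppBase {
        int count = 0;

        @Override
        public void startElement(String name) {
            count++;
        }
    }

    /** Stand-in for a parser: it sees only the interface, never the base class. */
    static void fireEvents(XmlApplication app) {
        app.startElement("doc");
        app.startElement("p");
        app.charData("hi".toCharArray(), 2);
        app.endElement("p");
        app.endElement("doc");
    }

    public static void main(String[] args) {
        ElementCounter counter = new ElementCounter();
        fireEvents(counter);
        System.out.println(counter.count); // two startElement events were fired
    }
}
```

The point of the arrangement is visible in fireEvents: the parser side is written against XmlApplication alone, so the base class is purely a convenience for the client.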
In Java, for example, there would be three class files in the SAX package (I'll deal with naming and packaging in a later posting):

  XmlParser.class        The interface for a SAX-aware parser.
  XmlApplication.class   The interface for an object with event handlers.
  XmlAppBase.class       A base class for an object with event handlers.

All Java-based XML parsers that use SAX will refer to this package, and all will have these three class files available in the same location. The same would apply to C++ headers, etc. I expect that in Java XmlAppBase.class will weigh in at under 2K (perhaps far under).

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From davido at pragmaticainc.com Sat Jan 3 18:46:31 1998
From: davido at pragmaticainc.com (David Ornstein)
Date: Mon Jun 7 16:59:44 2004
Subject: SAX: Error Reporting (question 4 of 10)
In-Reply-To: <199801031750.MAA00346@unready.microstar.com>
Message-ID: <18451229414269@pragmaticainc.com>

David Megginson wrote [09:50 AM 1/3/98 ]:
>Should SAX treat errors as events? If so, should it distinguish fatal
>errors from warnings or non-fatal errors?

Yes. Yes.

Details below for those who care...

>There is an important distinction here: warnings are strictly
>informative, and may be useful for debugging,
[clip]
>Normally, these would go to STDERR or to a log, and people would view
>them only if they were trying to track down a problem.
>Fatal errors,
>on the other hand, should ordinarily stop processing and fling
>themselves in the user's face:
[clip]
>Normally, these would throw an error or exception, and the programmer
>might want to display the message in a dialog box.

We should not make decisions a priori for Clients. Which messages the Client might or might not want to show to users (or otherwise act on) should not be pre-determined in the interface.

>CON
>- Java already has throwable errors and exceptions, which provide a
> more elegant method for error reporting.

But, as far as I know, there are no (good?) cross-language exception-handling mechanisms. How would a C++ or PERL client catch a Java exception?

>PRO
>- some OO languages do not have throwable errors and exceptions; this
> approach should be simple enough to work with all of them;

Right.

>- there can be default implementations in XmlAppBase, so users can
> ignore these if they wish.

As mentioned in the thread "do we want a base class," I propose that we not assume the presence of an XmlAppBase beast when formulating the PRO/CON lists.

>MY RECOMMENDATION
>-----------------
>
>Yes to both.
>
>We can have the following default implementations in XmlAppBase:
[clip]

Good for Clients, not for Implementors, for reasons stated previously.

>FURTHER CONSIDERATIONS
>----------------------
[clip]
>Do we need a 'column' argument as well as a 'line' argument, or is
>'line' enough? I don't know if all parsers track the current column,
>but we could define a behaviour for those that do not (such as
>reporting the column as -1).

I'd really like to see the column information. I'd also like to have an offset from the start of the stream. Somebody (Tim? James?) said that their parser did all three. I've often needed all of them.

This one isn't a killer, though, and your defined behavior for non-implementors is a fair tradeoff.

David
================================
David Ornstein
Pragmatica, Inc.
http://www.pragmaticainc.com

From davido at pragmaticainc.com Sat Jan 3 18:58:53 1998
From: davido at pragmaticainc.com (David Ornstein)
Date: Mon Jun 7 16:59:44 2004
Subject: SAX: do we want a base class
In-Reply-To: <199801031839.NAA00562@unready.microstar.com>
References: <18271237114262@pragmaticainc.com> <16552194814198@pragmaticainc.com> <199801031251.HAA00419@unready.microstar.com> <199801031732.MAA00275@unready.microstar.com> <18271237114262@pragmaticainc.com>
Message-ID: <18574174214275@pragmaticainc.com>

David Megginson wrote [10:39 AM 1/3/98 ]:
>I should clarify: the SAX interfaces and the XmlAppBase base class
>will be written only once for each programming language, and they will
>live in their own package (in languages that use packages).
>XmlAppBase will depend only on the interfaces for information -- in
>other words, there will be no such thing as a SAX-implementing parser
>without the base class (parser writers need not be concerned with the
>base class at all).

Ahh. I was assuming that we were aiming to be able to have the SAX Client and SAX Implementation be in different languages. The practical case I'm thinking about is that I write in C++ and would like to be able to use the Java-based Implementations. Assuming a platform (for example win32) that provides interoperability at the interface level between multiple languages, or really any language-neutral IDL-based mechanism, shouldn't that be possible?

I guess we should answer that question first. If we say that, yes, we want cross-language interoperability (and IDL?), then I maintain my stand. If not, I quickly retract...

David
================================
David Ornstein
Pragmatica, Inc.
http://www.pragmaticainc.com

From peter at ursus.demon.co.uk Sat Jan 3 19:29:58 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Whitespace Handling (question 5 of 10)
In-Reply-To: <199801031802.NAA00401@unready.microstar.com>
Message-ID: <3.0.1.16.19980103201455.2a7f67fc@pop3.demon.co.uk>

At 13:02 03/01/98 -0500, David Megginson wrote:
>
>Should SAX allow DTD-driven parsers to distinguish ignorable
>whitespace from other character data?
>

NO.

IMO the XML community has not had enough experience of ignorable whitespace to agree on how to treat it. It can only be ignored if enough DTD-related information is given. I believe that anyone who needs to deal with it is probably going to have to use this themselves.

In time I very much hope that we *shall* come up with protocols for this.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From peter at ursus.demon.co.uk Sat Jan 3 19:33:34 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Error Reporting (question 4 of 10)
In-Reply-To: <199801031750.MAA00346@unready.microstar.com>
Message-ID: <3.0.1.16.19980103201006.2db74a3c@pop3.demon.co.uk>

Yes to both.

There are at least two sorts of errors:
- XML-based errors (i.e. VC or WF violations)
- system errors (e.g. errors associated with SystemID, InterruptedException, etc.)

I'm not sure what we should do with these. (For FileNotFound, AElfred returns an error count of 1 (one) but doesn't trap the exception in doError.) I don't know enough about interfaces to know "who should be in charge" and I'll leave it to you-all to sort this out.

>
>Do we need a 'column' argument as well as a 'line' argument, or is
>'line' enough? I don't know if all parsers track the current column,
>but we could define a behaviour for those that do not (such as
>reporting the column as -1).

It's valuable, so one can highlight the actual character where the error occurred. (In AElfred at present I just highlight the line, which if it's very long could be difficult - remember that XML files can consist of a single line - and some *will* :-)).

P.

BTW it is very easy to say "yes" to everything. That will result in a larger interface. DavidM will have to take some tough decisions. Therefore it may be useful to qualify your "Yeses" and "nos", like:

  YES - can't live without this
  YES - but not essential
  NO - but don't mind
  NO - over my dead body

The perceptive will have noticed that I didn't post a reply to (3) to the list - it went straight to DavidM.
(Also, where there are two correspondents with the same first/given name, using initials is an acceptable way of distinguishing.)

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Sat Jan 3 19:35:21 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Processing Instructions (question 6 of 10)
In-Reply-To: <199801031812.NAA00440@unready.microstar.com>
Message-ID: <3.0.1.16.19980103202029.2a7f3c1c@pop3.demon.co.uk>

At 13:12 03/01/98 -0500, David Megginson wrote:
>
>Should SAX implement a callback for processing instructions?

YES - essential

>- processing instructions are required for most proposed XML-related
> standards, including XSL and architectural forms;

It seems likely that they will also be used in namespace declarations. Note, however, that there is no proposal for the internal representation of PIs other than the two components mentioned by DavidM. I support returning both.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From ak117 at freenet.carleton.ca Sat Jan 3 21:24:17 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Error Reporting (question 4 of 10)
In-Reply-To: <18451229414269@pragmaticainc.com>
References: <199801031750.MAA00346@unready.microstar.com> <18451229414269@pragmaticainc.com>
Message-ID: <199801032119.QAA00268@unready.microstar.com>

David Ornstein writes:

> I'd really like to see the column information. I'd also like to have an
> offset from the start of the stream. Somebody (Tim? James?) said that
> their parser did all three. I've often needed all of them.

Thanks for the reply. Do you want the offset in octets or characters? Characters would probably be manageable, but octets would be trickier for people to implement (especially for UTF-8 or UTF-16).

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From davido at pragmaticainc.com Sat Jan 3 21:39:24 1998
From: davido at pragmaticainc.com (David Ornstein)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Error Reporting (question 4 of 10)
In-Reply-To: <199801032119.QAA00268@unready.microstar.com>
References: <18451229414269@pragmaticainc.com> <199801031750.MAA00346@unready.microstar.com> <18451229414269@pragmaticainc.com>
Message-ID: <21381216014359@pragmaticainc.com>

David Megginson wrote [01:19 PM 1/3/98 ]:
>David Ornstein writes:
>
> > I'd really like to see the column information. I'd also like to have an
> > offset from the start of the stream. Somebody (Tim? James?) said that
> > their parser did all three. I've often needed all of them.
>
>Thanks for the reply. Do you want the offset in octets or characters?
>Characters would probably be manageable, but octets would be trickier
>for people to implement (especially for UTF-8 or UTF-16).

I suspect octet offsets are more useful so one can do pointer math. Clearly one can transform between offsets and line/column positions with a table indexed by line number, so this could be on either the Client side or the Implementation side. If we're going to go to the trouble of adding this to SAX (my vote: it's nice but I can live without), I'd do the thing that's nicest for the Client.

Thanks,

David
================================
David Ornstein
Pragmatica, Inc.
http://www.pragmaticainc.com
From jjc at jclark.com Sun Jan 4 00:30:52 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Error Reporting (question 4 of 10)
References: <199801031750.MAA00346@unready.microstar.com>
Message-ID: <34AED033.651B87E8@jclark.com>

David Megginson wrote:
>
> Should SAX treat errors as events? If so, should it distinguish fatal
> errors from warnings or non-fatal errors?
>
> public void warning (String message, String systemID, int line);
> public void fatalError (String message, String systemID, int line);

There are 3 things that need to be distinguished:

- Fatal errors: violations of well-formedness constraints; processors have to detect these and must stop normal processing. In Java I think an exception is the only reasonable way to handle these. Unless you do this people will be tempted to continue processing in violation of the spec.

- Errors: violations of validity constraints or other errors that a processor is not required to detect.

- Warnings: messages about conditions that do not cause a document to be non-conforming.

To do a really good job of reporting these, much more information is needed than a line number and a URL. In particular you need more information about the entity structure. For example, given a document doc.xml: &e; and doc.dtd: "> nsgmls -e will say:

  In entity e included from doc.xml:2:8
  nsgmls:doc.dtd:2:15:E: element "b" undefined

There can be more than one relevant URL and line number.

This sort of facility also introduces major internationalization issues.

I can guarantee you are not going to be able to do a good job of error reporting and still keep the interface simple.
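
The three-way distinction drawn above can be sketched in Java as follows. This is an illustration under stated assumptions rather than a proposed interface: the names ErrorHandler, FatalXmlException, CollectingHandler and fakeParse are hypothetical, the flat (message, systemID, line) signature is borrowed from the original question even though this message argues it carries too little information, and fakeParse only stands in for a parser run.

```java
import java.util.ArrayList;
import java.util.List;

public class ErrorLevelsSketch {
    /** Hypothetical handler: errors and warnings arrive as events. */
    interface ErrorHandler {
        void warning(String message, String systemID, int line);
        void error(String message, String systemID, int line);
    }

    /** Hypothetical checked exception for well-formedness violations. */
    static class FatalXmlException extends Exception {
        FatalXmlException(String message, String systemID, int line) {
            super(systemID + ":" + line + ": " + message);
        }
    }

    /** Client that simply records what it is told. */
    static class CollectingHandler implements ErrorHandler {
        final List<String> log = new ArrayList<>();

        public void warning(String message, String systemID, int line) {
            log.add("W " + systemID + ":" + line + ": " + message);
        }

        public void error(String message, String systemID, int line) {
            log.add("E " + systemID + ":" + line + ": " + message);
        }
    }

    /** Stand-in for a parser run that meets one condition of each kind. */
    static void fakeParse(ErrorHandler handler) throws FatalXmlException {
        handler.warning("assuming UTF-8 encoding", "doc.xml", 1);
        handler.error("element \"b\" undefined", "doc.dtd", 2);
        // A well-formedness violation ends normal processing via an exception.
        throw new FatalXmlException(
            "tag mismatch: end tag \"foo\" for element \"bar\"", "doc.xml", 3);
    }

    public static void main(String[] args) {
        CollectingHandler handler = new CollectingHandler();
        try {
            fakeParse(handler);
        } catch (FatalXmlException e) {
            System.out.println("FATAL " + e.getMessage());
        }
        for (String entry : handler.log) {
            System.out.println(entry);
        }
    }
}
```

Making the fatal case a checked exception means a caller cannot silently carry on after a well-formedness violation, while validity errors and warnings remain ordinary events the client is free to ignore.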

James

From jjc at jclark.com Sun Jan 4 00:30:58 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: do we want a base class (was Re: SAX: towards a solution)
References: <16552194814198@pragmaticainc.com>
Message-ID: <34AECAE1.2A38CF1E@jclark.com>

David Ornstein wrote:

> >I am also assuming that we will provide not only a callback interface,
> >but also an (optional) base class with stub methods that implementors
> >can override as needed; that means that novice users will not have to
> >implement all of SAX, even if we do end up with nine or ten methods.
>
> This worries me. My interest is in implementations of SAX-clients in C++.
> Will I have, as part of somebody's SAX implementation that I'm using, this
> (optional) base class available to me too?

In C++ I can't see any need for a base class separate from the interface. You can just have a single class which provides empty definitions for all virtual functions.

James
From jjc at jclark.com Sun Jan 4 00:31:06 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: External Entity Resolution (question 3 of 10)
References: <199801031522.KAA00737@unready.microstar.com>
Message-ID: <34AED316.E360E0D7@jclark.com>

David Megginson wrote:
> Should SAX provide a handler for resolving external entities?
>
> public String resolveEntity (String ename, String publicID, String systemID);

I don't think bundling this into the application base class is a good idea. I would distinguish the following capabilities:

1. Providing information to the application about entity references. This would be appropriate to include in the XmlAppBase class if it was felt important enough.

2. Allowing public ids to be mapped to URLs.

3. Allowing URLs to be remapped.

I don't think 2 and 3 should be bundled into the XmlAppBase interface. Rather they should be provided as a separate XmlEntityManager interface, because I will often want to use the same XmlEntityManager implementation with many apps, and I may want to use different XmlEntityManager implementations with the same app. The XmlEntityManager I want depends on how the document is stored, and that is independent of the document's logical structure which is what will for the most part drive the application's processing. For example, I might want to provide an XmlEntityManager implementation that implements RFC 2110 and allows all the entities of an XML document to be combined into a single multipart/related MIME body.

James
From jjc at jclark.com Sun Jan 4 00:31:32 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: towards a solution
References: <199801031251.HAA00419@unready.microstar.com>
Message-ID: <34AED641.164786BC@jclark.com>

David Megginson wrote:
> Before we start, I am assuming that we will all accept the following
> three events without further discussion, since no one objected last
> time:
>
> startElement (String name, java.util.Dictionary attributes)

I don't think using java.util.Dictionary is a good idea:

1. JDK 1.2 provides a new Map interface which replaces Dictionary.

2. java.util.Dictionary is an abstract base class not an interface.

3. java.util.Dictionary is weakly typed: it doesn't enforce the requirement that keys be strings, and it requires values to be cast to strings.

I think it would be much better to have an Attributes interface and also a convenience adapter class that provides a Dictionary implementation in terms of that interface.

> endElement (String name)

No problem with that.

> charData (char ch[], int length)

I think there should be an offset argument as well. Most of the Java String operations that operate on a subarray take 3 arguments: char array, offset and count. If you don't have the offset argument you will be requiring some implementations to do additional copying which has a significant performance cost.
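The copying cost mentioned above can be seen in a small sketch (added for illustration, not code from the thread). With an (array, offset, count) signature, a parser can pass a slice of its own input buffer straight to the callback; without the offset it would first have to copy the slice to the start of a fresh array. The names CharDataHandler and TextCollector are hypothetical, and StringBuilder is used for brevity although it postdates the JDKs under discussion.

```java
// Hypothetical callback taking the three-argument form: array, offset, count.
interface CharDataHandler {
    void charData(char[] ch, int start, int length);
}

// Accumulates character data without requiring the caller to copy it first.
class TextCollector implements CharDataHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void charData(char[] ch, int start, int length) {
        // StringBuilder.append(char[], int, int) reads the slice in place.
        text.append(ch, start, length);
    }

    String text() {
        return text.toString();
    }
}

class CharDataDemo {
    public static void main(String[] args) {
        // Stand-in for a parser's internal input buffer.
        char[] buffer = "<p>hello</p>".toCharArray();
        TextCollector collector = new TextCollector();
        // The element content starts at index 3 and is 5 characters long.
        collector.charData(buffer, 3, 5);
        System.out.println(collector.text());
    }
}
```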
> I am also assuming that we will provide not only a callback interface,
> but also an (optional) base class with stub methods that implementors
> can override as needed; that means that novice users will not have to
> implement all of SAX, even if we do end up with nine or ten methods.

I agree we should provide these. JDK 1.1 does this extensively in AWT: it calls the base classes Adapters. I think we should follow this terminology.

James

From jjc at jclark.com Sun Jan 4 00:52:40 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Comments (question 7 of 10)
References: <199801031821.NAA00477@unready.microstar.com>
Message-ID: <34AED935.EA726BF3@jclark.com>

David Megginson wrote:
> Should SAX include an event for comments?
>
> public void comment (char ch[], int length);

No. This is appropriate only for editor type applications, which also typically need to be able to preserve entity structure. Unless SAX also provides enough information to support preservation of the entity structure, it shouldn't provide information about comments. Providing adequate information about the entity structure would prevent SAX from being simple. A startEntity and endEntity event is far from adequate for this: consider internal entity references in attribute values for example.

> PRO
> ---
>
> - the DOM includes comments in the core level-one implementation;

My understanding is that the DOM is also going to provide full information about the entity structure to support editor type applications.

> - HyTime uses comments for lexical constraints.
I don't think HyTime 2 puts information intended for machine processing in comments any longer. Anyway these comments were inside markup declarations, and so would not be allowed in XML.

James

From jjc at jclark.com Sun Jan 4 00:52:55 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Whitespace Handling (question 5 of 10)
References: <199801031802.NAA00401@unready.microstar.com>
Message-ID: <34AEDA3A.DE18BDF7@jclark.com>

David Megginson wrote:
> Should SAX allow DTD-driven parsers to distinguish ignorable
> whitespace from other character data?
>
> public void ignorableWhitespace (char ch[], int length);

Tough call. (If we have it, it needs an offset argument as well.)

If we do this, then the stub (adapter) class should do

  void ignorableWhitespace(char ch[], int off, int count) {
    charData(ch, off, count);
  }

This means that users who don't care about the difference can just ignore it.

James
From ak117 at freenet.carleton.ca Sun Jan 4 01:03:33 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: towards a solution
In-Reply-To: <34AED641.164786BC@jclark.com>
References: <199801031251.HAA00419@unready.microstar.com> <34AED641.164786BC@jclark.com>
Message-ID: <199801040058.TAA00501@unready.microstar.com>

James Clark writes:
> I don't think using java.util.Dictionary is a good idea:
>
> 1. JDK 1.2 provides a new Map interface which replaces Dictionary.
>
> 2. java.util.Dictionary is an abstract base class not an interface.
>
> 3. java.util.Dictionary is weakly typed: it doesn't enforce the
> requirement that keys be strings, and it requires values to be cast to
> strings.
>
> I think it would be much better to have an Attributes interface and also
> a convenience adapter class that provides a Dictionary implementation in
> terms of that interface.

I would like to avoid java.util.Map to keep SAX applet-friendly (it will be years before most browsers deployed support even 1.1). I agree that Dictionary is far less than ideal -- what do you imagine the attributes interface looking like?

> > charData (char ch[], int length)
>
> I think there should be an offset argument as well. Most of the Java
> String operations that operate on a subarray take 3 arguments: char
> array, offset and count.

Agreed.
I will change it to

  charData (char ch[], int start, int length);

> > I am also assuming that we will provide not only a callback interface,
> > but also an (optional) base class with stub methods that implementors
> > can override as needed; that means that novice users will not have to
> > implement all of SAX, even if we do end up with nine or ten methods.
>
> I agree we should provide these. JDK 1.1 does this extensively in AWT:
> it calls the base classes Adapters. I think we should follow this
> terminology.

Will the terminology translate well to other OO languages? If so, then I will be happy to use it.

All the best, and thank you for the comments,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Sun Jan 4 01:11:04 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Error Reporting (question 4 of 10)
In-Reply-To: <34AED033.651B87E8@jclark.com>
References: <199801031750.MAA00346@unready.microstar.com> <34AED033.651B87E8@jclark.com>
Message-ID: <199801040106.UAA00530@unready.microstar.com>

James Clark writes:
> There are 3 things that need to be distinguished:
>
> - Fatal errors: violations of well-formedness constraints; processors
> have to detect these and must stop normal processing. In Java I think
> an exception is the only reasonable way to handle these. Unless you do
> this people will be tempted to continue processing in violation of the
> spec.
>
> - Errors: violations of validity constraints or other errors that a
> processor is not required to detect.
>
> - Warnings: messages about conditions that do not cause a document to be
> non-conforming.

Perhaps, then, we should have three callbacks. I don't agree that we should automatically use exceptions for fatal errors, because sometimes it will be useful to try to report more than one error at once -- the Java XmlAppBase class will throw an Error by default for fatalError, however.

> To do a really good job of reporting these, much more information is
> needed than a line number and a URL. In particular you need more
> information about the entity structure. [...]
> I can guarantee you are not going to be able to do a good job of error
> reporting and still keep the interface simple.

Agreed: good error reporting is out of scope. If the startEntity and endEntity events survive in SAX, then at least some information on the entity structure will be available; furthermore, individual implementations can stuff extra contextual information into the 'message' argument if they wish.

In the end, we are not looking to provide the kind of detailed error reporting that NSGMLS does -- SAX is a production interface, not an authoring one, and needs only give a very rough indication of why it is giving up. Normally, then, the author should turn to a validating parser for full debugging support.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From ak117 at freenet.carleton.ca Sun Jan 4 01:21:41 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Prolog (question 8 of 10)
Message-ID: <199801040117.UAA00580@unready.microstar.com>

[SAX is a proposal for a simple, event-based XML API, using callbacks. This is one in a series of ten design questions that we need to answer to implement the API.]

Should SAX include events for the start and end of the prolog, and/or for the DOCTYPE declaration?

  public void startProlog ();
  public void endProlog ();
  public void docType (String name, String systemID);

CON
---

- these methods would make SAX slightly larger;

- these are pretty far out of scope -- the start and end of the prolog can easily be inferred, and a new DOCTYPE can be constructed if needed.

PRO
---

- the DOM has a Doctype class;

- it would be nice to have a container of some sort around any PI's and entity start/ends found in the prolog, if we decide to include this information;

- it can be useful to know what external DTD the document is using, possibly so that the DTD's URI can be passed off to a different process.

MY RECOMMENDATION
-----------------

No. While this information would be marginally useful, it is really designed for transformation applications, and SAX does not give enough information for really useful transformations otherwise (James has pointed out, for example, the lack of information about internal entity boundaries).
OTHER CONSIDERATIONS
--------------------

If some implementations provide additional DTD-related functionality, it would be nice to know when the prolog is finished and the DTD (if present) is fully constructed.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Sun Jan 4 01:36:55 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Parser Interface (question 9 of 10)
Message-ID: <199801040132.UAA00644@unready.microstar.com>

[SAX is a proposal for a simple, event-based XML API, using callbacks. This is one in a series of ten design questions that we need to answer to implement the API.]

So far, all of my questions have related to the callback interface (XmlApplication); should SAX also include a simple interface for the parser/processor itself, like the following?

  public interface XmlParser extends java.lang.Runnable {
    public void setDocumentSystemID (String systemID);
    public void setXmlApplication (XmlApplication app);
    public void run ();
  }

(We would also mandate a zero-argument constructor, but there is no way to enforce that with the interface.)

Implementations might also provide additional convenience constructors to set the systemID and app directly.
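To make the shape of such an implementation concrete, here is a minimal sketch (added for illustration, not code from the thread). It assumes a cut-down XmlApplication with only two events, and StubParser is a hypothetical class that reads nothing and merely fires those events. It shows the zero-argument constructor alongside an optional convenience constructor.

```java
// Hypothetical, cut-down callback interface: only two of the proposed events.
interface XmlApplication {
    void startDocument();
    void endDocument();
}

// The parser interface as proposed above (run() is inherited from Runnable).
interface XmlParser extends Runnable {
    void setDocumentSystemID(String systemID);
    void setXmlApplication(XmlApplication app);
}

// A stub "parser": it does not read the document, it only fires the events.
class StubParser implements XmlParser {
    private String systemID;
    private XmlApplication app;

    // Zero-argument constructor, needed if the class is created by name.
    public StubParser() {
    }

    // Optional convenience constructor of the kind mentioned above.
    public StubParser(String systemID, XmlApplication app) {
        this.systemID = systemID;
        this.app = app;
    }

    @Override
    public void setDocumentSystemID(String systemID) {
        this.systemID = systemID;
    }

    @Override
    public void setXmlApplication(XmlApplication app) {
        this.app = app;
    }

    String getDocumentSystemID() {
        return systemID;
    }

    @Override
    public void run() {
        if (app == null) {
            throw new IllegalStateException("no XmlApplication set");
        }
        app.startDocument();
        app.endDocument();
    }
}
```

Because StubParser is a Runnable, it could equally be handed to a Thread; the interface itself cannot require the zero-argument constructor, so that part remains a documented convention.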
CONS
----

- this interface constrains the parser writers to use the same invocation methods;

- it will be necessary to write separate frontends for some parsers to implement this functionality;

- extending java.lang.Runnable is Java-specific, and the concept may not translate well to other languages.

PROS
----

- it will be simple to substitute parsers dynamically;

- it will be simple to start a parser as a new thread;

- most users will not have to know the details of different parser implementations.

MY RECOMMENDATION
-----------------

Yes (strongly). In Java, this approach will be especially powerful, because applications can allow users to choose their own parsers at run time (even if the application writer has no knowledge of that parser). Assume, for example, that the variable 'parserOfChoice' contains the class name of a user's preferred parser:

  Class parserClass = Class.forName(parserOfChoice);
  XmlParser parser = (XmlParser)parserClass.newInstance();
  parser.setDocumentSystemID("http://www.foo.com/test.xml");
  parser.setXmlApplication(myApp);
  parser.run();

Note that the zero-argument constructor is required for this sort of approach.

It will also be trivial to start the parser as a thread:

  Thread parserThread = new Thread(parser);
  parserThread.start();

OTHER CONSIDERATIONS
--------------------

It might be useful to have corresponding getDocumentSystemID and getXmlApplication methods, but since they are not absolutely essential, it might not make sense to mandate them in the interface.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From ak117 at freenet.carleton.ca Sun Jan 4 01:51:59 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Naming and Packaging (question 10 of 10)
Message-ID: <199801040147.UAA00705@unready.microstar.com>

[SAX is a proposal for a simple, event-based XML API, using callbacks. This is the last in a series of ten design questions that we need to answer to implement the API.]

What package should the Java SAX belong to, and what should the classes be named?

PACKAGE
-------

Ideally, it would be nice to use the 'w3c' package for SAX, but that is probably not practical in the short term, at least. Alternatively, a relevant 'org' domain would be nice (anyone bought xml.org from Internic yet?) if anyone could offer one.

For the sake of expediency, I am willing to use Microstar's namespace,

  package com.microstar.sax;

as an interim location, until we can come up with something better (i.e. a non-"COM" namespace). Microstar's Ælfred parser already lives in the package 'com.microstar.xml'.

It is essential that we have unlimited control over any namespace that we use, and that the package name be neutral enough that those programmers with strong allegiances to the S*n, M*******t, and N******e armed camps all feel comfortable using it.

This issue affects only Java.

CLASSES
-------

For now, I am proposing the following package names (possibly substituting something else for 'com.microstar.sax'):

  com.microstar.sax.XmlParser       the XML parser interface
  com.microstar.sax.XmlApplication  the XML application interface
  com.microstar.sax.XmlAppBase      the application base class, or adaptor.
Other possible names include "XmlProcessor" instead of "XmlParser" (somewhat confusing), or "XmlAdaptor" instead of "XmlAppBase" (possibly too Java-specific).

This issue affects implementations of SAX in all programming languages.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From papresco at technologist.com Sun Jan 4 02:14:33 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Document Start and End (question 1 of 10)
References: <199801031447.JAA00628@unready.microstar.com>
Message-ID: <34AEE1DC.B262B0D5@technologist.com>

David Megginson wrote:
> These events will make code using SAX much simpler and cleaner, and
> they come at a very low cost. Furthermore, if a SAX-based application
> extends the XmlAppBase base class, then it can simply ignore these if
> they are not needed.

I agree, and for those of us who may not use XmlAppBase, it is probably better to say: "it is very easy to ignore these events if they are not needed." In other words, it is easy whether or not you are using XmlAppBase. In any language supplying an empty callback is easy.

> public void startDocument (String publicID, String systemID);
> public void endDocument (int errors, int warnings);
>
> The latter, however, is very easy to track, and the former can be
> supplied to the constructor when the SAX event-handler is created, so
> both are redundant (if slightly convenient).

I would say that the extra information is not useful enough.
Paul Prescod

From papresco at technologist.com Sun Jan 4 02:15:22 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: External Entity Start and End (question 2 of 10)
References: <199801031504.KAA00633@unready.microstar.com>
Message-ID: <34AEE33B.5962D7A7@technologist.com>

David Megginson wrote:
> OTHER CONSIDERATIONS
> --------------------
>
> Are public IDs important enough to be included?
>
> public void startExternalEntity (String ename, String publicID,
> String systemID)

XML supports two kinds of external identifiers and I think that an interface that supports them both is actually simpler than one that does not -- in the sense that it mirrors the language structure better. If we leave public identifiers out, then users must guess at our reason for doing so. Being different from XML is non-intuitive.

> We could simplify things further for most users if the XmlAppBase
> class implemented final versions of these handlers and maintained its
> own entity stack, providing an additional getCurrentEntity() query
> method (not part of the interface).

We could replace "getCurrentEntity()" with "getCurrentLocation(Entity entity, int linenum, int colnum, int offset)" and kill two birds with one stone.

Paul Prescod

--
http://itrc.uwaterloo.ca/~papresco

Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew.
-- Robertson Davies in "The Cunning Man"
From papresco at technologist.com Sun Jan 4 02:15:38 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: do we want a base class (was Re: SAX: towards a solution)
References: <16552194814198@pragmaticainc.com> <199801031251.HAA00419@unready.microstar.com> <16552194814198@pragmaticainc.com> <18271237114262@pragmaticainc.com>
Message-ID: <34AEEACE.A21D852C@technologist.com>

David Ornstein wrote:
>
> My point is about a relationship: as the usefulness of the base class
> climbs towards necessity, the probability of people using SAX-implementing
> parsers *that don't come with the base class supplied* declines. This is
> only important iff the design of the API is influenced by the assumption of
> the presence of the base class. Some of the "design issue" posts seemed to
> me to be heading in that direction. If we divide the world into SAX
> implementors and SAX clients, I think that the base class is a useful thing
> for *clients* to build and use; it's how I'd do it. As such, I think it
> probably doesn't belong on the implementor side of the line.

If SAX implementors are required to provide a base class, clients could use it through delegation or inheritance, depending on the language. If implementing a SAX client is going to depend on this base class, then we must specify its behaviour and require its existence, however. Life would be easier if we did not depend on it.

One way to do this is to use the common event-handling idiom of returning "false" or "null" when you want the caller to do the job.
So, for instance, a "fetchEntity()" method might return NULL to indicate that the client wanted to leave it up to the SAX implementation. Yet another way to do it would be to require the registration of objects for the handling of more complex events:

  registerEventFetcher( EventFetcher fletch );

Paul Prescod

--
http://itrc.uwaterloo.ca/~papresco

Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew.
-- Robertson Davies in "The Cunning Man"

From papresco at technologist.com Sun Jan 4 02:15:36 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: External Entity Resolution (question 3 of 10)
References: <199801031522.KAA00737@unready.microstar.com>
Message-ID: <34AEE65B.A59F81D9@technologist.com>

David Megginson wrote:
> - could be implemented in XmlAppBase, so that most users could simply
> ignore it (the default implementation would always return the
> systemID argument unmodified);

For those that don't have an XmlAppBase, the "noop" operation in this case would be:

  public String resolveEntity (String ename, String publicID, String systemID){
    return systemID;
  }

Of course this default behaviour cannot be specified in CORBA, but it can be specified in the SAX documentation, so I don't think that it is a big deal.

> The problem is that most parsers don't support this functionality
> right now, so we could not simply implement a new SAX front-end on top
> of the parser's existing API.
> On the other hand, we could make
> support for this optional, and add an entityResolutionSupported()
> boolean call to the XmlParser interface (see question 9, to be posted
> later).

I think that we should just require support for it. It really is massively useful and trivial to implement.

One comment: the parser should turn relative system identifiers into absolute ones before calling this method. The parser has information about the location of the "current" entity (parameter entity!) that the SAX application will not (since SAX provides no DTD information).

Paul Prescod

--
http://itrc.uwaterloo.ca/~papresco

Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew.
-- Robertson Davies in "The Cunning Man"

From papresco at technologist.com Sun Jan 4 02:15:50 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Error Reporting (question 4 of 10)
References: <199801031750.MAA00346@unready.microstar.com>
Message-ID: <34AEED7F.F082526A@technologist.com>

David Megginson wrote:
>
> - Java already has throwable errors and exceptions, which provide a
> more elegant method for error reporting.

I don't think that they are very elegant in this case. By the time the exception is caught, the continuation (stack and stack pointer, for non-Scheme programmers) has been lost. Continuing the parse is problematic (as you pointed out).
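The point about the lost continuation can be shown with a toy example (added for illustration, not code from the thread). A line checker stands in for a parser: in callback style the scanning loop keeps its position and reports every problem, while in exception style the first problem unwinds the stack and the scan cannot simply resume. ProblemListener and ToyChecker are hypothetical names, and "empty line" is an arbitrary stand-in for a real parse error.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener, standing in for SAX-style error callbacks.
interface ProblemListener {
    void problem(String message, int line);
}

class ToyChecker {
    // Callback style: the loop keeps its state, so every problem is reported.
    static void checkWithCallback(String[] lines, ProblemListener listener) {
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                listener.problem("empty line", i + 1);
            }
        }
    }

    // Exception style: the first problem unwinds the stack and ends the scan.
    static void checkWithException(String[] lines) {
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                throw new IllegalStateException("empty line at line " + (i + 1));
            }
        }
    }
}

class ToyCheckerDemo {
    public static void main(String[] args) {
        String[] lines = {"a", "", "b", ""};

        // Both empty lines are reported through the callback.
        List<String> seen = new ArrayList<>();
        ToyChecker.checkWithCallback(lines, (msg, line) -> seen.add(line + ": " + msg));
        System.out.println(seen);

        // Only the first empty line is reported; the loop position is gone.
        try {
            ToyChecker.checkWithException(lines);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```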
> We can have the following default implementations in XmlAppBase:

As long as the default implementations in XmlAppBase are only 1 line, then I don't think DavidO has much to worry about for implementors in other languages.

> If we decide to implement startEntity() and endEntity(), then the
> systemID argument to these methods will be redundant (the current URI
> will always be known); in that case, should we still leave the
> systemID argument in for convenience?

I would prefer a getCurrentLocation() method that can be called anywhere.

> Do we need a 'column' argument as well as a 'line' argument, or is
> 'line' enough? I don't know if all parsers track the current column,
> but we could define a behaviour for those that do not (such as
> reporting the column as -1).

I'm not big on that last idea. It adds complexity to the spec. and to applications. Careless implementors will report errors on line "-1" or perhaps crash. Either require it or don't. I would vote to require it.

Paul Prescod

--
http://itrc.uwaterloo.ca/~papresco

Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew.
-- Robertson Davies in "The Cunning Man"
From papresco at technologist.com Sun Jan 4 02:16:03 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: Whitespace Handling (question 5 of 10)
References: <199801031802.NAA00401@unready.microstar.com>
Message-ID: <34AEEFC2.F5E6D094@technologist.com>

David Megginson wrote:
>
> [SAX is a proposal for a simple, event-based XML API, using
> callbacks. This is one in a series of ten design questions that we
> need to answer to implement the API.]
>
> Should SAX allow DTD-driven parsers to distinguish ignorable
> whitespace from other character data?
>
> public void ignorableWhitespace (char ch[], int length);
>
> - the concept of ignorable whitespace can be confusing for
> non-specialists.

You've mentioned this a few times, but I wonder if we are really making a spec. for people who are not familiar with XML itself. Ignorable whitespace is an unfortunate fact of life (and entities are a fortunate fact of life) and people who want to work with XML parsers should be familiar with XML concepts. All we should hide from them is the nitty gritty syntax.

> Tim Bray's recent comments on this list imply that a validating parser
> using SAX could report ignorable whitespace as regular character data
> and still be conforming; if I have inferred correctly, then I am
> willing to omit this callback.

Could someone please show me where the spec. provides leeway for this sort of thing? If SAX is meant to be usable with validating parsers (e.g. parsers which report validation errors), then I feel that it should support ignorable whitespace.
On the other hand, if it is only interested in the well-formedness level, then of course this is irrelevant. Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From davido at pragmaticainc.com Sun Jan 4 02:45:26 1998 From: davido at pragmaticainc.com (David Ornstein) Date: Mon Jun 7 16:59:45 2004 Subject: SAX: do we want a base class (was Re: SAX: towards a solution) In-Reply-To: <34AECAE1.2A38CF1E@jclark.com> References: <16552194814198@pragmaticainc.com> Message-ID: <02440990714538@pragmaticainc.com> James Clark wrote [03:33 PM 1/3/98 ]: >David Ornstein wrote: > >> >I am also assuming that we will provide not only a callback interface, >> >but also an (optional) base class with stub methods that implementors >> >can override as needed; that means that novice users will not have to >> >implement all of SAX, even if we do end up with nine or ten methods. >> >> This worries me. My interest is in implementations of SAX-clients in C++. >> Will I have, as part of somebody's SAX implementation that I'm using, this >> (optional) base class available to me too? > >In C++ I can't see any need for a base class separate from the >interface. You can just have a single class which provides empty >definitions for all virtual functions. If the only purpose for the base class is to provide *empty* definitions for the functions, then I agree. 
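[Illustrative aside, not part of the original post.] The "empty definitions" arrangement being agreed to here can be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions: the names SaxCallbacks, SaxAdapter and ElementCounter are hypothetical, and the three callbacks are a simplified stand-in for whatever the final SAX interface contains.

```java
// Hypothetical names throughout: none of these types come from a SAX draft.

// The callback interface a parser would drive.
interface SaxCallbacks {
    void startElement(String name);
    void endElement(String name);
    void charData(char[] ch, int offset, int length);
}

// Adapter whose method bodies are all empty, so it adds no behaviour
// beyond what the interface already promises.
class SaxAdapter implements SaxCallbacks {
    public void startElement(String name) { }
    public void endElement(String name) { }
    public void charData(char[] ch, int offset, int length) { }
}

// A client overrides only the events it cares about.
class ElementCounter extends SaxAdapter {
    int count = 0;

    @Override
    public void startElement(String name) {
        count++;
    }
}
```

Because every stub is empty, a client that implements SaxCallbacks directly and one that extends SaxAdapter behave the same for any event they do not handle, so nothing about the adapter would need to be written into the specification.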
On the other hand, I'm seeing the base class fill up with a variety of "default implementations" for many of the functions in the interface. These "default implementations" will be non-empty and their behavior may turn out to be expected by clients. As a result, reasonable implementations will probably need/want to provide the base class. This may well end up meaning that the "default implementations" actually become a part of the SAX "spec". Witness your comment in a previous note: >If we do this, then the stub (adapter) class should do > >void ignorableWhitespace(char ch[], int off, int count) { > charData(ch, off, count); >} > >This means that users who don't care about the difference can just >ignore it. David ================================ David Ornstein Pragmatica, Inc. http://www.pragmaticainc.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From davido at pragmaticainc.com Sun Jan 4 02:54:17 1998 From: davido at pragmaticainc.com (David Ornstein) Date: Mon Jun 7 16:59:45 2004 Subject: SAX: do we want a base class (was Re: SAX: towards a solution) In-Reply-To: <34AEEACE.A21D852C@technologist.com> References: <16552194814198@pragmaticainc.com> <199801031251.HAA00419@unready.microstar.com> <16552194814198@pragmaticainc.com> <18271237114262@pragmaticainc.com> Message-ID: <02525622414543@pragmaticainc.com> Paul Prescod wrote [05:50 PM 1/3/98 ]: >David Ornstein wrote: >> >> My point is about a relationship: as the usefulness of the base class >> climbs towards necessity, the probability of people using SAX-implementing >> parsers *that don't come with the base class supplied* declines. 
This is >> only important iff the design of the API is influenced by the assumption of >> the presence of the base class. ... >If SAX implementors are required to provide a base class, clients could >use it through delegation or inheritance, depending on the language. If >implementing a SAX client is going to depend on this base class, then we >must specify its behaviour and require its existence, however. Agreed. >Life would be easier if we did not depend on it. Double agreed. >One way to do this is >to use the common event-handling idiom of returning "false" or "null" >when you want the caller to do the job. So, for instance a >"fetchEntity()" method might return NULL to indicate that the client >wanted to leave it up to the SAX implementation. I think this is a good approach that solves most of the problems. Then all we must do is specify the default behavior. To do this we would get rid of the base class (leaving an empty interface only). We would then go back through the messages from the past few days and gather the places where someone said: "and the base class could just do such-and-such which can be overridden as desired" and move the such-and-such into the formally specified behavior for each event when the Client declares NoInterest. >Yet another way to do it would be to require the registration of objects >for the handling of more complex events: > >registerEventFetcher( EventFetcher fletch ); Registration schemes can be very nice and quite scalable, but I don't think the extra cost is worth it, especially given that your other proposal makes sense. David ================================ David Ornstein Pragmatica, Inc. http://www.pragmaticainc.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sun Jan 4 04:33:38 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:59:45 2004 Subject: SAX: Error Reporting (question 4 of 10) References: <199801031750.MAA00346@unready.microstar.com> <34AED033.651B87E8@jclark.com> <199801040106.UAA00530@unready.microstar.com> Message-ID: <34AF0F90.83A7F720@jclark.com> David Megginson wrote: > I don't agree that we > should automatically use exceptions for fatal errors, because > sometimes it will be useful to try to report more than one error at > once -- the Java XmlAppBase class will throw an Error by default for > fatalError, however. ... > Agreed: good error reporting is out of scope. ... > In the end, we are not looking to provide the kind of detailed error > reporting that NSGMLS does -- SAX is a production interface, not an > authoring one, and needs only give a very rough indication of why it > is giving up. Normally, then, the author should turn to a validating > parser for full debugging support. If you're not aiming to provide detailed error reporting suitable for authoring, why would you need to report more than one fatal error? For a production interface, throwing an Exception should be quite sufficient. It would not be appropriate to throw an Error: errors are for things that applications should not normally try to catch. 
Instead we should have something like

    public class XmlNotWellFormedException extends java.io.IOException {
        private String url;
        private int line;

        public XmlNotWellFormedException(String message, String url, int line) {
            super(message);
            this.url = url;
            this.line = line;
        }

        public String getURL() { return url; }

        public int getLine() { return line; }
    }

I feel pretty strongly that the right way to handle fatal XML errors in Java in a production-oriented interface is with an exception and that SAX needs to define an exception to cover fatal XML errors. The exception should extend IOException so that it works with the java.net.ContentHandler stuff.

A parser that wants to provide more detailed information can simply derive a class from XmlNotWellFormedException, and an application can catch that and use the additional parser-specific information if it's available.

Having the stub class create the exception doesn't work well because the stub class (being parser independent) can only create an XmlNotWellFormedException and not the richer parser-specific exception that extends it. A reasonable solution would be to have:

    void fatalError(XmlNotWellFormedException e) throws IOException;

in the interface, and then

    void fatalError(XmlNotWellFormedException e) throws IOException {
        throw e;
    }

in the stub.

James

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From jjc at jclark.com Sun Jan 4 06:22:17 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:45 2004
Subject: SAX: towards a solution
References: <199801031251.HAA00419@unready.microstar.com> <34AED641.164786BC@jclark.com> <199801040058.TAA00501@unready.microstar.com>
Message-ID: <34AF2942.2E4442E3@jclark.com>

David Megginson wrote:
>
> James Clark writes:
>
> > I don't think using java.util.Dictionary is a good idea:
> >
> > 1. JDK 1.2 provides a new Map interface which replaces Dictionary.
> >
> > 2. java.util.Dictionary is an abstract base class not an interface.
> >
> > 3. java.util.Dictionary is weakly typed: it doesn't enforce the
> > requirement that keys be strings, and it requires values to be cast to
> > strings.
> >
> > I think it would be much better to have an Attributes interface and also
> > a convenience adapter class that provides a Dictionary implementation in
> > terms of that interface.
>
> I would like to avoid java.util.Map to keep SAX applet-friendly (it
> will be years before most browsers deployed support even 1.1).

I agree. We should have an interface specifically for Attributes and implementations of Map and Dictionary on top of that.

> I
> agree that Dictionary is far less than ideal -- what do you imagine
> the attributes interface looking like?

/**
 * An XmlAttributeSet is a set of named attributes each
 * with a string value.
 * Both specified and defaulted values are included
 * and are not distinguished.
 * Implied attributes are not included.
 * The XML processor is free to modify the AttributeSet after the
 * application returns from startElement.
 * The application can use clone to make a copy of the AttributeSet
 * which will not be modified by the XML processor.
 */
public interface XmlAttributeSet extends Cloneable {
  /**
   * Return the value of the attribute with this name, or null if the
   * set does not include such an attribute.
   */
  String get(String name);

  /**
   * Return the number of attributes in the set.
   */
  int getSize();

  /**
   * Get the name of the i-th attribute, where i is greater than or
   * equal to 0 and less than the number of attributes in the set.
   * The order of the attributes is not defined.
   */
  String getName(int i);

  /**
   * Get the value of the i-th attribute, where i is greater than or
   * equal to 0 and less than the number of attributes in the set.
   */
  String getValue(int i);
}

You could use an Iterator or Enumeration instead of getSize/getName/getValue, but I think it would probably be more complicated and less efficient.

James

From peter at ursus.demon.co.uk Sun Jan 4 11:23:20 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:46 2004
Subject: SAX: problem areas (was Re: SAX: Whitespace Handling)
In-Reply-To: <34AEEFC2.F5E6D094@technologist.com>
References: <199801031802.NAA00401@unready.microstar.com>
Message-ID: <3.0.1.16.19980104113953.2e0720aa@pop3.demon.co.uk>

Firstly can I congratulate everyone on the very high standard and value of the postings already. I am relying implicitly on DavidM to corral them, but they seem to be tending towards enough communality that a synthesis is possible. That synthesis will not give everyone everything they would like but will be workable.
I am relying on DavidM to steer this if he feels: - he has got enough material on any one subtopic already. Please be sensitive to any requests for discontinuation of postings on a subtopic. - any subtopic is getting too complex. - there is merit in further interim proposals, etc. I am sure that you will all agree that we are very grateful to him and will continue to make sure that the postings help in making his task possible :-) There are some strategic issues that PaulP raises in his posting: At 21:11 03/01/98 -0500, Paul Prescod wrote: [...] > >You've mentioned this a few times, but I wonder if we are really making >a spec. for people who are not familiar with XML itself. Ignorable My impression is that although this is perhaps where some of us started (including myself) we are producing something which relies on a thorough understanding of XML. I am happy to go along with this. If SAX develops in the current way, it will be much easier to build "newbie" interfaces on top of it. [e.g. a newbie interface might omit any references to PIs.] >whitespace is an unfortunate fact of life (and entities are a fortunate I agree. If we do not support IWS then we shall frustrate many of the currently experienced XML/SGML community who are a major part of the XML implementation community. >fact of life) and people who want to work with XML parsers should be >familiar with XML concepts. All we should hide from them is the nitty >gritty syntax. > >> Tim Bray's recent comments on this list imply that a validating parser >> using SAX could report ignorable whitespace as regular character data >> and still be conforming; if I have inferred correctly, then I am >> willing to omit this callback. > >Could someone please show me where the spec. provides leeway for this >sort of thing? If SAX is meant to be usable with validating parsers >(e.g. parsers which report validation errors), then I feel that it >should support ignorable whitespace.
On the other hand, if it is only >interested in the well-formedness level, then of course this is >irrelevant. > PaulP has touched on two of the areas that I think will give us most problems - "Validating parser" and "Whitespace". I agree with him that we must address both of these. I don't know if we need an additional question, "Should SAX support validating parsers?", but if so, my answer would be YES. I also take it as almost axiomatic that SAX should support everything in the spec *relevant to those areas it addresses*. IOW if it doesn't support NOTATION it could ignore everything to do with that (e.g. NDATA. NotationType) and might simply throw an Exception (SAX ignores NOTATION - or whatever). [I am not making any judgment on NOTATION - but it is possibly not a core component]. The problem with VPs and IWS is that they are not sufficiently fully defined *in the spec* that their interpretation is trivial. You have to read the spec extremely carefully and it often comes down to very small details. [I actually believe that there is more variety of opinion among the experts than some realise.] I am sure that in VPs and IWS we are moving into uncharted territory. I suspect we may have to either come up with minimalist or rather fuzzy implementations, which may get amended later in the light of experience OR further spec revisions. [Remember that we haven't seen the final result of the PR process - "minor changes" are still allowed.] on VPs I think it would be very valuable if someone could list what they think a "VP", or a "Beyond WF parser" should do. A BeyondWF parser - DavidM uses the phrase DTD-driven parser, which we may wish to adopt - can produce *different output* from a WF parser. It must normalise non-CDATA attribute values, for example. It must also report things such as the occurrence of IWS. It also may or must throw additional violations. This document ]> will give different values according to whether the ATTLIST is present (and used). 
The document above has enough information to be "validatable". Whether this "invokes a validating parser" (if the parser is capable of it) is not clear to me. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sun Jan 4 11:23:44 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Prolog (question 8 of 10) In-Reply-To: <199801040117.UAA00580@unready.microstar.com> Message-ID: <3.0.1.16.19980104115411.2e07e93e@pop3.demon.co.uk> At 20:17 03/01/98 -0500, David Megginson wrote: > >Should SAX include events for the start and end of the prolog, and/or >for the DOCTYPE declaration? NO I originally included a lot of "DTD" support in JUMBO to try to deduce how to process the document. I've spent the weekend removing it in favour of: - stylesheets - namespaces For most documents the DOCTYPE will look something like (example from the HTML source of the XLL draft 970731): The Name HTML simply says that the root element of the document must be HTML (I assume that a validating parser will throw an error if it isn't). The FPI may be usable by some people, but that's another issue. The SystemID isn't much use on its own. DOCTYPEs will be problematic with WF documents from more than one namespace. P. 
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From peter at ursus.demon.co.uk Sun Jan 4 11:26:56 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Naming and Packaging (question 10 of 10) In-Reply-To: <199801040147.UAA00705@unready.microstar.com> Message-ID: <3.0.1.16.19980104121951.2e07bbde@pop3.demon.co.uk> >It is essential that we have unlimited control over any namespace that >we use, and that the package name be neutral enough that those >programmers with strong allegiances to the S*n, M*******t, and >N******e armed camps all feel comfortable using it. This issue >affects only Java. Firstly, I would be personally entirely happy for it to be based on microstar.com. Without David's effort this would not be off the ground. However I know this is a potentially sensitive area and we must tackle it carefully. I agree strongly that it should be domain-name based. This requires an organisation (not just a person). Among the considerations are: - an organisation of permanency - an organisation of effective neutrality - an organisation with the trust of the community - an organisation that may (de facto) give some sort of blessing to this effort. - an organisation that does not wish to 'own' the effort. - an organisation which is not compromised by the effort. - there may be a resource implication - i.e. people will look to that org for the latest version, etc.
This is a general problem and I'm sure attempts have been made to solve it already. I think that there may be non-commercial orgs who are aware of this problem (learned socs, international orgs, etc.) and > If there are people who have concerns that they wouldn't wish to post publicly, I am happy to receive them in confidence and - if necessary - to represent them anonymously. Classes ... > > com.microstar.sax.XmlParser the XML parser interface > com.microstar.sax.XmlApplication the XML application interface > com.microstar.sax.XmlAppBase the application base class, > or adaptor. > >Other possible names include "XmlProcessor" instead of "XmlParser" I am afraid that the language the spec uses is very confusing in that "processor" seems to be identical with what most people call "parser". I therefore think that "processor" should be avoided, even though it is the spec term. I also think that we should use spec terms wherever possible and refer to the spec. Thus if we have "getAttValue()" it should refer to [10] in the spec. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From digitome at iol.ie Sun Jan 4 12:18:42 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Prolog (question 8 of 10) Message-ID: <199801041218.MAA01500@mail.iol.ie> >At 20:17 03/01/98 -0500, David Megginson wrote: >> >>Should SAX include events for the start and end of the prolog, and/or >>for the DOCTYPE declaration? > YES for the DOCTYPE.
One of the things I think will emerge is a standard way for software to stare at a website full of XML documents and pull out only those XML documents of type "International Invoice Format" or "Universal Newswire Format" or whatever. Given that the root element can be any sub-tree of the DTD where else could this info go? Sean Mc Grath sean at digitome dot com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Jan 4 14:56:35 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Error Reporting (question 4 of 10) In-Reply-To: <34AEED7F.F082526A@technologist.com> References: <199801031750.MAA00346@unready.microstar.com> <34AEED7F.F082526A@technologist.com> Message-ID: <199801041451.JAA00298@unready.microstar.com> Paul Prescod writes: > I would prefer a getCurrentLocation() method that can be called > anywhere. I had considered adding something like that to XmlParser, but am not certain if the complexity is worth it -- if we allow it, then we might want to pass the current parser as an argument to each callback. All the best, David xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sun Jan 4 15:00:46 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Error Reporting (question 4 of 10) References: <199801031750.MAA00346@unready.microstar.com> <34AED033.651B87E8@jclark.com> <199801040106.UAA00530@unready.microstar.com> <34AF0F90.83A7F720@jclark.com> Message-ID: <34AFA42B.B1DC26BE@technologist.com> James Clark wrote: > > I feel pretty strongly that the right way to handle fatal XML errors in > Java in a production-oriented interface is with an exception and that > SAX needs to define an exception to cover fatal XML errors. The > exception should extend IOException so that it works with the > java.net.ContentHandler stuff. It seems very easy to map from a notWellFormed event to a notWellFormed exception and essentially impossible to map from the exception back into an event (with context etc.). I would thus prefer to leave it up to the application programmer whether to throw an exception or try and gather more errors. Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Jan 4 15:15:44 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: problem areas (was Re: SAX: Whitespace Handling) In-Reply-To: <3.0.1.16.19980104113953.2e0720aa@pop3.demon.co.uk> References: <199801031802.NAA00401@unready.microstar.com> <34AEEFC2.F5E6D094@technologist.com> <3.0.1.16.19980104113953.2e0720aa@pop3.demon.co.uk> Message-ID: <199801041511.KAA00391@unready.microstar.com> Peter Murray-Rust writes: > I also take it as almost axiomatic that SAX should support everything in > the spec *relevant to those areas it addresses*. IOW if it doesn't support > NOTATION it could ignore everything to do with that (e.g. NDATA. > NotationType) and might simply throw an Exception (SAX ignores NOTATION - > or whatever). [I am not making any judgment on NOTATION - but it is > possibly not a core component]. I suggest that parsers using SAX should be more than welcome to provide their own mechanisms for communicating information about notations -- they are simply not part of the SAX information set. For example, I might have this in the DTD: and this in the document SAX will simply report that the attribute "object" has the value "clip", without worrying that there is a notation called "video"; however, Ælfred, for example, will let you look up the type of "object", find out that it's an entity, look up the associated notation, and then get the notation's system identifier. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd.
dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Jan 4 15:19:46 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:46 2004 Subject: Namespaces In-Reply-To: <3.0.1.16.19980104115411.2e07e93e@pop3.demon.co.uk> References: <199801040117.UAA00580@unready.microstar.com> <3.0.1.16.19980104115411.2e07e93e@pop3.demon.co.uk> Message-ID: <199801041515.KAA00411@unready.microstar.com> Peter Murray-Rust writes: > DOCTYPEs will be problematic with WF documents from more than one namespace. Only if one particular namespace proposal is accepted. Personally, I (and many others) believe that architectural forms provide a much simpler, cleaner, and more powerful solution to the namespace problem -- a document using AFs can still be validated against a DTD, for example -- but perhaps the WG has already made up its collective mind to the contrary. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sun Jan 4 15:48:52 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: problem areas (was Re: SAX: Whitespace Handling) In-Reply-To: <199801041511.KAA00391@unready.microstar.com> References: <3.0.1.16.19980104113953.2e0720aa@pop3.demon.co.uk> <199801031802.NAA00401@unready.microstar.com> <34AEEFC2.F5E6D094@technologist.com> <3.0.1.16.19980104113953.2e0720aa@pop3.demon.co.uk> Message-ID: <3.0.1.16.19980104164504.1b172c36@pop3.demon.co.uk> At 10:11 04/01/98 -0500, David Megginson wrote: [... NOTATION example snipped...] > >SAX will simply report that the attribute "object" has the value >"clip", without worrying that there is a notation called "video"; >however, Ælfred, for example, will let you look up the type of >"object", find out that it's an entity, look up the associated >notation, and then get the notation's system identifier. This looks very reasonable to me. Should SAX report (or be prepared to report) that it has found document components and skipped them? It must carry out some minimal parsing of the bits it ignores - perhaps "ignorable DTD". P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ak117 at freenet.carleton.ca Sun Jan 4 17:55:40 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:46 2004
Subject: SAX: NDATA Entities
In-Reply-To: <3.0.1.16.19980104164504.1b172c36@pop3.demon.co.uk>
References: <3.0.1.16.19980104113953.2e0720aa@pop3.demon.co.uk> <199801031802.NAA00401@unready.microstar.com> <34AEEFC2.F5E6D094@technologist.com> <199801041511.KAA00391@unready.microstar.com> <3.0.1.16.19980104164504.1b172c36@pop3.demon.co.uk>
Message-ID: <199801041750.MAA00305@unready.microstar.com>

Peter Murray-Rust writes:

> Should SAX report (or be prepared to report) that it has found
> document components and skipped them? It must carry out some
> minimal parsing of the bits it ignores - perhaps "ignorable DTD".

I don't think that there should be a problem, because a notation is not part of a document's structure -- you declare a notation in the DTD, but in the document itself it appears directly only as the value of an attribute of type NOTATION and indirectly as the notation attached to an NDATA entity. In other words, there is nowhere that we _could_ generate an event.

Your question raises another problem, however. In the case of a NOTATION attribute, SAX will report the notation name as the attribute value, but will not indicate that it is, in fact, a notation; in the case of an ENTITY attribute, no information will be available through SAX directly beyond the entity name.

In other words, as proposed so far, SAX has no provision for useful processing of NDATA entities, since there is no way to determine the system ID of an entity, the notation associated with an NDATA entity, or the system ID of a notation. Ælfred provides and will continue to provide this information outside of the SAX interface -- is there a strong case for making it part of SAX (in the XmlParser interface) or do we expect simple applications to stick with URIs for external addressing and non-XML objects?

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From tbray at textuality.com Sun Jan 4 18:09:26 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:59:46 2004
Subject: SAX: External Entity Start and End (question 2 of 10)
Message-ID: <3.0.32.19980104100721.009b8c60@pop.intergate.bc.ca>

At 10:04 AM 03/01/98 -0500, David Megginson wrote:
>Should SAX generate events for the start and end of an external
>entity?

I'd say no on this one. Reasons:

- if we leave it out now we can add it later; the reverse is not true
- one of my goals for SAX is to present to people who want to see XML as "just elements and attributes". I happen to think this is a reasonable way to want to look at XML; present entities as an *authoring* convenience; from the point of view of the downstream parser, your handy local XML parser makes the issue go away
- I'm far from convinced that external entities will actually get that much use.

-Tim

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 18:10:33 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Whitespace Handling (question 5 of 10) Message-ID: <3.0.32.19980104095949.009b8880@pop.intergate.bc.ca> At 01:02 PM 03/01/98 -0500, David Megginson wrote: >Should SAX allow DTD-driven parsers to distinguish ignorable >whitespace from other character data? If you want to do this, the only reasonable way is with another argument on the charData() callback, so that it's always chardata, but some processors will in some circumstances signal that it's also ignorable. Since I think it would be highly unwise for any SAX-using application to have behavior dependent on the ignorability of some white space, I would argue strongly just for leaving this out. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 18:10:43 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Error Reporting (question 4 of 10) Message-ID: <3.0.32.19980104100920.009b8b70@pop.intergate.bc.ca> At 12:50 PM 03/01/98 -0500, David Megginson wrote: > public void warning (String message, String systemID, int line); > public void fatalError (String message, String systemID, int line); On this one, I agree with David and disagree with James. 
I don't see the advantages to using an exception. I think that a SAX processor should use fatal() (why the longer fatalError()?) - this has the advantage that you can, after the first message, go on looking for more fatal errors. Of course, a SAX processor must not, after the first fatal() callback, emit any more element() or charData() callbacks. I also think we should add a lineOffset argument, as someone suggested; but I don't think that SAX should *require* processors to do entity/line/offset tracking, since it's hard; SAX should specify that a value of -1 for any of these arguments means the processor doesn't know. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 18:10:46 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Processing Instructions (question 6 of 10) Message-ID: <3.0.32.19980104100838.009b8bd0@pop.intergate.bc.ca> At 01:12 PM 03/01/98 -0500, David Megginson wrote: >Should SAX implement a callback for processing instructions? Yes. As follows PI(String target, String remainder); Since a conformant processor has to split 'em up anyhow. If there's no remainder, i.e. , then the remainder argument should be the empty string "", not null. Or perhaps the remainder should be done with (char[], offset, length), like charData[] -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 18:10:56 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: towards a solution Message-ID: <3.0.32.19980104101009.009b88b0@pop.intergate.bc.ca> At 07:51 AM 03/01/98 -0500, David Megginson wrote: >We have had an interesting discussion about SAX ("simple API for XML"? >I cannot remember) the past few weeks, and now it's time to get >specific. > startElement (String name, java.util.Dictionary attributes) > endElement (String name) > charData (char ch[], int length) Sign me up with James on the inappropriateness of the Dictionary, and also as a supporter of his Attribute interface suggestion. Also, the charData should be (buf, offset, length) for consistency with Java culture. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 18:11:00 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Prolog (question 8 of 10) Message-ID: <3.0.32.19980104100256.009b8880@pop.intergate.bc.ca> At 08:17 PM 03/01/98 -0500, David Megginson wrote: >Should SAX include events for the start and end of the prolog, and/or >for the DOCTYPE declaration? No. Maybe later. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 18:11:03 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Parser Interface (question 9 of 10) Message-ID: <3.0.32.19980104100326.009b8880@pop.intergate.bc.ca> At 08:32 PM 03/01/98 -0500, David Megginson wrote: >So far, all of my questions have related to the callback interface >(XmlApplication); should SAX also include a simple interface for the >parser/processor itself, like the following? Yes. Why not? -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 18:11:06 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Document Start and End (question 1 of 10) Message-ID: <3.0.32.19980104100731.009b8880@pop.intergate.bc.ca> At 09:47 AM 03/01/98 -0500, David Megginson wrote: >In addition to the core events, should SAX have additional callbacks >for the start and end of a document? Agreed. Could we change the first method to startDocument(String root, String DTDSysID, String DTDPubID) these obviously each being null in the event the document doesn't provide them? -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 18:11:42 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Naming and Packaging (question 10 of 10) Message-ID: <3.0.32.19980104100853.009b8880@pop.intergate.bc.ca> At 08:47 PM 03/01/98 -0500, David Megginson wrote: >What package should the Java SAX belong to, and what should the >classes be named? Hmm. I own xml.com. I have not to date actually put it to use, but I plan to. Nonetheless, as its owner, I am happy to offer it for this purpose, and com.xml.sax has a nice ring. If anyone is feeling paranoid, I would be happy to sign an undertaking (a) never to interfere with or charge for this use of the xml.com namespace, and (b) should I sell xml.com sometime, to require the buyer to sign up to conditions (a) and (b). -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 18:11:45 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Comments (question 7 of 10) Message-ID: <3.0.32.19980104100148.009b8880@pop.intergate.bc.ca> At 01:21 PM 03/01/98 -0500, David Megginson wrote: >Should SAX include an event for comments? No. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tyler at infinet.com Sun Jan 4 18:20:15 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 16:59:46 2004
Subject: SAX: towards a solution
References: <199801031251.HAA00419@unready.microstar.com> <34AED641.164786BC@jclark.com> <199801040058.TAA00501@unready.microstar.com>
Message-ID: <342EE982.7EE9F578@infinet.com>

David Megginson wrote:

> James Clark writes:
>
> > I don't think using java.util.Dictionary is a good idea:
> >
> > 1. JDK 1.2 provides a new Map interface which replaces Dictionary.
> >
> > 2. java.util.Dictionary is an abstract base class not an interface.
> >
> > 3. java.util.Dictionary is weakly typed: it doesn't enforce the
> > requirement that keys be strings, and it requires values to be cast to
> > strings.
> >
> > I think it would be much better to have an Attributes interface and also
> > a convenience adapter class that provides a Dictionary implementation in
> > terms of that interface.
>
> I would like to avoid java.util.Map to keep SAX applet-friendly (it
> will be years before most browsers deployed support even 1.1). I
> agree that Dictionary is far less than ideal -- what do you imagine
> the attributes interface looking like?

I think JDK 1.2 support will arrive much sooner than JDK 1.1 support did, and it should not matter at all for applications anyway. As for Java applets, Sun's Java Activator should take care of that problem. It is not totally out yet, but the technical challenges of implementing something like this are minimal and I am surprised they did not do this a while ago. Using the new Collections classes in JDK 1.2 would be a much better idea IMHO.

Tyler

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Sun Jan 4 18:27:26 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: towards a solution References: <199801031251.HAA00419@unready.microstar.com> <34AED641.164786BC@jclark.com> <199801040058.TAA00501@unready.microstar.com> <34AF2942.2E4442E3@jclark.com> Message-ID: <342EEB41.AD258050@infinet.com> James Clark wrote: > /** > * An XmlAttributeSet is a set of named attributes each > * with a string value. > * Both specified and defaulted values are included > * and are not distinguished. > * Implied attributes are not included. > * The XML processor is free to modify the AttributeSet after the > * application returns from startElement. > * The application can use clone to make a copy of the AttributeSet > * which will not be modified by the XML processor. > */ > > public interface XmlAttributeSet extends Cloneable { > /** > * Return the value of the attribute with this name, or null is the > * set does not include an attribute. > */ > String get(String name); > > /** > * Return the number of attributes in the set. > */ > int getSize(); > > /** > * Get the name of the i-th attribute, where i is greater than or > * equal to 0 and less than the number of attributes in the set. > * The order of the attributes is not defined. > */ > String getName(int i); > > /** > * Get the value of the i-th attribute, where i is greater than or > * equal to 0 and less than the number of attributes in the set. 
> */
> String getValue(int i);
> }
>
> You could use an Iterator or Enumeration instead of
> getSize/getName/getValue, but I think it would probably be more
> complicated and less efficient.
>
> James

For those looking at an OMG IDL language independent solution this might work...

module XMLParser {
  interface XMLAttributeSet {
    string get(in string name);
    long getSize();
    string getValue(in long i);
  };
};

CORBA 2.0 BTW will be a standard part of the JDK 1.2.

Tyler

From ak117 at freenet.carleton.ca Sun Jan 4 18:33:11 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:46 2004
Subject: SAX: External Entity Start and End (question 2 of 10)
In-Reply-To: <3.0.32.19980104100721.009b8c60@pop.intergate.bc.ca>
References: <3.0.32.19980104100721.009b8c60@pop.intergate.bc.ca>
Message-ID: <199801041828.NAA00634@unready.microstar.com>

Tim Bray writes:

> - one of my goals for SAX is to present to people who want to see XML
> as "just elements and attributes". I happen to think this is a
> reasonable way to want to look at XML; present entities as an
> *authoring* convenience; from the point of view of the downstream
> parser, your handy local XML parser makes the issue go away

In principle, I agree with you entirely -- the job of an XML parser is to present a document as a single, logical structure, regardless of its physical layout, and physical features like entities should have no place in SAX (except possibly in the case of entity resolution).
As James has pointed out, however, the problem comes with the fact that many people are proposing the use of URI's instead of entities for external references. For example, imagine that I have a simple XML document, with no external entities except for the external DTD subset, at the location http://myhost.com/doc.xml: [...] Now, within the DTD, at http://yourhost.com/gendoc.dtd", the following PI appears: How do I resolve this relative URI? If I use the URI of the document root, then I will get http://myhost.com/stylesheet.xsl However, the DTD designer almost certainly intended this to resolve to http://yourhost.com/stylesheet.xsl The only way that I can resolve this correctly is if I know the URI of the current external entity. In Lark, you provide this information with a separate Entity argument to each callback; this is a legitimate (and more powerful) option, but from the perspective of SAX, it ends up complicating the entire API instead of just adding two easily-ignored callbacks. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Jan 4 18:37:07 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Error Reporting (question 4 of 10) In-Reply-To: <3.0.32.19980104100920.009b8b70@pop.intergate.bc.ca> References: <3.0.32.19980104100920.009b8b70@pop.intergate.bc.ca> Message-ID: <199801041832.NAA00658@unready.microstar.com> Tim Bray writes: > On this one, I agree with David and disagree with James. 
I don't > see the advantages to using an exception. I think that a SAX processor > should use fatal() (why the longer fatalError()?) - this has the > advantage that you can, after the first message, go on looking for > more fatal errors. Of course, a SAX processor must not, after the first > fatal() callback, emit any more element() or charData() callbacks. The only problem here is that the element context could be useful for error reporting (i.e. "Error in in
in "). When XML documents are machine generated, this type of an error message might be more useful than a line number. Thanks, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Jan 4 18:41:44 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: Document Start and End (question 1 of 10) In-Reply-To: <3.0.32.19980104100731.009b8880@pop.intergate.bc.ca> References: <3.0.32.19980104100731.009b8880@pop.intergate.bc.ca> Message-ID: <199801041837.NAA00678@unready.microstar.com> Tim Bray writes: > Agreed. Could we change the first method to > startDocument(String root, String DTDSysID, String DTDPubID) > these obviously each being null in the event the document doesn't > provide them? -Tim This is a great idea, but it will require an implementation to queue events. For example, if I have I have to queue the PI event(s) until I have found the DOCTYPE, so that I will know the root element type and the URI of the external DTD. If this information is important (and others have argued that it is), then a separate docType() event would probably be appropriate. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Sun Jan 4 18:46:01 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: do we want a base class References: <18271237114262@pragmaticainc.com> <16552194814198@pragmaticainc.com> <199801031251.HAA00419@unready.microstar.com> <199801031732.MAA00275@unready.microstar.com> <18271237114262@pragmaticainc.com> <18574174214275@pragmaticainc.com> Message-ID: <342EEFC2.56955EB@infinet.com> David Ornstein wrote: > David Megginson wrote [10:39 AM 1/3/98 ]: > >I should clarify: the SAX interfaces and the XmlAppBase base class > >will be written only once for each programming language, and they will > >live in their own package (in languages that use packages). > >XmlAppBase will depend only on the interfaces for information -- in > >other words, there will be no such thing as a SAX-implementing parser > >without the base class (parser writers need not be concerned with the > >base class at all). > > Ahh. I was assuming that we were aiming to be able to have the SAX Client > and SAX Implemention be in different languages. The practical case I'm > thinking about is that I write in C++ and would like to be able to use the > Java-based Implementations. Assuming a platform (for example win32) that > provides interoperability at the interface level between multiple > languages, or really any language-neutral IDL-based mechanism, shouldn't > that be possible? > > I guess we should answer that question first. > > If we say that, yes, we want cross-language interoperability (and IDL?), > then I maintain my stand. If not, I quickly retract... 
> > David > > As long as SAX does not have any language dependent features from Java such as extending classes in the core Java API. I think it would be much better to just have an interface that does not extend anything, and then leave the implementation details (whether or not to use java.util.Dictionary or the new java.util.Map class in JDK 1.2). In fact, it would be great if this sort of stuff was defined in OMG IDL and even Microsoft IDL (for DCOM) if at all possible. There are many times where ORB's like CORBA provide a bridge between new applications which may use XML and legacy applications like databases which are too inconvenient to get at with just Java and JDBC. So again, why not just do all of this in IDL? Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sun Jan 4 19:05:34 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:46 2004 Subject: SAX: External Entity Start and End (question 2 of 10) In-Reply-To: <199801041828.NAA00634@unready.microstar.com> References: <3.0.32.19980104100721.009b8c60@pop.intergate.bc.ca> <3.0.32.19980104100721.009b8c60@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19980104200407.2f0ff666@pop3.demon.co.uk> At 13:28 04/01/98 -0500, David Megginson wrote: >However, the DTD designer almost certainly intended this to resolve to > > http://yourhost.com/stylesheet.xsl Agreed. > >The only way that I can resolve this correctly is if I know the URI of >the current external entity. 
In Lark, you provide this information >with a separate Entity argument to each callback; this is a legitimate >(and more powerful) option, but from the perspective of SAX, it ends >up complicating the entire API instead of just adding two >easily-ignored callbacks. In developing JUMBO I have found it necessary to keep track of URL/Is of the subcomponents of a document. I suspect that XMLers will very soon start using XML for distributed documents and wish to know where the components come from. Thus a standard document is increasingly likely to have transcluded information or meta-information (the stylesheet is an example). so, if it's not too difficult, including entity info could be much used. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sun Jan 4 19:09:19 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Document Start and End (question 1 of 10) In-Reply-To: <199801041837.NAA00678@unready.microstar.com> References: <3.0.32.19980104100731.009b8880@pop.intergate.bc.ca> <3.0.32.19980104100731.009b8880@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19980104195508.46a7e0de@pop3.demon.co.uk> At 13:37 04/01/98 -0500, David Megginson wrote: >Tim Bray writes: > > > Agreed. Could we change the first method to > > startDocument(String root, String DTDSysID, String DTDPubID) > > these obviously each being null in the event the document doesn't > > provide them? 
-Tim > >This is a great idea, but it will require an implementation to queue >events. For example, if I have > > > > >I have to queue the PI event(s) until I have found the DOCTYPE, so >that I will know the root element type and the URI of the external >DTD. If this information is important (and others have argued that it >is), then a separate docType() event would probably be appropriate. I have (sort of) run into this problem in JUMBO. Since I do not (yet) know how any PI is going to be used (or what additional PIs or PI-like syntax the WG will put forward) I have had to keep some sort of track of PIs regardless of where they come. It could be logical to say that all PIs before the DOCTYPE were ignored, but then we could find that the WG used such PIs for something in the future. It's probably also necessary to assume that order matters for PIs (i.e. they may be used to switch behaviour on or off). [That's the problem of trying to support something like PIs, whose function is (deliberately) left undefined :-)]. P. > > >All the best, > > >David > >-- >David Megginson ak117@freenet.carleton.ca >Microstar Software Ltd. dmeggins@microstar.com > http://home.sprynet.com/sprynet/dmeggins/ > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Jan 4 19:44:11 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Document Start and End (question 1 of 10) Message-ID: <3.0.32.19980104114340.009b3ae0@pop.intergate.bc.ca> At 01:37 PM 04/01/98 -0500, David Megginson wrote: > > Agreed. Could we change the first method to > > startDocument(String root, String DTDSysID, String DTDPubID) > > these obviously each being null in the event the document doesn't > > provide them? -Tim > >This is a great idea, but it will require an implementation to queue >events. For example, if I have > > > > >I have to queue the PI event(s) until I have found the DOCTYPE Why? The PI is 'before' the doc in any meaningful sense... so they get some PI events before they get the startDocument event. Is this a problem? -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Jan 4 22:32:26 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Document Start and End (question 1 of 10) In-Reply-To: <3.0.32.19980104114340.009b3ae0@pop.intergate.bc.ca> References: <3.0.32.19980104114340.009b3ae0@pop.intergate.bc.ca> Message-ID: <199801042227.RAA00267@unready.microstar.com> Tim Bray writes: > Why? 
The PI is 'before' the doc in any meaningful sense... so they
> get some PI events before they get the startDocument event. Is this
> a problem? -Tim

Not a serious one, but it is an inconvenience -- in my original proposal, startDocument would be guaranteed to be the first event called (thus, it would be a convenient place to allocate structures, etc.).

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Sun Jan 4 22:50:46 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: Attributes and Entity Resolution
Message-ID: <199801042245.RAA00338@unready.microstar.com>

During our discussion this weekend, we have had two excellent proposals for additional SAX interfaces, beyond just XmlParser and XmlProcessor:

1. An interface for entity resolution, rather than using a resolveEntity callback.

2. An interface for attributes, rather than using java.util.Dictionary.

ENTITY RESOLUTION
-----------------

While I agree that a full entity manager would be more powerful than a simple callback, I am not certain that the power will really be needed by most SAX users; furthermore, if it is needed, that functionality can be supplied more generally by an HTTP or FTP proxy server. For now, then, I recommend that we stick with the resolveEntity callback, which is simple to use and to learn, but provides 80% of the required functionality (that's 80% in the abstract 80/20 sense).
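
[Editorial illustration] A minimal sketch of what a resolveEntity-style callback could look like in Java. The thread does not give a signature, so the interface name EntityResolverCallback, the two-argument form (system ID plus the URI of the referring entity), and the local-copy redirect are all assumptions made for illustration; only java.net.URI is a real library class. The sketch shows why the referring entity's URI matters: the same relative system ID resolves differently against the document and against the DTD.

```java
import java.net.URI;

/** Hypothetical callback; name and signature are assumptions, not from the thread. */
interface EntityResolverCallback {
    /**
     * Return the absolute URI the parser should read for an entity.
     * systemId is the (possibly relative) system identifier; baseUri is the
     * URI of the entity in which the reference appeared.
     */
    String resolveEntity(String systemId, String baseUri);
}

class ResolveEntitySketch {
    /** Default behaviour: resolve the system ID relative to the referring entity. */
    static final EntityResolverCallback DEFAULT =
        (systemId, baseUri) -> URI.create(baseUri).resolve(systemId).toString();

    /** Wrap the default so that one remote resource is redirected to a local copy. */
    static EntityResolverCallback withLocalCopy(String remote, String local) {
        return (systemId, baseUri) -> {
            String absolute = DEFAULT.resolveEntity(systemId, baseUri);
            return absolute.equals(remote) ? local : absolute;
        };
    }

    public static void main(String[] args) {
        // Same relative reference, two different referring entities.
        System.out.println(DEFAULT.resolveEntity(
            "stylesheet.xsl", "http://myhost.com/doc.xml"));
        System.out.println(DEFAULT.resolveEntity(
            "stylesheet.xsl", "http://yourhost.com/gendoc.dtd"));

        // An application-supplied redirect (the local path is made up).
        EntityResolverCallback cached = withLocalCopy(
            "http://yourhost.com/gendoc.dtd", "file:///cache/gendoc.dtd");
        System.out.println(cached.resolveEntity(
            "http://yourhost.com/gendoc.dtd", "http://myhost.com/doc.xml"));
    }
}
```

A single callback of this shape covers the common cases (relative resolution and simple redirection) without a full entity manager, which is the trade-off argued for above.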

ATTRIBUTES
----------

The good arguments and patient explanations of list members have convinced me that java.util.Dictionary is unsuitable for three reasons: because it is a base class rather than an interface, because it is already deprecated in Java 1.2, and (most importantly) because there is no single, obvious equivalent in other programming languages.

So what do we do? It is certainly tempting to introduce a new interface for attribute resolution, and that in itself would not bloat SAX too much, but if we did that, why not add other interfaces? It would be nice, for example, to have an Element interface, an Entity interface, a PI interface, a characterData interface, etc., all of which implement useful functionality; in the end, we will have rewritten the DOM.

The alternative is to return to what I had originally done with Ælfred, and generate a separate event for each attribute:

  public void attribute (String elementName, String aname, String value);

For example, with the following markup

  <para id="p1" level="advanced">This is a paragraph.</para>

We would have five SAX events:

  attribute: elementName="para" aname="id" value="p1"
  attribute: elementName="para" aname="level" value="advanced"
  startElement: name="para"
  charData: ch="This is a paragraph."
  endElement: name="para"

This is not pretty, but it is simple, and should translate cleanly to all languages.

I am far from decided on this point, and encourage further public discussion.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From davido at pragmaticainc.com Sun Jan 4 23:28:17 1998
From: davido at pragmaticainc.com (David Ornstein)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: Attributes and Entity Resolution
In-Reply-To: <199801042245.RAA00338@unready.microstar.com>
Message-ID: <23271625715109@pragmaticainc.com>

David Megginson wrote [02:45 PM 1/4/98 ]:
>ATTRIBUTES
>----------
>
>The alternative is to return to what I had originally done with
>Ælfred, and generate a separate event for each attribute:
>
> public void attribute (String elementName, String aname, String value);

I assume that this approach mandates that the Client accumulate these and then use them when the element comes along. Yick. It's OK, but not great. I'd rather have the interface. I don't much buy the argument that adding another interface (or even two more) will lead us straight to DOM.

================================
David Ornstein
Pragmatica, Inc.
http://www.pragmaticainc.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From digitome at iol.ie Sun Jan 4 23:40:15 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: Attributes
Message-ID: <199801042340.XAA10363@mail.iol.ie>

>
>The alternative is to return to what I had originally done with
>Ælfred, and generate a separate event for each attribute:
>
> public void attribute (String elementName, String aname, String value);
>

Separate attribute events raise a question. Unlike SGML, XML does not specify that the order in which attribute values are supplied in a start-tag is insignificant to applications. (Perhaps I missed something in the spec?) A processor working sans DTD obviously cannot determine a declaration order in which to generate the events.

Annex G of the SGML handbook just says the order is insignificant. However, nsgmls ESIS supplies them in declaration order and I for one have written apps that relied on that. Perhaps writing apps that rely on a specific order of attribute event arrival is just plain bad design!

Anyway, the dictionary approach, supplying all the attributes in one go, side-steps buffering and state space for simple little SAX apps that want to pick up attributes X and Y of element Z and nothing else.

While on this subject, what comes first, a start-tag event or its attribute event(s)?

Sean Mc Grath
sean at digitome dot com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From antony at n-space.com.au Mon Jan 5 00:47:00 1998
From: antony at n-space.com.au (Antony Blakey)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: Comments (question 7 of 10)
References: <199801031821.NAA00477@unready.microstar.com>
Message-ID: <34B02D5D.10E91A98@n-space.com.au>

David Megginson wrote:
> Should SAX include an event for comments?
>
> public void comment (char ch[], int length);

YES. Think of two tools you can't write without this: transformers and javadoc-style documentation tools. Javadoc is a revolution in documentation, not because it is particularly great in itself, but because it lowers the barrier to documentation production. Also, I often have to transform documents from authors, which then go back into the authoring process. Achieving identity is a requirement in this case.

Our company uses XML/SGML for everything from documents to declarative multimedia product definitions, system configuration files, scripting environment management, configuration management, CORBA-style RPC (transport layer) and more. Once you start using XML as your canonical data format, the identity transform becomes critical. Just imagine if perl converted all characters above 127 to hex escapes on output. Or worse - deleted them.

Furthermore, the CONs in this case don't seem all that compelling. In particular, point 2 (...might encourage comment abuses...) seems rather prescriptive. It's just a tool after all.

> Another lexical feature that I am not discussing here is CDATA
> sections; I assume that, when the parser is reporting character data,
> it does not matter how the parser obtained those characters (in a
> CDATA section, or in regular #PCDATA with the delimiters escaped using
> references). I am happy, of course, to listen to other opinions on
> this subject.

This is necessary for the identity transform.

+----------------------------------+
| Antony Blakey                    |
| N-Space Pty Ltd                  |
| Java - CORBA - SGML - XML        |
| mailto:antony@n-space.com.au     |
| http://www.n-space.com.au        |
+----------------------------------+

From ak117 at freenet.carleton.ca Mon Jan 5 01:18:59 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: Attributes
In-Reply-To: <199801042340.XAA10363@mail.iol.ie>
References: <199801042340.XAA10363@mail.iol.ie>
Message-ID: <199801050113.UAA00313@unready.microstar.com>

Sean Mc Grath writes:
> Annex G of the SGML handbook just says the order is insignificant. However,
> nsgmls ESIS supplies them in declaration order and I for one have written
> apps that relied on that. Perhaps writing apps that rely on a specific
> order of attribute event arrival is just plain bad design!

The order of attributes should never be significant (how would you deal with defaulted attribute values?) -- perhaps making this point explicit would be a good minor change for the PR (Tim?).

> Anyway, the dictionary approach, supplying all the attributes in one go
> side-steps buffering and state space for simple little SAX apps.
that want > to pick > up attributes X and Y of element Z and nothing else. > > While on this subject, what comes first, a start-tag event or its attribute > event(s)? I prefer to put the attributes first, so that all the information will have been delivered by the time the startElement event arrives. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Jan 5 02:00:46 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Comments (question 7 of 10) In-Reply-To: <34B02D5D.10E91A98@n-space.com.au> References: <199801031821.NAA00477@unready.microstar.com> <34B02D5D.10E91A98@n-space.com.au> Message-ID: <199801050122.UAA00342@unready.microstar.com> Antony Blakey writes: > YES. Think of two tools you can't write without this: transformers and > javadoc-style documentation tools. Javadoc is a revolution in > documentation, not because it is particularly great in itself, but > because it lowers the barrier to documentation production. Also, I often > have to transform documents from authors, which then go back into the > authoring process. Achieving identity is a requirement in this case. Thank you very much for your comments. I'm not certain, however, that either of these is a strong argument for SAX. 
In the first case, SAX is designed to report not the physical appearance of the document but its logical structure -- it does not preserve internal entity references in data or attribute value literals, it does not distinguish defaulted attribute values from specified ones, it does not preserve the internal DTD subset, etc. etc. Even then, SAX will still be useful for doing identity transforms for a wide range of applications (say, transforming a document before formatting it), as long as the result of the transformation does not have to become the master authoring document. In the second case, I think that it would be a very bad idea to implement a JavaDoc-type facility using XML comments. JavaDoc has to use comments because it is not possible to extend Java syntax; XML allows you to define your own grammar, so the documentation can be part of the fundamental element structure. For example, instead of http://home.sprynet.com/sprynet/dmeggins/ dmeggins@microstar.com you should use Record for David Megginson http://home.sprynet.com/sprynet/dmeggins/ dmeggins@microstar.com All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From antony at n-space.com.au Mon Jan 5 02:13:26 1998 From: antony at n-space.com.au (Antony Blakey) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Comments (question 7 of 10) References: <199801031821.NAA00477@unready.microstar.com> <34B02D5D.10E91A98@n-space.com.au> <199801050122.UAA00342@unready.microstar.com> Message-ID: <34B041AA.8D805120@n-space.com.au> David Megginson wrote: > Thank you very much for your comments. I'm not certain, however, that > either of these is a strong argument for SAX. > > In the first case, SAX is designed to report not the physical > appearance of the document but its logical structure -- it does not > preserve internal entity references in data or attribute value > literals, it does not distinguish defaulted attribute values from > specified ones, it does not preserve the internal DTD subset, > etc. etc. Deep unjoy. Is there any reason not to do both of these things - It wouldn't be that difficult. I remember a saying 'Things should be as simple as possible, but no simpler'. It seems to me that this SAX effort might be letting the quest for simplicity eliminate a whole heap of useful applications. > In the second case, I think that it would be a very bad idea to > implement a JavaDoc-type facility using XML comments. JavaDoc has to > use comments because it is not possible to extend Java syntax; XML > allows you to define your own grammar, so the documentation can be > part of the fundamental element structure. 
For example, instead of
>
>
>
> http://home.sprynet.com/sprynet/dmeggins/
> dmeggins@microstar.com
>
>
> you should use
>
>
> Record for David Megginson
> http://home.sprynet.com/sprynet/dmeggins/
> dmeggins@microstar.com
>

I agree, but your example implies that my comments were about the data, rather than about the structure itself - I guess I should have pointed out that I'm interested in comments in the DTD, so that the DTD can be documented automatically. This is more like javadoc/idldoc. I'd love an xmldoc tool.

I'm guessing now that SAX doesn't give me DTD events. I guess SAX is not that useful for me given its intention (although I'm pleased to see your effort).

Back to the drawing board for me :(

+----------------------------------+
| Antony Blakey                    |
| N-Space Pty Ltd                  |
| Java - CORBA - SGML - XML        |
| mailto:antony@n-space.com.au     |
| http://www.n-space.com.au        |
+----------------------------------+

From ak117 at freenet.carleton.ca Mon Jan 5 03:23:40 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: Status Report
Message-ID: <199801050318.WAA01031@unready.microstar.com>

Here is a preliminary status report on SAX, summarising both public and private correspondence. This is, in fact, _very_ preliminary, since some potential participants read e-mail only at work, and will not even know about this thread until tomorrow (Monday) morning.

CORE EVENTS
-----------

So far, there seems to be general agreement on the following event callbacks for the XmlApplication interface:

  public void startDocument ();
  public void endDocument ();
  public void endElement (String name);
  public void processingInstruction (String name, String remainder);

There is general agreement that the following two should be present, but still discussion over their exact form (I'm still tweaking the names a bit):

  public void characters (char ch[], int start, int length, ...?);
  public void startElement (String name, ...?);

(For the first, there is the question of a flag for ignorable whitespace, and for the second, the question of how to report attributes).

ENTITIES
--------

There has been a lively and well-informed discussion on entity handling. Many participants are comfortable with something like the following for external entities (including the external DTD subset, which may contain processing instructions):

  public void startEntity (String ename, String publicID, String systemID);
  public void endEntity (String ename);

(There is also a question about whether public IDs should be provided).

Some others suggest that SAX should provide no information about external entities, while others suggest that the XmlParser interface should have a getLocation() method instead. The main motivation for providing external-entity information (aside from error reporting) is to resolve relative URIs in attribute values.

On the issue of entity resolution, there has been less feedback, probably because the topic is a little confusing. I have suggested something like this

  public String resolveEntity (String ename, String publicID, String systemID);

which would allow simple URI substitution and resolution of public identifiers, if desired (in most cases, you could simply return the systemID argument unmodified). Another suggestion is a separate EntityManager interface which would allow much more functionality.
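
A minimal sketch of how an application might implement the resolveEntity callback described above, offered only as an illustration. Only the method signature comes from this report; the class name, the catalog map, and the identifiers in it are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; only the resolveEntity signature comes from the
// status report. The catalog and its single entry are made-up sample data.
class ResolveEntityDemo {
    // Maps public identifiers to local system identifiers (illustrative only).
    private final Map<String, String> catalog = new HashMap<>();

    ResolveEntityDemo() {
        catalog.put("-//EXAMPLE//DTD Memo//EN", "file:///dtds/memo.dtd");
    }

    // Returns the URI the parser should actually read for this entity.
    public String resolveEntity(String ename, String publicID, String systemID) {
        if (publicID != null && catalog.containsKey(publicID)) {
            return catalog.get(publicID);   // public ID known locally: substitute
        }
        return systemID;                    // common case: pass through unmodified
    }

    public static void main(String[] args) {
        ResolveEntityDemo app = new ResolveEntityDemo();
        System.out.println(app.resolveEntity(
            "memo", "-//EXAMPLE//DTD Memo//EN", "http://example.org/memo.dtd"));
        System.out.println(app.resolveEntity(
            "logo", null, "http://example.org/logo.gif"));
    }
}
```

The pass-through branch is the "return the systemID argument unmodified" case; anything richer (caching, proxying) is what a separate EntityManager interface would presumably cover.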

ERROR REPORTING
---------------

A majority of participants seem to support using callbacks for error reporting, partially to simplify cross-language support:

  public void warning (String message, int line, int column);
  public void fatal (String message, int line, int column);

Note the addition of the 'column' argument -- it has rightly been pointed out that XML documents can consist of a single, long line, so the line number itself may be useless. If we do not have some general way to determine the current entity (i.e. startEntity and endEntity), we will also have to supply the URI of the current entity here.

PROLOG
------

No one sees a need for startProlog and endProlog events, but several people would like to see an event for the DOCTYPE, if present:

  public void doctype (String name, String publicID, String systemID);

where publicID and systemID refer to the external DTD subset, if any. This would help with autodetection of different document types.

COMMENTS
--------

Most people agree that there is no need for SAX to report comments.

PARSER
------

Everyone seems to like the idea of a common parser interface.
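
To make the point about the 'column' argument concrete, here is a rough sketch of an application collecting messages through the warning and fatal callbacks from the ERROR REPORTING section. The two signatures are the ones proposed there; the class name, the message format, and the sample errors are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical application-side class; only the warning/fatal signatures are
// taken from the status report. The output format is an assumption.
class ErrorLogDemo {
    final List<String> messages = new ArrayList<>();

    public void warning(String message, int line, int column) {
        messages.add(format("warning", message, line, column));
    }

    public void fatal(String message, int line, int column) {
        messages.add(format("fatal", message, line, column));
    }

    private static String format(String level, String message, int line, int column) {
        return level + " at " + line + ":" + column + ": " + message;
    }

    public static void main(String[] args) {
        ErrorLogDemo log = new ErrorLogDemo();
        // A document written as one long line: both problems are on line 1,
        // so only the column distinguishes them.
        log.warning("undeclared entity", 1, 214);
        log.fatal("mismatched end tag", 1, 5031);
        for (String m : log.messages) {
            System.out.println(m);
        }
    }
}
```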

ARTIST'S RENDITION
------------------

Things are still up in the air, but here is some indication of what SAX's central XmlApplication interface might look like in Java:

  /* Beginning of XmlApplication.java */

  public interface XmlApplication {

    //
    // Entities
    //

    public String resolveEntity (String ename, String publicID, String systemID);
    public void startEntity (String ename, String publicID, String systemID);
    public void endEntity (String ename);

    //
    // Document structure
    //

    public void startDocument ();
    public void endDocument ();
    public void doctype (String name, String publicID, String systemID);
    public void startElement (String name /* and attributes, somehow */);
    public void endElement (String name);
    public void characters (char ch[], int start, int length, boolean ignorable);
    public void processingInstruction (String name, String remainder);

    //
    // Error reporting
    //

    public void warning (String message, int line, int column);
    public void fatal (String message, int line, int column);

  }

  /* end of XmlApplication.java */

All of these would have default implementations in XmlAppBase -- the seven core document-structure callbacks would all have empty implementations.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From jjc at jclark.com Mon Jan 5 04:58:58 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: Error Reporting (question 4 of 10)
References: <199801031750.MAA00346@unready.microstar.com> <34AED033.651B87E8@jclark.com> <199801040106.UAA00530@unready.microstar.com> <34AF0F90.83A7F720@jclark.com> <34AFA42B.B1DC26BE@technologist.com>
Message-ID: <34B03C9F.767593B@jclark.com>

Paul Prescod wrote:
>
> James Clark wrote:
> >
> > I feel pretty strongly that the right way to handle fatal XML errors in
> > Java in a production-oriented interface is with an exception and that
> > SAX needs to define an exception to cover fatal XML errors. The
> > exception should extend IOException so that it works with the
> > java.net.ContentHandler stuff.
>
> It seems very easy to map from a notWellFormed event to a notWellFormed
> exception and essentially impossible to map from the exception back into
> an event (with context etc.).

Not at all. The exception can carry all the context that an event does (like URL, line number and so on). An exception can easily be mapped to an event:

  parser.setApp(app);
  try {
    parser.run();
  }
  catch (XmlNotWellFormedException e) {
    parser.fatalError(e);
  }

You can't generate more than one fatal error event with this approach, but that seems well out of the scope of a simple interface like SAX.
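
A self-contained sketch of that exception-to-event mapping, under stated assumptions: the exception name is the one used above, but its fields, the ToyParser stand-in, and the FatalHandler callback type are all hypothetical and exist only so the example runs.

```java
import java.io.IOException;

// Hypothetical sketch: XmlNotWellFormedException is named in the message above,
// but its fields and the ToyParser/FatalHandler types are assumptions.
class ExceptionToEventDemo {

    // Carries the same context an error event would (location plus message).
    static class XmlNotWellFormedException extends IOException {
        final String uri;
        final int line;
        final int column;

        XmlNotWellFormedException(String message, String uri, int line, int column) {
            super(message);
            this.uri = uri;
            this.line = line;
            this.column = column;
        }
    }

    // Callback-style sink for applications that prefer events.
    interface FatalHandler {
        void fatal(String message, int line, int column);
    }

    // Stand-in for a real parser: it always fails at a fixed position.
    static class ToyParser {
        void run() throws IOException {
            throw new XmlNotWellFormedException("mismatched end tag", "file:///doc.xml", 12, 7);
        }
    }

    // Adapter: run the parser and turn the exception back into a single event.
    // Other IOExceptions (network, file) still propagate to the caller.
    static void runWithCallback(ToyParser parser, FatalHandler handler) throws IOException {
        try {
            parser.run();
        } catch (XmlNotWellFormedException e) {
            handler.fatal(e.getMessage(), e.line, e.column);
        }
    }

    public static void main(String[] args) throws IOException {
        runWithCallback(new ToyParser(),
            (msg, line, col) -> System.out.println("fatal " + line + ":" + col + " " + msg));
    }
}
```

Because the exception extends IOException, a caller that already handles IOException needs no extra catch clause; the adapter is only for code that wants the callback style.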

Apart from the fact that it is the right thing to do (show me a Java API that uses a callback for a fatal error), there are two other reasons why throwing an exception is the right approach:

- Representing information about the error as an object allows much better extensibility: implementations can extend XmlNotWellFormedException to provide richer error reporting, and this can be very conveniently exploited by applications.

- A parser will read from a URL, thus it is already the case that it will generate IOExceptions, and thus applications have to be prepared to deal with this already. (I hope nobody is suggesting that the parser should try to catch IOExceptions.) By deriving XmlNotWellFormedException from IOException, an application doesn't have to write any additional code to deal with fatal errors.

The fact that a parser needs to be able to throw IOExceptions makes deriving the parser interface from Runnable unworkable, because run can't throw any checked exceptions. Instead an app would need to create its own runnable that calls the parser inside a try statement, and catches and deals with any IOExceptions.

James

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Mon Jan 5 04:59:22 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Attributes and Entity Resolution References: <199801042245.RAA00338@unready.microstar.com> Message-ID: <34B04110.2F15EA17@jclark.com> David Megginson wrote: > While I agree that a full entity manager would be more powerful than a > simple callback, I am not certain that the power will really be needed > by most SAX users; furthermore, if it is needed, that functionality > can be supplied more generally by an HTTP or FTP proxy server. For > now, then, I recommend that we stick with the resolveEntity callback, > which is simple to use and to learn, but provides 80% of the required > functionality (that's 80% in the abstract 80/20 sense). I don't think the entity manager interface has to be any more complicated than a single resolveEntity callback. My main point is that this doesn't belong as part of the App. Putting separate pieces of functionality into separate interfaces does not make things harder to use and learn; on the contrary it makes it easier. > public void attribute (String elementName, String aname, String value); > > For example, with the following markup > > This is a paragraph. > > We would have five SAX events: > > attribute: elementName="para" aname="id" value="p1" > attribute: elementName="para" aname="level" value="advanced" > startElement: name="para" Putting attributes before the start element would be seriously confusing: in the markup the element name comes before the attributes, and the attributes are logically part of the element. 
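
To make the ordering question concrete, here is a minimal sketch of what a client has to do when attribute events arrive before startElement, as in the quoted proposal. The callback names follow the proposal; the buffering logic, the class name, and the serialisation are assumptions (no escaping of attribute values is attempted).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client for the attributes-before-startElement ordering.
// Callback names follow the quoted proposal; everything else is illustrative.
class AttributeBufferDemo {
    // Attributes seen so far, in arrival order, waiting for their element.
    private final Map<String, String> pending = new LinkedHashMap<>();
    final StringBuilder out = new StringBuilder();

    public void attribute(String elementName, String aname, String value) {
        pending.put(aname, value);          // hold until the element itself arrives
    }

    public void startElement(String name) {
        out.append('<').append(name);
        for (Map.Entry<String, String> a : pending.entrySet()) {
            // Values are written as-is; a real serialiser would escape them.
            out.append(' ').append(a.getKey()).append("=\"").append(a.getValue()).append('"');
        }
        out.append('>');
        pending.clear();                    // the buffered attributes belong to this element only
    }

    public static void main(String[] args) {
        AttributeBufferDemo client = new AttributeBufferDemo();
        client.attribute("para", "id", "p1");
        client.attribute("para", "level", "advanced");
        client.startElement("para");
        System.out.println(client.out);     // the reassembled start-tag
    }
}
```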
Having an attribute callback that happens after the startElement makes some sense. It is to some extent arbitrary whether information is represented as subelements or attributes; having subelements and attributes be represented in a similar way would be consistent with this. I think I would also pass the element type name as an additional argument to the attribute call, since the name of an attribute is in general meaningful only in the context of a particular element type. It's also useful to know when the attributes have ended and the content is starting, and I would have a callback for this that also passed the element type name. This isn't pretty. An alternative simple approach would be to have startElement(String elemName, String[] attNames, String[] attVals, int nAtts); As with the charData callback, the parser would be free to mutate the arrays once the startElement method returns, and so it would not need to allocate two arrays for every element. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Mon Jan 5 04:59:40 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Document Start and End (question 1 of 10) References: <3.0.32.19980104114340.009b3ae0@pop.intergate.bc.ca> Message-ID: <34B04265.52EAB1E0@jclark.com> Tim Bray wrote: > > At 01:37 PM 04/01/98 -0500, David Megginson wrote: > > > Agreed. Could we change the first method to > > > startDocument(String root, String DTDSysID, String DTDPubID) > > > these obviously each being null in the event the document doesn't > > > provide them? 
-Tim > > > >This is a great idea, but it will require an implementation to queue > >events. For example, if I have > > > > > > > > > >I have to queue the PI event(s) until I have found the DOCTYPE > > Why? The PI is 'before' the doc in any meaningful sense... so they > get some PI events before they get the startDocument event. Is this > a problem? -Tim I think this is inconsistent with the XML spec: document ::= prolog element Misc* prolog ::= XMLDecl? Misc* (doctypedecl Misc*)? PIs before the doctypedecl are part of the document. I also feel that it's wrong to give special treatment to the external identifier in the doctypedecl. is just a convenient shorthand for %doc; ]> James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From smith at interlog.com Mon Jan 5 07:07:57 1998 From: smith at interlog.com (Chris Smith) Date: Mon Jun 7 16:59:47 2004 Subject: XML and Using It With Whitespace In-Reply-To: <3.0.1.16.19971211143739.37172be2@pop3.demon.co.uk> Message-ID: Sorry if the subject is confusing, but it's a really concise proposal for getting whitespace through - guaranteed. I earlier posted a query about the behaviour of various parsers surrounding whitespace. I guess I'm not as hopeful as I was earlier, at least based on the answers I received. Thanks to those who took the time to reply. Essentially, I had hoped that using to replace a space would allow for the creation of a 'magic' difference, the same way that the < and < are treated differently. Ideally all spaces could become and we could use the (invalid!) xml:space="none", leaving only the behind. 
It appears that most of the parsers will have a tough enough time consistently declaring ignorable whitespace in element content - track where in PCDATA a became a ' ' is just not on the radar. That doesn't mean I'm abandoning the idea - the message authentication we're doing is important enough to the application that I'm prepared to sacrifice the use of all the parsers to get the above behaviour. It doesn't hurt that we are likely going to have standalone applications processing the XML stream - it's not really a file-based system. I think parsers can still correctly read such files. But it points to a more general problem. If I read such a file with a parser, how can I write it out again exactly (and I mean *exactly*) the way it was read? If the parser doesn't indicate clearly where substitutions with entities were done, then I can't put them back in the file. The same problem occurs with empty elements. Although the XML spec wants to imply that and are the same, some might see them as the difference between a zero-length content and null content. Either way, if the original XML contains , then that is what should go back out. If it later contains then the both references should remain different from each other and unchanged. To wrap up the options, I'll run through the same paragraph using three different techniques. 1....Basic.

Finally, the other idea is the one at the bottom - use elements for spaces, tabs, and lineends. There is a single attribute n to indicate repeat counts.

2....Using character entities - still my favourite, since they work in attributes as well. Out of all of them, this, to my eyes, looks like it could easily have been placed in the XML 1.0 spec without breaking anything else that is in the spec, simply by adding the xml:space="none". &spc; could be &#32; and &lf; is &#10; so no new entities would have to be added.

Finally,&spc;the&spc;other&spc;idea&spc;is&spc;the &spc;one&spc;at&spc;the&spc;bottom&spc;-&spc;use&spc;elements&spc;for&lf; spaces,&spc;tabs,&spc;and&spc;lineends.&spc;&spc;There&spc;is&spc;a&spc; single&spc;attribute&spc;n&spc;to&spc;indicate&lf;repeat&spc;counts.
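
A small sketch of the encoding used in example 2, assuming the simplest possible mapping: every space becomes &spc; and every line feed becomes &lf;. Only the two entity names come from the message; the class and method names are hypothetical, and tabs are left alone because no entity is named for them.

```java
// Hypothetical helper illustrating option 2: every space becomes &spc; and
// every line feed becomes &lf;. Only the two entity names come from the message.
class WhitespaceEntityDemo {

    static String encode(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == ' ') {
                sb.append("&spc;");
            } else if (c == '\n') {
                sb.append("&lf;");
            } else {
                sb.append(c);               // tabs and everything else pass through
            }
        }
        return sb.toString();
    }

    // Inverse mapping: literal whitespace in the encoded form is treated as
    // formatting only and dropped, then the entities are expanded.
    static String decode(String encoded) {
        String stripped = encoded.replaceAll("\\s", "");
        return stripped.replace("&spc;", " ").replace("&lf;", "\n");
    }

    public static void main(String[] args) {
        String original = "use elements for\nspaces,  tabs";
        String wire = encode(original);
        System.out.println(wire);
        System.out.println(decode(wire).equals(original));
    }
}
```

Since all wanted whitespace travels as entity references, the encoded text can be re-wrapped freely without changing what decode returns.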

3.....With only elements.

Finally,theotherideaisthe oneatthebottom-useelementsfor spaces,tabs,andlineends.Thereisasingle attributentoindicaterepeatcounts.

Clearly, you must have the DTD to make sense of the last one! However, I see a rather interesting side-effect, namely that this one could likely be added using a namespace. (Tangent: any parsers experimenting with namespaces?)

In summary, the distinction is, as a reply noted, between "wanted" whitespace and "unwanted" whitespace. The XML specification wants to leave it to the application because there are far more 'whitespace convention sets' than it is desirable to put in the spec. However, there are far more applications than there are 'whitespace convention sets', and the application designer wants to pick one, not reinvent the wheel.

This seems to be the missing middle ground. How can we reusably specify the relatively few whitespace options we need, which are more than the XML spec provides, but far fewer in number than the number of applications that we hope to see using XML?

---------------------------------------------------------------------------
Chris Smith

From peter at ursus.demon.co.uk Mon Jan 5 07:48:50 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: Status Report
In-Reply-To: <199801050318.WAA01031@unready.microstar.com>
Message-ID: <3.0.1.16.19980105084810.3a27f52c@pop3.demon.co.uk>

Thanks *very* much indeed, David

Personally I am very excited about the way it looks like coming together.

At 22:18 04/01/98 -0500, David Megginson wrote:
>Here is a preliminary status report on SAX, summarising both public
>and private correspondence.
This is, in fact, _very_ preliminary, >since some potential participants read e-mail only at work, and will >not even know about this thread until tomorrow (Monday) morning. To re-emphasise this. Some countries have extended holidays for GregorianYear++ and this may be their first indication that detailed feedback on SAX is requested. If you agree with the progress so far, but do not need to add anything new, let DavidM know privately. Answering the questions either publicly or privately will continue to be extremely valuable, because on some issues there may be genuine differences of opinion which cannot easily be reconciled. For example, we may need to agree that SAX is not suitable for certain processes (e.g. maintaining the master document during authoring/editing). > P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthewg at poet.de Mon Jan 5 10:35:12 1998 From: matthewg at poet.de (Matthew Gertner) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Whitespace Handling (question 5 of 10) Message-ID: <01bd19c5$54df8b70$a00b0ac0@pharcyde.poetsoftware.xo.com> >At 01:02 PM 03/01/98 -0500, David Megginson wrote: >>Should SAX allow DTD-driven parsers to distinguish ignorable >>whitespace from other character data? > >If you want to do this, the only reasonable way is with another >argument on the charData() callback, so that it's always chardata, >but some processors will in some circumstances signal that it's >also ignorable. 
> >Since I think it would be highly unwise for any SAX-using >application to have behavior dependent on the ignorability of >some white space, I would argue strongly just for leaving >this out. -Tim I am pretty leery of arguments along the line of "if we allow this, people will abuse it". There are certainly cases where this information is essential, so why lock out certain classes of applications for what essentially amounts to a single boolean parameter, which could be defaulted? For example, consider an application that takes an HTML document augmented with XML tags which are to be converted to text or HTML by some mechanism for viewing in a HTML browser. If the document reads something like: ... First line. Second line. ... I am sure there are plenty of similar examples when one DTD is being used to generate another, viewable one. This is a perfect SAX application since it doesn't require any funky comments, entity resolution, etc., but if there is no indication of which whitespace is ignorable, it is impossible to implement since you get spurious carriage returns and spaces in the generated output. BTW: IMHO, IFF there is going to be a "default implementation" anyway, I would actually prefer an "ignorableWhitespace" method which calls charData by default. This will permit cleaner implementations. Is text containing *only* whitespace inside an "ambiguous" area of a mixed content model considered to be ignorable? Regards, Matthew ------------------------------------------------ Matthew Gertner Project Manager/Architect, Internet/Document Management POET Software GmbH Tel: +49 (40) 609 90254 Fax: +49 (40) 609 90115 E-mail: matthewg@poet.de ------------------------------------------------ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthewg at poet.de Mon Jan 5 11:05:20 1998 From: matthewg at poet.de (Matthew Gertner) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Error Reporting (question 4 of 10) Message-ID: <01bd19c9$8982bf10$a00b0ac0@pharcyde.poetsoftware.xo.com> >Tim Bray writes: > > > On this one, I agree with David and disagree with James. I don't > > see the advantages to using an exception. I think that a SAX processor > > should use fatal() (why the longer fatalError()?) - this has the > > advantage that you can, after the first message, go on looking for > > more fatal errors. Of course, a SAX processor must not, after the first > > fatal() callback, emit any more element() or charData() callbacks. > >The only problem here is that the element context could be useful for >error reporting (i.e. "Error in in
in >"). When XML documents are machine generated, this type >of an error message might be more useful than a line number. Obviously the consumer could be made responsible for maintaining an element and/or entity stack (cf. Jame's early message on this topic) without putting additional onus on the parser to hold this information. On the other hand, it might be worth considering using an XMLLocation class or suchlike to hold the line number and offset information for the calls which require this information, rather than passing them as separate parameters. This would enable "consenting" producer/consumer pairs to use a derived class (e.g. XMLFatalErrorLocation) containing additional information provided by the parser. It would also enable future extensions without changing the interface. Matthew ------------------------------------------------ Matthew Gertner Project Manager/Architect, Internet/Document Management POET Software GmbH Tel: +49 (40) 609 90254 Fax: +49 (40) 609 90115 E-mail: matthewg@poet.de ------------------------------------------------ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 5 11:06:08 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: External Entity Start and End (question 2 of 10) References: <3.0.32.19980104100721.009b8c60@pop.intergate.bc.ca> <199801041828.NAA00634@unready.microstar.com> Message-ID: <34AFEB2C.9707872B@technologist.com> David Megginson wrote: > > The only way that I can resolve this correctly is if I know the URI of > the current external entity. 
In Lark, you provide this information > with a separate Entity argument to each callback; this is a legitimate > (and more powerful) option, but from the perspective of SAX, it ends > up complicating the entire API instead of just adding two > easily-ignored callbacks. It is possible to have "invisible arguments" to a callback that do not complicate the interface. This sounds very mysterious, but an "invisible argument" is just a variable whose value can be fetched through a parser method. getCurrentEntity() or getFileEncoding() would be examples. I think that this is the right way to handle this entity problem. Require the parser to track the information rather than the application. Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 5 11:06:20 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Document Start and End (question 1 of 10) References: <3.0.32.19980104114340.009b3ae0@pop.intergate.bc.ca> Message-ID: <34AFEC5D.AFAC465B@technologist.com> Tim Bray wrote: > > > > > > > >I have to queue the PI event(s) until I have found the DOCTYPE > > Why? The PI is 'before' the doc in any meaningful sense... so they > get some PI events before they get the startDocument event. Is this > a problem? -Tim The XML document starts with the production labelled "document." 
This includes the PIs before the doctypedecl.

Paul Prescod -- http://itrc.uwaterloo.ca/~papresco

From papresco at technologist.com Mon Jan 5 11:07:39 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: Attributes and Entity Resolution
References: <199801042245.RAA00338@unready.microstar.com>
Message-ID: <34B0ACF8.95B14747@technologist.com>

David Megginson wrote:
> It is certainly tempting to introduce a new
> interface for attribute resolution, and that in itself would not bloat
> SAX too much, but if we did that, why not add other interfaces?

I think that we can define a general "map" interface that will translate easily into any language and is more general than attributes. As SAX supersets are created it can return notations by name, entities by name, element types by name and so forth.

My concerns with the attributeEvent interface are:

* an implication that attributes are ordered
* performance (I often ignore more attributes than I use -- I don't want one interface-lookup-based method call per attribute)
* either
  * the non-intuitive convention of having attributes FIRST or
  * the inconvenience of having them come later

When I process a start-tag, I want all of the attributes to be queriable and I would much rather not be forced to gather them up myself.
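
A minimal sketch of the "map" idea, to show how little the application has to do when attributes arrive with the start-tag as a lookup object. Everything named here (AttributeMap, LinkCollector, start_element) is hypothetical and not part of any SAX draft; it only illustrates querying by name instead of receiving one event per attribute.

```python
class AttributeMap:
    """Hypothetical read-only name -> value lookup handed over with a start-tag."""

    def __init__(self, pairs):
        # A dict is used because the proposal implies no attribute ordering.
        self._values = dict(pairs)

    def get(self, name, default=None):
        """Return the named attribute's value, or default if it is absent."""
        return self._values.get(name, default)

    def names(self):
        """Return attribute names, sorted only to keep output stable."""
        return sorted(self._values)


class LinkCollector:
    """Example application that cares about a single attribute."""

    def __init__(self):
        self.hrefs = []

    def start_element(self, name, attributes):
        # One lookup for the attribute of interest; the rest are never touched.
        href = attributes.get("href")
        if href is not None:
            self.hrefs.append(href)


app = LinkCollector()
app.start_element("a", AttributeMap([("href", "http://example.org/"), ("class", "x")]))
app.start_element("p", AttributeMap([]))
```

The same lookup shape could in principle serve for notations or entities by name, which is the generality argued for above; whether that is worth an extra interface is exactly the open question.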
Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 5 11:07:56 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Entity Resolution References: <199801042245.RAA00338@unready.microstar.com> Message-ID: <34B0AC87.B062C6E2@technologist.com> David Megginson wrote: > ENTITY RESOLUTION > ----------------- > > While I agree that a full entity manager would be more powerful than a > simple callback, I am not certain that the power will really be needed > by most SAX users; furthermore, if it is needed, that functionality > can be supplied more generally by an HTTP or FTP proxy server. For > now, then, I recommend that we stick with the resolveEntity callback, > which is simple to use and to learn, but provides 80% of the required > functionality (that's 80% in the abstract 80/20 sense). I agree with James on this one. I think that the entity manager interface is actually simpler in several senses: #1. It more perfectly allows you to ignore entities if you don't care about them. Think of the difference between 1.0 AWT and 1.1 AWT event handling. In the former you implement certain callbacks to get certain behaviour. In the latter, you register callbacks. 
In the old-style interface, it was only possible to make a simple applet that ignored most events by using the magic of inheritance (which we should not depend on too heavily). In the new-style interface you can ignore a particular object's events by merely not registering a handler for them. I contend that the latter is simpler in the case where you don't care about the events. #2. It more perfectly aligns with the language and intent of the SGML spec. where an entity handler is a distinct and important code module. XML-Lang does not specify the concept of an entity handler, but those of us from the SGML world know it to be a useful organizing concept and I think we can help the XML new-comers by promoting it. #3. The role of the XML App is not to provide information, but to consume it. I think that mixing up these responsibilities is confusing and complicates the construction of XML Apps. -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 5 11:07:38 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Comments (question 7 of 10) References: <199801031821.NAA00477@unready.microstar.com> <34B02D5D.10E91A98@n-space.com.au> <199801050122.UAA00342@unready.microstar.com> <34B041AA.8D805120@n-space.com.au> Message-ID: <34B0A549.F3B1E0A4@technologist.com> Antony Blakey wrote: > I agree, but your example implies that my comments were about the data, > rather than about the structure itself - I guess I should have pointed > out that I'm interested in comments in the DTD, so that the DTD can be > documented automatically. The way this is often done is to write a DTD for DTDs, write DTDs in XML instance syntax and translate that syntax into XML DTD syntax for machine processing. The strength of XML instance syntax is that it is infinitely extensible (like any other XML DTD). BTW, The weakness of ALWAYS using the XML instance syntax is that it is infinitely extensible...like HTML. Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 5 11:08:07 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Status Report References: <199801050318.WAA01031@unready.microstar.com> Message-ID: <34B0B2D2.2E6EBC7E@technologist.com> David Megginson wrote: > All of these would have default implementations in XmlAppBase -- the > seven core document-structure callbacks would all have empty > implementations. The default implementations are an important part of the SAX API. I would appreciate if you could specify them in future status reports. If the XmlAppBase is complicated, then we should require SAX implementors to provide it. Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 5 11:08:19 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Error Reporting (question 4 of 10) References: <199801031750.MAA00346@unready.microstar.com> <34AED033.651B87E8@jclark.com> <199801040106.UAA00530@unready.microstar.com> <34AF0F90.83A7F720@jclark.com> <34AFA42B.B1DC26BE@technologist.com> <34B03C9F.767593B@jclark.com> Message-ID: <34B0B175.A39FE17@technologist.com> James Clark wrote: > > > It seems very easy to map from a notWellFormed event to a notWellFormed > > exception and essentially impossible to map from the exception back into > > an event (with context etc.). > > Not at all. The exception can carry all the context that an event does > (like URL, line number and so on). I meant execution content -- the continuation. > You can't generate more than one fatal error event with this approach, > but that seems well out of the scope of a simple interface like SAX. I guess this is where we disagree. I don't see anything complicated about the event approach and see no reason to restrict people to catching a single well-formedness error. After all, we agree that there should be a callback for non-"fatal" errors, so adding another for "fatal" errors seems simpler to me than inventing an exception. > Apart from the fact that it is the right thing to do (show me a Java API > that uses a callback for a fatal error), The XML spec's definition of "fatal error" is quite different from common usage. In particular there is a lot of language about continuing after the fatal error. "The reports of the error's fatality are much exaggerated." 
What you are proposing strikes me as analogous to disallowing AWT window event handling after the window's "close" event. Sometimes there are good reasons for continuing...

> there are two other reasons why
> throwing an exception is the right approach:
>
> - Representing information about the error as an object allows much
> better extensibility: implementations can extend
> XmlNotWellFormedException to provide richer error reporting, and this
> can be very conveniently exploited by applications.

There are a dozen other places in SAX where this argument could be used. Most specifically: why should we have such precise, rich information about fatal errors and not about other errors? Why not have an extensible Error interface that can be used in both callbacks? Before I am accused of starting down the slippery slope, I want to point out that errors are a special case in that the XML spec. does not specify a fixed, finite set of interesting information about them.

> - A parser will read from a URL, thus it is already the case that it
> will generate IOExceptions, and thus applications have to be prepared to
> deal with this already. (I hope nobody is suggesting that the parser
> should try to catch IOExceptions.) By deriving
> XmlNotWellFormedException from IOException, an application doesn't have
> to write any additional code to deal with fatal errors.

True, but they must already handle non-fatal errors, so the extra effort in doing the same for fatal errors seems pretty small.

Paul Prescod -- http://itrc.uwaterloo.ca/~papresco

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 5 11:08:25 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Parser Interface (question 9 of 10) References: <199801040132.UAA00644@unready.microstar.com> Message-ID: <34AEF815.75ADCB12@technologist.com> David Megginson wrote: > - it will be necessary to write separate frontends for some parsers to > implement this functionality; I'm not sure I understand this. Doesn't SAX pretty much imply separate front-ends? > - extending java.lang.Runnable is Java-specific, and the concept may > not translate well to other languages. I don't see this as a big problem. The IDL spec. will just leave this constraint out. On the other hand, are you really ready to force every SAX implementor to think through the concurrency issues in Java? Would your Java interface description define synchronization etc.? > Note that the zero-argument constructor is required for this sort of > approach. It's too bad that Java has this arbitrary limitation. This setFoo, setBar, run() interface is really ugly. newInstances should take an array of objects as parameters to the constructor (not that SAX can change that!!). Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 5 12:03:57 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:47 2004 Subject: SAX: Error Reporting (question 4 of 10) References: <199801031750.MAA00346@unready.microstar.com> <34AEED7F.F082526A@technologist.com> <199801041451.JAA00298@unready.microstar.com> Message-ID: <34B0C45D.B28BA2A@technologist.com> David Megginson wrote: > > Paul Prescod writes: > > > I would prefer a getCurrentLocation() method that can be called > > anywhere. > > I had considered adding something like that to XmlParser, but am not > certain if the complexity is worth it -- if we allow it, then we might > want to pass the current parser as an argument to each callback. I would say: let the application cache it. Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From papresco at technologist.com Mon Jan 5 12:04:29 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:47 2004
Subject: SAX: NDATA Entities
References: <3.0.1.16.19980104113953.2e0720aa@pop3.demon.co.uk> <199801031802.NAA00401@unready.microstar.com> <34AEEFC2.F5E6D094@technologist.com> <199801041511.KAA00391@unready.microstar.com> <3.0.1.16.19980104164504.1b172c36@pop3.demon.co.uk> <199801041750.MAA00305@unready.microstar.com>
Message-ID: <34B0C3C7.66F709B0@technologist.com>

David Megginson wrote:
> Your question raises another problem, however. In the case of a
> NOTATION attribute, SAX will report the notation name as the attribute
> value, but will not indicate that it is, in fact, a notation; in the
> case of an ENTITY attribute, no information will be available through
> SAX directly beyond the entity name. In other words, as proposed so
> far, SAX has no provision for useful processing of NDATA entities,
> since there is no way to determine the system ID of an entity, the
> notation associated with an NDATA entity, or the system ID of a
> notation.
>
> Ælfred provides and will continue to provide this information outside
> of the SAX interface -- is there a strong case for making it part of
> SAX (in the XmlParser interface) or do we expect simple applications
> to stick with URIs for external addressing and non-XML objects?

I suspect that SAX will strongly influence convention in this area. I do not believe that SAX should undermine the intention of the XML spec. by making this information unavailable. If XML's creators hadn't wanted NDATA entities to be useful, they wouldn't have created them.
I don't think that this is the right time or place to rethink that decision. To put it another way: SAX should simplify access to XML data. It should not simplify XML by doing away with information necessary to take advantage of basic XML features. When I am designing my XML information system I don't want to have to change my documents to work around arbitrary limitations in my parser. That's what XML was supposed to prevent. Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From brad at enigma.co.il Mon Jan 5 12:46:57 1998 From: brad at enigma.co.il (Brad Young) Date: Mon Jun 7 16:59:47 2004 Subject: Update: JavaScript XML Parser Message-ID: <01bd19d8$a42a2e20$a11acbc7@brad.enigma.co.il> Kudos Jeremie. I'm working on a way to fool it into taking info directly from a URL. It's a kludge, but it should work. I'll post it as soon as it is stable. Brad _____________ Brad Young Enigma brad@enigmainc.com http://www.enigmainc.com -----Original Message----- From: Jeremie Miller To: xml-dev@ic.ac.uk Date: Saturday, January 03, 1998 5:20 AM Subject: Update: JavaScript XML Parser >I just finished updating the parser to fix a few bugs and add some more >features. It now parses CDATA, PI's, and Comments. > >As suggested, I'll be reading the DOM spec and add some compatibility to the >API. 
I also looked at the XSL proposal and it looks like I can easily do
>some of the simpler parts of it. The only drawback to using JavaScript has
>been the inability to directly access URL's, all of the input has to be
>pasted in via a textarea.
>
>Thanks for the feedback so far, hopefully this can be put to some good use!
>
>http://www.jeremie.com/xparse/
>
>Also, I have LOTS of questions about the XML spec, but I am going to bite my
>tongue until I get a better grasp on it...
>
>Jeremie Miller
>jer@jeremie.com
>http://www.jeremie.com/

From h.rzepa at ic.ac.uk Mon Jan 5 13:15:31 1998
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 16:59:48 2004
Subject: Listadmin: Errors and other happenings
Message-ID:

Dear all,

Just a note to let you all know I am doing some "spring cleaning" of the list. Other than undesirable postings (more of which later), the holiday period has brought with it a larger than normal deluge of errors resulting from the listserver being unable to deliver messages. Over this weekend, I received some 1300 such errors. These fall into several types:

a) People subscribing as

   subscribe xml-dev email@host

   where email@host is erroneous and cannot be delivered to. I try to spot these (about 2-3 a week occur) and to unsubscribe them. So if you have a colleague who thinks they have subscribed but are not getting postings, this is probably what happened. I do not have the time to reconcile these errors with the original subscription attempt, or to inform the sender at their other address of the error.

b) Mail boxes that do not work because of internal errors, or most often because the local hard disk is full, i.e.

   procmail: Quota exceeded while writing "/var/spool/mail/cs"
   550 ... Can't create output: Error 0

   These tend to go away with time.

c) Messages that cannot be delivered because some vital relay is down, i.e.

   The address to which the message has not yet been delivered is: sardet@ensma.fr
   No action is required on your part. Delivery attempts will continue for some time, and this message will be repeated at intervals if the message remains undelivered

   These too go away in time.

d) Messages that cannot be delivered because some local firewall does not have permissions set to do so, i.e.

   chris@moonvine.org: SMTP error from remote mailer after RCPT TO: : host mail.eden.com [199.171.21.14]: 550 ... we do not relay

Those that look permanent, i.e. a) and d), I tend to unsubscribe without warning to the originators.

Finally, the Hypermail archive should be back in action now, so keyword searches etc. can resume.

May you all have a highly productive 1998!!

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From M.H.Kay at eng.icl.co.uk Mon Jan 5 14:17:27 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 16:59:48 2004
Subject: SAX: Whitespace Handling (question 5 of 10)
Message-ID: <01bd19e4$91a3a220$1e09e391@mhklaptop.bra01.icl.co.uk>

>BTW: IMHO, IFF there is going to be a "default implementation" anyway, I
>would actually prefer an "ignorableWhitespace" method which calls charData
>by default. This will permit cleaner implementations.

I may be simple-minded, but surely the default action with ignorable white space should be to ignore it?

Mike Kay, ICL

From mike at datachannel.com Mon Jan 5 17:29:28 1998
From: mike at datachannel.com (Mike Dierken)
Date: Mon Jun 7 16:59:48 2004
Subject: Update: JavaScript XML Parser
Message-ID: <01BD19BB.EE8EB2F0@NEMO>

The simplest way I know to deal with URL streams in JavaScript is to create a simple Java applet to do a few things, insert it on a page and call its methods from JavaScript. You still run into security issues that prevent you from accessing domains outside the source of the page, but it's a start.
Mike D DataChannel -----Original Message----- From: Brad Young [SMTP:brad@enigma.co.il] Sent: Monday, January 05, 1998 4:51 AM To: Jeremie Miller; xml-dev@ic.ac.uk Subject: Re: Update: JavaScript XML Parser Kudos Jeremie. I'm working on a way to fool it into taking info directly from a URL. It's a kludge, but it should work. I'll post it as soon as it is stable. Brad _____________ Brad Young Enigma brad@enigmainc.com http://www.enigmainc.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From davido at pragmaticainc.com Mon Jan 5 17:52:03 1998 From: davido at pragmaticainc.com (David Ornstein) Date: Mon Jun 7 16:59:48 2004 Subject: SAX: anti-goals In-Reply-To: <3.0.1.16.19980105084810.3a27f52c@pop3.demon.co.uk> References: <199801050318.WAA01031@unready.microstar.com> Message-ID: <17504048415690@pragmaticainc.com> Hello all, In the ongoing design discussion we've been having, it seems that there's a lack of clarity about what SAX will be used for and what it won't. This is slowing down and creating confusion in some of the discussions. While attempting to enumerate goals/requirements is a good idea at the start of a process, it can sometimes be quite difficult because people get hung up on being complete. Instead, one approach I sometimes take is to define "anti-goals." An anti-goal is a declaration of what one is *not* trying to do/support/achieve in a design. They are not things you're trying to avoid, in fact some of the best anti-goals are ones that describe things that would be very nice to have but have agreed to place out of scope. The incremental accumulation of anti-goals during a design process can help keep things on track. 
And the list can later be used when retrospectively constructing a list of a project's goals.

My read is that we seem to have a strong leaning towards the following anti-goals:

* SAX is not being designed to support configurations where the parser is in one language and the client is in a different language

* SAX is not being designed to support identity transformations of an XML document('s physical structure)

* SAX is not being designed to be usable in building a DOM tree [Actually, my personal suspicion is that this one is likely to generate lots of hot debate.]

Do others agree/disagree with these as our anti-goals? If so, I'd propose that, to help keep us (and new folks) on track, they be included in any summaries of the SAX work we put out.

I don't necessarily like all of these goals, personally, as they reduce SAX's utility for me. I'm just suggesting that they do seem to be where we're heading.

David ================================ David Ornstein Pragmatica, Inc. http://www.pragmaticainc.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From davido at pragmaticainc.com Mon Jan 5 18:09:55 1998 From: davido at pragmaticainc.com (David Ornstein) Date: Mon Jun 7 16:59:48 2004 Subject: SAX: Status Report In-Reply-To: <34B0B2D2.2E6EBC7E@technologist.com> References: <199801050318.WAA01031@unready.microstar.com> Message-ID: <18085539815711@pragmaticainc.com> Paul Prescod wrote [02:15 AM 1/5/98 ]: >David Megginson wrote: >> All of these would have default implementations in XmlAppBase -- the >> seven core document-structure callbacks would all have empty >> implementations. > >The default implementations are an important part of the SAX API.
I >would appreciate if you could specify them in future status reports. If >the XmlAppBase is complicated, then we should require SAX implementors >to provide it. 100% YES. ================================ David Ornstein Pragmatica, Inc. http://www.pragmaticainc.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Jan 5 19:15:37 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:48 2004 Subject: Lark 1.0 final beta and Larval 0.8 Message-ID: <3.0.32.19980105110742.008624d0@pop.intergate.bc.ca> This isn't finished yet, but I am uncomfortable about the fact that for the last couple of months, there has not been a java-language XML syntax-checker that is really very close to the spec. So, at http://www.textuality.com/Lark/ I have placed the Lark 1.0 final beta, and release 0.8 of Larval, a validating XML processor based on Lark. --------------- Status Snapshot --------------- In my tests, Lark does all the things it used to do, and also rejects 163 of 164 of James' non-well-formed documents; the odd-doc-out is the notorious 088.xml, which I consider to be well-formed and represents a policy issue that the WG is going to have to make a call on. The only hole I know about in Lark at the moment is that it doesn't do text declarations in external parsed entities; but I won't have time to work on it until next week, so decided to ship anyhow. James' test-suite represents a tremendous resource: a de-facto reproducible test of conformance that will greatly increase the interoperability of XML docs. We are all considerably in his debt, not for the first time; thank you once again, James. 
Larval validates quite a few things, and boots out quite a few other things, but has not been tested to anywhere near the same level that Lark has.

These class files have been compiled with Microsoft VJ++1.1 and tested with Microsoft JView and with Sun's Java from JDK 1.1.3. At the moment, if I compile with the Sun fastjavac, then neither the Sun nor Microsoft java interpreters can use the resulting class files. Admittedly, Lark.java and Larval.java are a pretty severe strain on a compiler; on the other hand, I know about some pretty egregious violations of the Java language spec that will get by both of those compilers. I suspect that my current problem with fastjavac is as likely to be me breaking some rule about what can be in a static string (J++ is forgiving) as it is a compiler bug.

----------------
Source Available
----------------

There's a policy change in that the Java source code for every Lark class is now included in the distribution. If you actually look at Lark.java and Larval.java, you'll see that this is not quite as generous as it sounds.

------------
Still Undone
------------

Lark 1.0 has also not received a walk-through looking for dead code, software rot, and unconcealed evidence of stupidity, and has not been profiled. It is noticeably but not unbearably slower than 0.97, but it'll be faster before I'm done. I have established with previous releases that with a little work any given release of Lark can be made faster and smaller. This release has grown in size by 10K.

Lark's UTF8 processing is still pretty shaky - I think that the Java libraries are moving in the right direction fast enough to make it not cost-effective for me to wrangle with this much more at the moment. Since XmlInputStream is now available at source level, if someone were to want to plug in some robust UTF8 code that'd be lovely. Everything else is conformant I think without exception.
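As a rough illustration of leaning on the Java libraries for UTF8, the sketch below decodes a byte stream with the standard InputStreamReader. It is not Lark's XmlInputStream and makes no claim about how Lark works internally; the class name Utf8Sketch and the helper decodeUtf8 are hypothetical, and StandardCharsets is a later addition to the Java libraries than the JDK 1.1.3 mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lark: lets the standard library do the
// UTF-8 decoding so the parser only ever sees 16-bit Java chars.
public class Utf8Sketch {

    // Reads every character from a UTF-8 encoded byte stream into a String.
    static String decodeUtf8(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            out.append(buf, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // The bytes for "<a>" followed by U+00E9, which UTF-8 encodes as C3 A9.
        byte[] bytes = {0x3C, 0x61, 0x3E, (byte) 0xC3, (byte) 0xA9};
        String s = decodeUtf8(new ByteArrayInputStream(bytes));
        System.out.println(s.length()); // prints 4: five bytes, four characters
    }
}
```

The point of the design is only that the multi-byte sequence collapses to a single char before any XML tokenizing starts, so the tokenizer itself needs no encoding logic.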
---------- Validation ---------- Larval is just another version of Lark; but it has some more methods, most noticeably public void validate(boolean) which as a side-effect turns on processExternalEntities; there is a new validityError() callback in the Handler. Of course there are a bunch of new classes with names like DTD and Validator and Attlist and so on. Larval is done this way because if you just use Lark, you'll never have to include any validation class files. I can get away with this because even though Java doesn't have a preprocessor, Lark does. Presumably I will use the same trick to do SAX. The validation implementation is pretty naive. Rather than compiling tables, Larval builds a data structure more or less isomorphic to the declaration in the DTD, and then laboriously pokes around in it every time it sees a start/end tag. I think it proves that (a) a naive implementation of validation can be done, and (b) this isn't the right way to do it in the long-term. However, it's nowhere near as slow as I expected, and is good enough to be useful already in debugging XML documents. ------------- Other Changes ------------- The doPI method now has separate args for target and remainder. There is a doXmlDeclaration method. There is a new method to tell Lark what name it should use for the document Entity, e.g. in error reporting. There is an ESIS class that extends Handler; I don't claim this to be anything like a real SGML ESIS, but it's sure useful in automated testing. ------------ Future Plans ------------ Lark's version will remain 1.0 as long as XML does (a long time, I hope). Once it's no longer 'final beta' Lark.toString() will add a build date-stamp to the "1.0" version string. Larval will progress toward 1.0 as I get around to doing some really serious testing on it. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mje at shakha.com Tue Jan 6 00:43:17 1998 From: mje at shakha.com (Matthew J. Evans) Date: Mon Jun 7 16:59:48 2004 Subject: SAX and Unicode question Message-ID: <01BD1A00.85734C50.mje@shakha.com> (Please forgive me if this post is ill-received.) How is SAX going to handle Unicode, especially sending 16-bit chars (UTF-16) to callback functions? Sending void*'s and/or char*'s in the callbacks will leave the application and/or parser guessing what was sent. Sending byte order marks in every string seems rather impractical, especially since UTF-16 can have null bytes making most string objects useless anyway. (sorry, my Java is NULL. But from what I can tell, the String and String_buffer classes do not support 16- or 32-bit chars - correct me if I'm wrong) As a developer, it would be very nice not to have to re-code support into my applications. I would like to see some implementation of Unicode in SAX that is compatible with most systems and is extensible for when new standards come along. (Wide character and encoding support is lacking in most software languages). I do have a couple of ideas if you would like them (omitted for brevity). - Matthew <<<<<<< | >>>>>>> Matthew J. Evans Professional Hobbyist Santa Fe, New Mexico mailto:mje@shakha.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Tue Jan 6 00:49:53 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:48 2004 Subject: SAX and Unicode question Message-ID: <3.0.32.19980105164956.009d06a0@pop.intergate.bc.ca> At 05:36 PM 05/01/98 -0700, Matthew J. Evans wrote: >How is SAX going to handle Unicode, especially sending 16-bit chars >(UTF-16) to callback functions? Sending void*'s and/or char*'s in the >callbacks will leave the application and/or parser guessing what was sent. >Sending byte order marks in every string seems rather impractical, >especially since UTF-16 can have null bytes making most string objects >useless anyway. SAX is a Java interface. Thus the Strings and chars and so on are all 16-bit-only; the parser will have taken care of all the BOMs and encoding jiggery-pokery and so on. On the IDL end of things, not sure what the right way to do it is. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Tue Jan 6 01:13:27 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 16:59:48 2004 Subject: SAX: Document Start and End (question 1 of 10) Message-ID: <000401bd1a3f$bf562e80$2ee044c6@donpark> Whether the PI is before the doc or not, a typical SAX user would not know nor care about the answer.
I would very much like to see a pair of methods that are clearly known to be the first and the last callback methods, for initialization and cleanup purposes. Also, I would like to see an opaque handle to the document source passed to the startDocument method, i.e.

  public void startDocument (Object source); // source can be URL, File, InputStream, etc.

Many XML applications will be filter types and would not know what the document source is. One problem is that the opaque Object trick does not translate to other languages too well.

Don Park xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Tue Jan 6 01:34:51 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 16:59:48 2004 Subject: SAX: Comments (question 7 of 10) Message-ID: <001d01bd1a42$bce722f0$2ee044c6@donpark>

Yes. I would like to have comments as out-of-band data. Source control systems can store information there without changing XML data and without violating its validity. Anyway, XML hackers can go nuts with the stuff. It's no sweat to implement and its meaning is very clear.

  public void comment (String s);

Don xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Tue Jan 6 01:56:47 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 16:59:48 2004 Subject: SAX: Naming and Packaging (question 10 of 10) Message-ID: <003f01bd1a45$cd2a02b0$2ee044c6@donpark> Jon Bosak owns xml.org domain. I sent him a message asking if we could use it. I don't know his latest e-mail address so I sent it to bosak@netcom.com. Is this correct? Don Park xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From antony at n-space.com.au Tue Jan 6 02:43:41 1998 From: antony at n-space.com.au (Antony Blakey) Date: Mon Jun 7 16:59:48 2004 Subject: SAX: Comments (question 7 of 10) References: <199801031821.NAA00477@unready.microstar.com> <34B02D5D.10E91A98@n-space.com.au> <199801050122.UAA00342@unready.microstar.com> <34B041AA.8D805120@n-space.com.au> <34B0A549.F3B1E0A4@technologist.com> Message-ID: <34B19A2D.26CC89A7@n-space.com.au> Paul Prescod wrote: > The way this is often done is to write a DTD for DTDs, write DTDs in XML > instance syntax and translate that syntax into XML DTD syntax for > machine processing. The strength of XML instance syntax is that it is > infinitely extensible (like any other XML DTD). > > BTW, The weakness of ALWAYS using the XML instance syntax is that it is > infinitely extensible...like HTML. 
I've been down that path (ala XML-DATA) but we have people writing DTD's who are skilled at document analysis but not overly comfortable with destructuring the content model of an element and converting it to some form of XML instance. It increases the chance of error, and generates extra QA/fix cycles. Our analysis has shown that error minimization mechanisms are the most significant contributor to the sustainable profitability of software (or software-like) companies at our level. I actually would rather this approach myself, and use it whenever I access the DTD within msxml. The skill level wouldn't be an issue if I had time to write a good DTD authoring tool. Nothing I have seen comes close to cutting the mustard. +----------------------------------+ | Antony Blakey | | N-Space Pty Ltd | | Java - CORBA - SGML - XML | | mailto:antony@n-space.com.au | | http://www.n-space.com.au | +----------------------------------+ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthewg at poet.de Tue Jan 6 13:40:49 1998 From: matthewg at poet.de (Matthew Gertner) Date: Mon Jun 7 16:59:48 2004 Subject: anti-goals Message-ID: <01bd1aa8$6f403a30$a00b0ac0@pharcyde.poetsoftware.xo.com> I am frankly amazed at how productive this discussion has been up to this point, which is a good argument for maintaining simplicity in the interface. (I also find myself wondering whether it would *really* make such a big difference to add a callback for comments...) This initiative only started a couple of weeks ago, and it seems to me that the preliminary spec is essentially completed (keeping in mind that it is never going to please everyone). 
The focus on simplicity has been essential. Nevertheless, your comment on DOM building is spot on. I would love to see a general consensus that the SAX interface be used as the basis for a more extensive one (AAX?) providing all information necessary to build a DOM (well, actually a DO...). One of the lovely things about object orientation is that the SAX interface can be left as-is and still be reused for an interface providing more functionality. I would strongly argue this point because the central implication of SAX (i.e. that "competition" among parser writers, rendered feasible by interoperability, will lead to higher quality, more capable parsers) is something that we (as repository vendors) would like to benefit from too. For obvious reasons, SAX is useless to us right now. I am a strong believer in not putting the cart before the horse, so the focus on finishing SAX seems exactly right to me. Nevertheless, it would no doubt make it easier to finalize the SAX specification if there is some general consensus on this. My suggestion would be: SAX - for capturing logical structure AAX - for capturing logical and physical structure. Sufficient functionality to build a document object as specified by the DOM. Cheers, Matthew -----Original Message----- From: David Ornstein To: xml-dev Mailing List Date: Monday, January 05, 1998 7:13 PM Subject: SAX: anti-goals >Hello all, > > >In the ongoing design discussion we've been having, it seems that there's a >lack of clarity about what SAX will be used for and what it won't. This is >slowing down and creating confusion in some of the discussions. While >attempting to enumerate goals/requirements is a good idea at the start of a >process, it can sometimes be quite difficult because people get hung up on >being complete. Instead, one approach I sometimes take is to define >"anti-goals." An anti-goal is a declaration of what one is *not* trying to >do/support/achieve in a design. 
They are not things you're trying to >avoid, in fact some of the best anti-goals are ones that describe things >that would be very nice to have but have agreed to place out of scope. The >incremental accumulation of anti-goals during a design process can help >keep things on track. >And the list can later be used when retrospectively constructing a >list of a project's goals. > > >My read is that we seem to have a strong leaning towards the following >anti-goals: > >* SAX is not being designed to support configurations where the parser is >in one language and the client is in a different language > >* SAX is not being designed to support identity transformations of an XML >document('s physical structure) > >* SAX is not being designed to be usable in building a DOM tree [Actually, >my personal suspicion is that this one is likely to generate lots of hot >debate.] > >Do others agree/disagree with these as our anti-goals? If so, I'd propose >that, to help keep us (and new folks) on track, they be included in any >summaries of the SAX work we put out. > >I don't necessarily like all of these goals, personally, as they >reduce SAX's utility for me. I'm just suggesting that they do seem to >where we're heading. > >David > > >================================ >David Ornstein >Pragmatica, Inc. >http://www.pragmaticainc.com > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Tue Jan 6 14:43:39 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:48 2004 Subject: SAX: How many interfaces? Message-ID: <199801061443.JAA00700@unready.microstar.com>

For SAX, how much does the number of interfaces and classes matter? My original plan was to have something like this in the *.sax package (I'm still experimenting with different names):

  public interface *.sax.Parser
  public interface *.sax.Application
  public class *.sax.AppBase

Given the unsuitability of java.util.Dictionary for attributes, however, it is clear that we will need at least four instead of three:

  public interface *.sax.Parser
  public interface *.sax.Application
  public interface *.sax.ValueMap
  public class *.sax.AppBase

I have been reluctant to create too many classes and interfaces because for most SAX users, XML will just be a minor enabling technology used for transactions, configuration, or other structured data. The distinctions that we XML specialists want to make -- entity vs. element structure, etc. -- loom large for us, but may seem unnecessarily picky to non-specialists, for whom XML is not really the main point.

Nevertheless, it is essential that we not prematurely rule out any approaches, so last night I wrote a different trial implementation, using a couple more interfaces and classes:

  public interface *.sax.Parser
  public interface *.sax.EntityHandler
  public interface *.sax.DocumentHandler
  public interface *.sax.ErrorHandler
  public interface *.sax.AttributeMap
  public class *.sax.HandlerBase

This is certainly much more elegant.
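A minimal sketch of how separate handler interfaces plus a do-nothing base class might fit together is given here. Only the type names DocumentHandler, ErrorHandler and HandlerBase come from the list above; every method name (startElement, endElement, fatal), the NameCollector class and the fireEvents stand-in are assumptions made up for illustration, since the thread names the interfaces but not their callbacks.

```java
import java.util.ArrayList;
import java.util.List;

public class HandlerSplitSketch {

    // Hypothetical callbacks: the method names are assumptions, not the SAX draft.
    interface DocumentHandler {
        void startElement(String name);
        void endElement(String name);
    }

    interface ErrorHandler {
        void fatal(String message);
    }

    // Base class with empty implementations, so an application
    // overrides only the callbacks it actually cares about.
    static class HandlerBase implements DocumentHandler, ErrorHandler {
        public void startElement(String name) {}
        public void endElement(String name) {}
        public void fatal(String message) {}
    }

    // An application interested only in element starts.
    static class NameCollector extends HandlerBase {
        final List<String> names = new ArrayList<>();
        @Override
        public void startElement(String name) {
            names.add(name);
        }
    }

    // Stand-in for a parser: fires the events for <doc><para/></doc>.
    static void fireEvents(DocumentHandler h) {
        h.startElement("doc");
        h.startElement("para");
        h.endElement("para");
        h.endElement("doc");
    }

    public static void main(String[] args) {
        NameCollector c = new NameCollector();
        fireEvents(c);
        System.out.println(c.names); // prints [doc, para]
    }
}
```

The cost of the split is a few extra class files; the benefit is that an application ignoring errors or entities never has to mention them.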
With this approach, the parser interface appears as follows:

  public interface Parser {

    public void setEntityHandler (EntityHandler handler);
    public void setDocumentHandler (DocumentHandler handler);
    public void setErrorHandler (ErrorHandler handler);

    void parse (String publicID, String systemID)
      throws java.lang.Exception;

  }

We've lost the ability to extend java.lang.Runnable, but we've gained the ability to throw exceptions if we wish (Java only), together with the ability to set separate event handlers for each area.

As a professional SGML/XML implementor and system architect, I am very comfortable with this sort of approach, but I don't know how it will play with typical Java hackers:

- Will this arrangement be too hard to understand?

- Will it look like we're being XML purists and splitting too many theoretical hairs, for something that should be dead simple (what they consider to be just a single data format)?

- Will applet writers be willing to include this many extra *.class files just for XML support?

There remain strong pragmatic arguments for the everything-in-a-single-interface approach.

All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From deitiker at agave.com Tue Jan 6 14:56:18 1998 From: deitiker at agave.com (Glenn Deitiker) Date: Mon Jun 7 16:59:48 2004 Subject: No subject Message-ID: <199801061456.IAA15092@mezcal.agave.com> unsubscribe xml-dev xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jonathan at texcel.no Tue Jan 6 16:50:23 1998 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 16:59:48 2004 Subject: anti-goals In-Reply-To: <01bd1aa8$6f403a30$a00b0ac0@pharcyde.poetsoftware.xo.com> Message-ID: <3.0.3.32.19980106114926.02e87018@pop.mindspring.com> At 02:38 PM 1/6/98 +0100, Matthew Gertner wrote: >the central implication of SAX (i.e. that "competition" among parser >writers, rendered feasible by interoperability, will lead to higher quality, >more capable parsers) is something that we (as repository vendors) would >like to benefit from too. For obvious reasons, SAX is useless to us right >now. I agree with both of Matthew's points here: 1. You may well want to get something out with the functionality you are currently considering; 2. This level of functionality is useless for repository vendors like Texcel or POET, because we need to reconstruct the entire document. The same is probably true of editor vendors. I would be interested in knowing what kinds of applications you envision for SAX. To me, it seems like it might be useful for applications that extract data from XML files, e.g. for electronic commerce. What other kinds of applications do you have in mind? Jonathan jonathan@texcel.no Texcel Research http://www.texcel.no xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Tue Jan 6 19:12:49 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:48 2004 Subject: anti-goals In-Reply-To: <3.0.3.32.19980106114926.02e87018@pop.mindspring.com> References: <01bd1aa8$6f403a30$a00b0ac0@pharcyde.poetsoftware.xo.com> <3.0.3.32.19980106114926.02e87018@pop.mindspring.com> Message-ID: <199801061913.OAA01704@unready.microstar.com> Jonathan Robie writes: > I would be interested in knowing what kinds of applications you envision > for SAX. To me, it seems like it might be useful for applications that > extract data from XML files, e.g. for electronic commerce. What other kinds > of applications do you have in mind? Here are some additional examples: - most transformations - producing a rendered version of an XML document (electronic or paper) - context-sensitive searching Your example, extracting data, itself covers a wide range of applications: - database import/export - online transactions - configuration information - meta-data exchange - general client/server protocols Essentially, SAX should cover most general requirements (most transformations occur as part of a processing chain, and do not need saved comments, internal entity references, etc.). Those applications that do need access to lexical information -- mainly authoring tools and repositories -- will, of course, need to use a different or extended API. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mecom-gmbh at mixx.de Tue Jan 6 20:02:08 1998 From: mecom-gmbh at mixx.de (james anderson) Date: Mon Jun 7 16:59:48 2004 Subject: namespaces (was Re: XML and Using It With Whitespace References: Message-ID: <34B28DB7.6013BB90@mixx.de> greetings. our processor (the one in clos) has a fairly direct implementation of namespaces in terms of lisp packages. as it happens, they have a similar syntax for names. Chris Smith wrote: > ... > Clearly, you must have the DTD to make sense of the last one! However, > I see a rather interesting side-effect, namely that this one could > likely be added using a namespace. (Tangent: any parsers experimenting > with namespaces?) i had found two proposals at the time i was implementing namespaces, the implementation is closest to the one which i can attribute only to "microsoft" (my copy has no authorship noted) and which dates from 21.10.97. we use the pi-form from the microsoft proposal to denote the creation of the specified package and to load the specified dtd into that package. i found (find still) the " Message-ID: <34B2905C.F6AC5915@mixx.de> greetings, David Megginson wrote: > For SAX, how much does the number of interfaces and classes matter? > My original plan was to have something like this in the *.sax package i'm against limiting the number of interfaces a priori. put the handlers in an inheritance hierarchy, let the user bind the productions they wish to use to specific classes with the desired, more complex behaviour. the behaviour of those productions in which they have no interest remains bound to a generic class which does "nothing".
the application need only load those classes it actually uses, but it can make this decision at run-time... !

> public interface Parser {
>
> public void setEntityHandler (EntityHandler handler);
> public void setDocumentHandler (DocumentHandler handler);
> public void setErrorHandler (ErrorHandler handler);
>
> void parse (String publicID, String systemID)
> throws java.lang.Exception;
>
> }

it would be better to bind the handlers by name, rather than to have a static function space.

james, xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at arbortext.com Tue Jan 6 20:24:44 1998 From: paul at arbortext.com (Paul Grosso) Date: Mon Jun 7 16:59:48 2004 Subject: namespaces Message-ID: <98Jan6.152116est.18820@thicket.arbortext.com> At 15:02 1998 01 06 -0500, james anderson wrote: >greetings. >our processor (the one in clos) has a fairly direct implementation of namespaces >in terms of lisp packages. as it happens, they have a similar syntax for >names. > >i had found two proposals at the time i was implementing namespaces, the >implementation is closest to the one which i can attribute only to "microsoft" >(my copy has no authorship noted) and which dates from 21.10.97. The latest document from the XML WG on namespaces is but that is only readable by W3C members. xml-dev: A list for W3C XML Developers.
From peter at ursus.demon.co.uk Tue Jan 6 20:51:38 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:48 2004
Subject: Listadmin: Errors and other happenings
In-Reply-To:
Message-ID: <3.0.1.16.19980106211047.0f671aba@pop3.demon.co.uk>

At 13:13 05/01/98 +0100, Rzepa, Henry wrote:
[... helpful analysis of e-mail errors snipped ...]

This is sufficiently useful that it is almost worth linking to it from the archive. Is there any way of adding a link to this (or slightly revised page) from the main archive page?

>Finally, the Hypermail archive should be back in action now, so
>keyword searches etc can resume.

I was delighted to see the (apparently) seamless transition so that past messages were not lost to the archive. Many thanks. The hypermail *really* comes into its own for the current SAX discussion where the threads are - for the most part - neatly separated.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From peter at ursus.demon.co.uk Tue Jan 6 20:55:17 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:48 2004
Subject: SAX: Status Report
In-Reply-To: <199801051342.NAA02861@nathaniel.eps.inso.com>
References: <3.0.1.16.19980105084810.3a27f52c@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980106213836.10775b8a@pop3.demon.co.uk>

At 13:42 05/01/98 GMT, Gavin Nicol wrote:
>This is a fine report, but to date, only a few have actually commented.
>If someone has a bit more time, it would be excellent to have the
>list of questions put into an HTML form... I think many people have
>probably already lost the list of questions.

I agree with this, Gavin. I think that DavidM should field this one first - one simple way to manage it would be to create a "title" page with links to the actual xml-dev hypermailed postings. [HTML will not transclude these, but when we have XML, of course, it could!].

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From peter at ursus.demon.co.uk Tue Jan 6 20:57:06 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:48 2004
Subject: anti-goals
In-Reply-To: <01bd1aa8$6f403a30$a00b0ac0@pharcyde.poetsoftware.xo.com>
Message-ID: <3.0.1.16.19980106211839.1077275c@pop3.demon.co.uk>

At 14:38 06/01/98 +0100, Matthew Gertner wrote:
>I am frankly amazed at how productive this discussion has been up to this
>point, which is a good argument for maintaining simplicity in the interface.
>(I also find myself wondering whether it would *really* make such a big
>difference to add a callback for comments...) This initiative only started a
>couple of weeks ago, and it seems to me that the preliminary spec is
>essentially completed (keeping in mind that it is never going to please
>everyone). The focus on simplicity has been essential.

Thanks. This more or less sums up my own feelings. A line has to be drawn somewhere.

>
>SAX - for capturing logical structure
>AAX - for capturing logical and physical structure. Sufficient functionality
>to build a document object as specified by the DOM.

If SAX is seen as a part of a continuing process, this seems a possible division. I certainly expect that *as a result of using SAX* there may need to be some tweaks later, and that we shall find out how easy it is to build other things on top.

Personally (I think) I need more than the logical structure, but I'm certain that there are a lot of newcomers to XML who do not have a background in document analysis and management and who will only need the logical structure.

P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Tue Jan 6 20:59:43 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:48 2004
Subject: anti-goals
In-Reply-To: <199801061913.OAA01704@unready.microstar.com>
References: <3.0.3.32.19980106114926.02e87018@pop.mindspring.com> <01bd1aa8$6f403a30$a00b0ac0@pharcyde.poetsoftware.xo.com> <3.0.3.32.19980106114926.02e87018@pop.mindspring.com>
Message-ID: <3.0.1.16.19980106213136.0f67f062@pop3.demon.co.uk>

At 14:13 06/01/98 -0500, David Megginson wrote:
[...]
>
>Here are some additional examples:
>
>- most transformations
>- producing a rendered version of an XML document (electronic or
>  paper)
>- context-sensitive searching
>
>Your example, extracting data, itself covers a wide range of
>applications:
>
>- database import/export
>- online transactions
>- configuration information
>- meta-data exchange
>- general client/server protocols
>
>Essentially, SAX should cover most general requirements (most
>transformations occur as part of a processing chain, and do not need
>saved comments, internal entity references, etc.). Those applications
>that do need access to lexical information -- mainly authoring tools
>and repositories -- will, of course, need to use a different or
>extended API.

I find this list very useful. I'd add to this that SAX should cover the functionality of the XML-related languages XLL and XSL.
(Neither of these requires reconstruction of the original documents - unless you want to *edit* the *.xsl and preserve comments, etc.). If - as has been suggested in some camps - namespaces and 'schemas' in XML become common, then there is an even greater potential use. [Of course these may be represented *internally* as trees, but (if I'm right) these can be built from the SAX interface (as JUMBO does/will_do) without requiring the full DOM model.]

When XML fully takes off there will be a requirement for a lot of software to process the complex mixture of *.xml, XLL and *.xsl that will be received client-side, and I would have thought that SAX was exactly what most people want.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Tue Jan 6 21:02:07 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:48 2004
Subject: SAX: How many interfaces?
In-Reply-To: <199801061443.JAA00700@unready.microstar.com>
Message-ID: <3.0.1.16.19980106214741.3b9fbc54@pop3.demon.co.uk>

At 09:43 06/01/98 -0500, David Megginson wrote:
>For SAX, how much does the number of interfaces and classes matter?
>My original plan was to have something like this in the *.sax package
>(I'm still experimenting with different names):

[... suggested interfaces snipped...]

I can't comment usefully on this.
>
>As a professional SGML/XML implementor and system architect, I am very
>comfortable with this sort of approach, but I don't know how it will
>play with typical Java hackers:
>
>- Will this arrangement be too hard to understand?

Taking me as test-bed, 'no', IFF good, running examples are provided. If only the interface is provided I suspect 'yes'. I have managed with Lark, AElfred and NXP *because* they provided a Driver.java, EventDemo.java, etc. which minimally and precisely exercised all their interfaces. I then convert my application gently by tweaking each one in turn to see whether it works for me (answer="yes").

>
>- Will it look like we're being XML purists and splitting too many
>  theoretical hairs, for something that should be dead simple (what
>  they consider to be just a single data format)?

Possibly. But they also come across some abstraction in java.awt - e.g. you can't use Graphics.drawImage() without an ImageObserver argument. I've not had the time to understand this yet, but if I set it to null, nothing awful happens. Similarly the SwingSet is at least as complex as SAX will be.

>- Will applet writers be willing to include this many extra *.class
>  files just for XML support?

What's the issue here? The *total bytecount* of the *.class, or the number of *.class files or the number of import statements? None of these would bother me. But I haven't considered performance issues (my own are far worse :-)

>
>There remain strong pragmatic arguments for the
>everything-in-a-single-interface approach.

I'll leave this to you - but I wouldn't dissent from simplicity

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From ak117 at freenet.carleton.ca Tue Jan 6 21:36:56 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:48 2004
Subject: SAX: How many interfaces?
In-Reply-To: <3.0.1.16.19980106214741.3b9fbc54@pop3.demon.co.uk>
References: <199801061443.JAA00700@unready.microstar.com> <3.0.1.16.19980106214741.3b9fbc54@pop3.demon.co.uk>
Message-ID: <199801062137.QAA07781@unready.microstar.com>

Peter Murray-Rust writes:
 > >There remain strong pragmatic arguments for the
 > >everything-in-a-single-interface approach.
 >
 > I'll leave this to you - but I wouldn't dissent from simplicity

It seems strange to C++ and other traditional programmers that the number of class files matters so much in Java, but for applets in the current generation of web browsers, it's critical -- each *.class file requires a separate HTTP connection, and depending on server load, there may be several seconds latency for each connection.

I generally give up on a web page after about 10 seconds unless it's very important to me (few corporate pages get fully rendered before I hit "Back"), so as an applet writer, I want to introduce as little delay as possible. Java 1.1 defines the JAR format, which allows multiple classes in a single file, but the most widely deployed browsers (Navigator 3.x and MSIE 3.x) do not support it, and I don't know if Microsoft supports JARs even in newer versions, since they have decided not to upgrade to newer versions of Java. NS 3.x supports zip files, and MSIE 3.x supports CAB files, but neither works for both.
Ælfred will continue to support a single-file callback interface, so perhaps people who need to limit the number of class files will be forced to use Ælfred's native interface instead of SAX.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From howardk at paradigmdev.com Tue Jan 6 21:42:19 1998
From: howardk at paradigmdev.com (Howard Katz)
Date: Mon Jun 7 16:59:48 2004
Subject: SAX: How many interfaces?
Message-ID: <57B675B21506D1118BAB0060081C295D23A056@vserver.paradigm.com>

Microsoft does support Jar files. They have to in order to support beans (which they also do, strange and wonderful as it might seem).

Howard

> -----Original Message-----
> From: David Megginson [SMTP:ak117@freenet.carleton.ca]
> Sent: Tuesday, January 06, 1998 1:37 PM
> To: xml-dev Mailing List
> Subject: Re: SAX: How many interfaces?
>
> Peter Murray-Rust writes:
> > >There remain strong pragmatic arguments for the
> > >everything-in-a-single-interface approach.
> >
> > I'll leave this to you - but I wouldn't dissent from simplicity
>
> It seems strange to C++ and other traditional programmers that the
> number of class files matters so much in Java, but for applets in the
> current generation of web browsers, it's critical -- each *.class file
> requires a separate HTTP connection, and depending on server load,
> there may be several seconds latency for each connection.
>
> I generally give up on a web page after about 10 seconds unless it's
> very important to me (few corporate pages get fully rendered before I
> hit "Back"), so as an applet writer, I want to introduce as little
> delay as possible. Java 1.1 defines the JAR format, which allows
> multiple classes in a single file, but the most widely deployed
> browsers (Navigator 3.x and MSIE 3.x) do not support it, and I don't
> know if Microsoft supports JARs even in newer versions, since they
> have decided not to upgrade to newer versions of Java. NS 3.x
> supports zip files, and MSIE 3.x supports CAB files, but neither works
> for both.
>
> Ælfred will continue to support a single-file callback interface, so
> perhaps people who need to limit the number of class files will be
> forced to use Ælfred's native interface instead of SAX.
>
>
> All the best,
>
>
> David
>
> --
> David Megginson ak117@freenet.carleton.ca
> Microstar Software Ltd. dmeggins@microstar.com
> http://home.sprynet.com/sprynet/dmeggins/
From peter at ursus.demon.co.uk Wed Jan 7 00:31:02 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:48 2004
Subject: SAX: Whitespace Handling (question 5 of 10)
In-Reply-To: <01bd19e4$91a3a220$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <3.0.1.16.19980107010121.0f67b238@pop3.demon.co.uk>

At 14:16 05/01/98 -0000, Michael Kay wrote:
>>BTW: IMHO, IFF there is going to be a "default implementation" anyway, I
>>would actually prefer an "ignorableWhitespace" method which calls charData
>>by default. This will permit cleaner implementations.
>
>
>I may be simple-minded, but surely the default action with ignorable white
>space should be to ignore it?

Not simple-minded :-) The whitespace issue is not trivial, but is (I think) consistent. The *parser* has no option except to pass all characters that are not markup to the application. This means that in:

A parser MUST pass the equivalent of \n\s\s\n to the application. In a well-formed document there is NO indication of which character data are/are_not significant ("ignorable") so by default the application will have a tree structure where FOO has 3 children.

FOO
  "\n\s\s"
  BAR
  "\n"

If the application is told through stylesheets/PIs/hardcoded_semantics/telepathy/a_human that all whitespace is ignorable, fine - but it is NOT part of the XML spec.

If the DTD reads:

the "validating parser" (and we are still struggling with exactly what one of those is :-) MUST tell the application: "Hey! Be careful! I've sent you a FOO, but it has element-only content, so you may wish to ignore all the whitespace-only children of the FOO".
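The rule described in this message can be sketched in Java as follows. This is a minimal illustration and not the SAX interface under discussion: the class `WhitespaceEvents` and its method `eventFor` are hypothetical, and the set of element-only element types is assumed to have been worked out from the DTD by some other code. The event names `ignorableWhitespace` and `charData` are the ones used in the quoted text. The character data is reported in every case; only the event name changes.

```java
import java.util.Set;

/** Sketch: decide which event name to use for character data inside an element. */
class WhitespaceEvents {
    /** Element types declared with element-only content (assumed to come from the DTD). */
    private final Set<String> elementOnly;

    WhitespaceEvents(Set<String> elementOnly) {
        this.elementOnly = elementOnly;
    }

    /** True if the text consists only of XML whitespace (space, tab, CR, LF). */
    static boolean isWhitespaceOnly(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the event name for character data found directly inside parent.
     * Without DTD knowledge (empty set) everything is plain character data.
     */
    String eventFor(String parent, String text) {
        boolean ignorable = elementOnly.contains(parent) && isWhitespaceOnly(text);
        return ignorable ? "ignorableWhitespace" : "charData";
    }
}
```

Constructed with an empty set, the sketch behaves like a parser reading a merely well-formed document: the whitespace children of FOO come through as ordinary `charData`. Constructed with a set containing `"FOO"`, the same whitespace is still delivered but flagged, and the application remains free to keep or drop it.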
The application should say thank you, and then do whatever it feels like doing with this information.

HOW the parser tells the application is what we are tackling. DavidM has suggested that when the "ignorable whitespace" is emitted from the parser, it generates a special event. This seems reasonable - I suppose there could be other methods (even simply announcing which elements had element-only content should be sufficient).

[Please shoot this down if I've got it wrong :-)].

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Wed Jan 7 00:32:48 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:48 2004
Subject: Lark 1.0 final beta and Larval 0.8
In-Reply-To: <3.0.32.19980105110742.008624d0@pop.intergate.bc.ca>
Message-ID: <3.0.1.16.19980107010850.565f481a@pop3.demon.co.uk>

At 11:17 05/01/98 -0800, Tim Bray wrote:
>This isn't finished yet, but I am uncomfortable about the fact that
>for the last couple of months, there has not been a java-language XML
>syntax-checker that is really very close to the spec. So, at
>
> http://www.textuality.com/Lark/

Great. I have downloaded and hope to hack it tomorrow.

>
>I have placed the Lark 1.0 final beta, and release 0.8 of Larval,
>a validating XML processor based on Lark.
>These class files have been compiled with Microsoft VJ++1.1 and
>tested with Microsoft JView and with Sun's Java from JDK 1.1.3.
>At the moment, if I compile with the Sun fastjavac, then neither
>the Sun nor Microsoft java interpreters can use the resulting
>class files. Admittedly, Lark.java and Larval.java are a pretty
>severe strain on a compiler; on the other hand I know about some pretty
>egregious violations of the Java language spec that will get by
>both of those compilers. I suspect that my current problem with
>fastjavac is as likely to be me breaking some rule about what can
>be in a static string (J++ is forgiving) as it is a compiler bug.

I have had a problem with jvc compiling complex code (a matrix diagonalisation converted from Fortran) where it threw an internal compiler error. And the Lark problem where I have to compile with jvc rather than javac. I imagine that it's a good idea in general to run code through as many compilers as possible and I make sure that mine works with jvc and javac.

[...]

>The validation implementation is pretty naive. Rather than compiling
>tables, Larval builds a data structure more or less isomorphic to
>the declaration in the DTD, and then laboriously pokes around in it
>every time it sees a start/end tag. I think it proves that (a) a
>naive implementation of validation can be done, and (b) this isn't

Good!

>the right way to do it in the long-term. However, it's nowhere
>near as slow as I expected, and is good enough to be useful already
>in debugging XML documents.

This will also be useful in editing XML I hope.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From peter at ursus.demon.co.uk Wed Jan 7 00:36:03 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:48 2004
Subject: XML and Using It With Whitespace
In-Reply-To:
References: <3.0.1.16.19971211143739.37172be2@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980107012940.565f9dc4@pop3.demon.co.uk>

Since no one has responded publicly, Chris, here's my take on your concerns...

At 02:06 05/01/98 -0500, Chris Smith wrote:
>
>Sorry if the subject is confusing, but it's a really concise proposal
>for getting whitespace through - guaranteed.
>
>I earlier posted a query about the behaviour of various parsers
>surrounding whitespace. I guess I'm not as hopeful as I was earlier,
>at least based on the answers I received. Thanks to those who took the
>time to reply.
>
>Essentially, I had hoped that using to replace a space would
>allow for the creation of a 'magic' difference, the same way that
>the < and < are treated differently. Ideally all spaces could
>become and we could use the (invalid!) xml:space="none", leaving
>only the behind.

I am not sure this buys you anything. The is presumably occurring in content. If it occurs in mixed content or ANY it will be emitted by the parser as " " and would look the same as if you had put ordinary spaces in the document. If it occurs in element content (no character data allowed as children) then if the parser accepts it as whitespace it will be treated as if it was a " ". [I still have my concerns as to *where* the spec explicitly allows " " in element content...]

[...]

>
>I think parsers can still correctly read such files. But it points to
>a more general problem.
>If I read such a file with a parser, how can I
>write it out again exactly (and I mean *exactly*) the way it was read?
>If the parser doesn't indicate clearly where substitutions with
>entities were done, then I can't put them back in the file. The same
>problem occurs with empty elements. Although the XML spec wants to
>imply that and are the same, some might see them as
>the difference between a zero-length content and null content. Either
>way, if the original XML contains , then that is what
>should go back out. If it later contains then both
>references should remain different from each other and unchanged.

There has been discussion on this and my understanding is that the unequivocal policy is that and result in exactly the same events or grove and there is NO way of distinguishing which the original document contained. Some people regret this, but the decision is clear.

>
>To wrap up the options, I'll run through the same paragraph using
>three different techniques.
>
>2....Using character entities - still my favourite, since they work in
>attributes as well. Out of all of them, this, to my eyes, looks like
>it could easily have been placed in the XML 1.0 spec without breaking
>anything else that is in the spec, simply by adding the
>xml:space="none". &spc; could be and &lf; is so no new
>entities would have to be added.

xml:space="none" is NOT allowed in the XML spec.

>

>Finally,&spc;the&spc;other&spc;idea&spc;is&spc;the
>&spc;one&spc;at&spc;the&spc;bottom&spc;-&spc;use&spc;elements&spc;for&lf;
>spaces,&spc;tabs,&spc;and&spc;lineends.&spc;&spc;There&spc;is&spc;a&spc;
>single&spc;attribute&spc;n&spc;to&spc;indicate&lf;repeat&spc;counts.

Assuming that you have something like:

Then the paragraph above will result in the same parser output as if they had been spaces (except that it might report the internal entity events).

>
>3.....With only elements.
>

>Finally,theotherideaisthe
>oneatthebottom-useelementsfor
>spaces,tabs,andlineends.Thereisasingle
>attributentoindicaterepeatcounts.

If you really care about every character this is a reasonable way of doing it, but it will generate a large number of events or (in a tree) require a lot of nodes to be created. Both will impact performance. Part of the problem arises from the requirement (which I strongly support) that "XML documents should be human legible and reasonably clear". In some cases something has to be sacrificed and it looks like you are happy to let this one go...

>
>Clearly, you must have the DTD to make sense of the last one! However,
>I see a rather interesting side-effect, namely that this one could
>likely be added using a namespace. (Tangent: any parsers experimenting
>with namespaces?)

Parsers are NOT allowed to experiment with namespaces :-). Parsers must recognise ":" as a valid name character. That's all. Humans can experiment with namespaces. So can applications. PaulG has pointed out that the latest namespace proposal is confidential, so discussion of that is inappropriate. However, going on the information in the public domain (e.g. the RDF draft) JUMBO has implemented a namespace experiment. For what you are doing, I suspect stylesheets would be more valuable.

>
>In summary, the distinction is, as a reply noted, between "wanted"
>whitespace and "unwanted" whitespace. The XML specification wants to
>leave it to the application because there are far more 'whitespace
>convention sets' than it is desirable to put in the spec. However,
>there are far more applications than there are 'whitespace convention
>sets', and the application designer wants to pick one, not reinvent
>the wheel.

I fully agree with this, and if no one else makes proposals... But we need to concentrate on SAX at the moment.

>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
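The division of labour in the message above (the parser only accepts ":" as a name character; any namespace meaning is up to the application) can be illustrated with a short Java sketch. The class `PrefixedName` and the sample name `chem:atom` are hypothetical, and splitting at the first colon is just one convention an application might choose; it is not taken from any namespace proposal.

```java
/**
 * Application-level view of a name such as "chem:atom".
 * The parser reports the whole string as one element name; splitting it
 * is an application convention, assumed here purely for illustration.
 */
class PrefixedName {
    /** Part before the first colon, or the empty string if there is no colon. */
    final String prefix;
    /** Part after the first colon, or the whole name if there is no colon. */
    final String local;

    PrefixedName(String name) {
        int colon = name.indexOf(':');
        if (colon < 0) {
            prefix = "";
            local = name;
        } else {
            prefix = name.substring(0, colon);
            local = name.substring(colon + 1);
        }
    }
}
```

An application experimenting along these lines would construct a `PrefixedName` from each element name it receives and look the prefix up in whatever table it maintains; the parser needs no changes.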
From donpark at quake.net Wed Jan 7 01:27:56 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 16:59:49 2004
Subject: SAX: How many interfaces? -- A sidenote on CAB and ZIP
Message-ID: <007b01bd1b0a$e5a06940$2ee044c6@donpark>

David,

>have decided not to upgrade to newer versions of Java. NS 3.x
>supports zip files, and MSIE 3.x supports CAB files, but neither works
>for both.

APPLET tags which specify both a CAB as well as a ZIP will work on both browsers because Netscape ignores CAB and IE ignores ZIP. So the number of classes does not really matter except that the bytecode verifier will run faster and SAX programmers will have less things to 'not' worry about.

Don

From ak117 at freenet.carleton.ca Wed Jan 7 01:50:54 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:49 2004
Subject: notes on sax
In-Reply-To: <21354461917270@pragmaticainc.com>
References: <34B08F6A.764A5AD9@mixx.de> <3.0.1.16.19980106200758.44370e0a@pop3.demon.co.uk> <21354461917270@pragmaticainc.com>
Message-ID: <199801070150.UAA00303@unready.microstar.com>

David Ornstein writes:
 > >There is clear agreement for providing support beyond Java,
 > There is? I haven't heard that yet.
 > I've heard that some people
 > would like it (me, for example), but I haven't heard collectively
 > that we've decided one way or the other.

Yes, just before Christmas we agreed on the list that SAX would not contain anything that precludes non-Java OO implementations; in particular, nothing in SAX will require dynamic type-checking. Since then, we have also agreed that SAX will not rely on any Java-specific classes such as java.net.URL or java.util.Dictionary (though the Java implementation itself will, of course, use any classes that make sense).

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Wed Jan 7 02:01:30 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:49 2004
Subject: Reporting empty elements
In-Reply-To: <3.0.1.16.19980107012940.565f9dc4@pop3.demon.co.uk>
References: <3.0.1.16.19971211143739.37172be2@pop3.demon.co.uk> <3.0.1.16.19980107012940.565f9dc4@pop3.demon.co.uk>
Message-ID: <199801070201.VAA00324@unready.microstar.com>

Peter Murray-Rust writes:
 > There has been discussion on this and my understanding that the
 > unequivocal policy is that and result in exactly
 > the same events or grove and there is NO way of distinguishing
 > which the original document contained. Some people regret this, but
 > the decision is clear.

One problem is that the PR does not fully define the information set that an XML parser is required to return to an application (there are a few scattered rules, such as the ignorable-whitespace rule).
I'd suggest that the difference between and is lexical rather than structural; an interface like SAX, that operates mainly on logical structure, should not report the difference; an interface that preserves lexical features (such as comments, internal entity references, etc.) might provide access to the original form.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From tbray at textuality.com Wed Jan 7 02:55:25 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:59:49 2004
Subject: XML and Using It With Whitespace
Message-ID: <3.0.32.19980106185404.008e2ad0@pop.intergate.bc.ca>

At 02:06 AM 05/01/98 -0500, Chris Smith wrote:
>That doesn't mean I'm abandoning the idea - the message authentication
>we're doing is important enough to the application that I'm prepared
>to sacrifice the use of all the parsers to get the above behaviour.

Hmm, I'm failing to get some aspect of your problem. Maybe I bypassed a message in which you explained it. It is clear that *any* conformant parser must give you all the whitespace in the message. It may also send a side-note along telling you that it's not significant, but you always get it or else you're perfectly justified in complaining to your processor author. This has the nice effect that the amount of whitespace you get from the processor is guaranteed to be the same whether it's using a DTD or not.

Now, the downside of this is that you can't do and have that treated as identical to i.e. no auto-magic facility to ignore pretty-printing.
No XML processor in the world, regardless of the DTD in play, is allowed to refrain from passing you the line-breaks and spaces in the first example.

Probably I'm missing something... what is the missing piece from your point of view?

-T.

From Jon.Bosak at eng.Sun.COM Wed Jan 7 03:49:51 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:59:49 2004
Subject: SAX: Naming and Packaging (question 10 of 10)
In-Reply-To: <003f01bd1a45$cd2a02b0$2ee044c6@donpark>
Message-ID: <199801070348.TAA07093@boethius.eng.sun.com>

[Don Park:]

| Jon Bosak owns xml.org domain. I sent him a message asking if we
| could use it. I don't know his latest e-mail address so I sent it to
| bosak@netcom.com. Is this correct?

Yes, though that six-year-old address serves mostly as a sink for every junk mailer in the world.

I reserved xml.org for exactly the kind of public use being discussed here and have no other plans for it. The main idea was to prevent someone from doing dumb things with the name; I had a vague idea that it might be useful for standard DTDs or a root URN domain for XLinks or something like that. The domain does not correspond to any real server at the moment; there isn't even a default email account. But if it's felt to be useful as a reserved piece of the DNS name space, I would be happy to dedicate it to the use of the XML developer community.

Jon
From ricko at allette.com.au Wed Jan 7 06:04:29 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:59:49 2004
Subject: Modules (was Re: namespaces )
Message-ID: <199801070608.RAA22070@jawa.chilli.net.au>

> From: james anderson

> i had found two proposals at the time i was implementing namespaces, the
> implementation is closest to the one which i can attribute only to "microsoft"
> (my copy has no authorship noted) and which dates from 21.10.97.
> we use the pi-form from the microsoft proposal to denote the creation of the
> specified package and to load the specified dtd into that package.

You may also be interested in the "Module" proposals, which were suggested for inclusion into SGML. The idea started in Japan and has floated around getting firmer and simpler over the last year of XML discussions.

The idea is simply to use any parameter entity name as a module prefix. So you could have, for example (though presumably it would use external parameter entities, not internal like in this example)

">
%one;
">
%one;
%two;
]>

This is the element type declared in PE "one", as expected.
This is also the element type declared in PE "one".
This is the element type declared in PE "two".
This is the element type declared in "one".

These merely provide a directory system to resolve names, they don't attempt to provide any schema solution (that is a separate problem).

Because parameter entities can contain other PEs, the module prefixes can concatenate (e.g., two::one::x above).
(In the SGML proposals, there is actually a special keyword MODULE put as part of the parameter entity declaration, to flag that the PE is also a module, but this is not required.)

Rick Jelliffe

From matthewg at poet.de Wed Jan 7 10:01:18 1998
From: matthewg at poet.de (Matthew Gertner)
Date: Mon Jun 7 16:59:49 2004
Subject: anti-goals
Message-ID: <01bd1b52$e9894e80$a00b0ac0@pharcyde.poetsoftware.xo.com>

Peter,

Thanks for your reply. Having said that SAX is essentially useless for us right now, I would like to clarify my point of view. I am *very* sympathetic with the goals of SAX. I can see tons of application areas unrelated to our SGML work where POET, for example, could effectively use XML for metadata, configuration information, import/export, etc., as per the officially stated SAX goals. I am sure this will be the primary usage area for XML, in terms of number of users.

My point should have been that it will be easier to nip ongoing discussions about supported features in the bud if there is an "officially" stated intention to provide an advanced interface as well. I would like to sit down and write down some of our requirements for a "repository loader" interface (no doubt very similar to a DOM builder), but I see the danger of losing focus at this point. I remember the date January 12th floating around as the deadline for some concrete SAX implementations. Would this be an appropriate time to make some more detailed comments about an extended interface?
One other issue: if there is any kind of general agreement that a more advanced interface, derived from SAX, would be a worthwhile area for future work, then the use of a well-defined set of interfaces (David Megginson's proposal) is very much to be preferred. Extensibility is the primary goal of good design. Sure, there are some pragmatic issues with downloading class files, but a) this only has to be done once per parser and b) every Java application is going to have this problem and concrete solutions are already planned.

Cheers,
Matthew

-----Original Message-----
From: Peter Murray-Rust
To: Matthew Gertner; xml-dev Mailing List
Date: Tuesday, January 06, 1998 10:11 PM
Subject: Re: anti-goals

>Thanks. This more or less sums up my own feelings. A line has to be drawn
>somewhere.
>
>>
>>SAX - for capturing logical structure
>>AAX - for capturing logical and physical structure. Sufficient functionality
>>to build a document object as specified by the DOM.
>
>If SAX is seen as a part of a continuing process, this seems a possible
>division. I certainly expect that *as a result of using SAX* there may
>need to be some tweaks later, and that we shall find out how easy it is to
>build other things on top.
>
>Personally (I think) I need more than the logical structure, but I'm
>certain that there are a lot of newcomers to XML who do not have a
>background in document analysis and management and who will only need the
>logical structure.
From ak117 at freenet.carleton.ca Wed Jan 7 11:44:18 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:49 2004
Subject: XML and Using It With Whitespace
In-Reply-To: <3.0.32.19980106185404.008e2ad0@pop.intergate.bc.ca>
References: <3.0.32.19980106185404.008e2ad0@pop.intergate.bc.ca>
Message-ID: <199801071144.GAA00284@unready.microstar.com>

Tim Bray writes:

> Now, the downside of this is that you can't do
>
>
>
>
>
>
> and have that treated as identical to
>
>
>
> i.e. no auto-magic facility to ignore pretty-printing.
> No XML processor in the world, regardless of the DTD in play is allowed
> to refrain from passing you the line-breaks and spaces in the
> first example.

True, but with a DTD and a DTD-aware XML parser, your application can easily choose to discard that additional whitespace.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From ak117 at freenet.carleton.ca Wed Jan 7 11:56:30 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:49 2004
Subject: SAX: Java Package org.xml.sax
In-Reply-To: <199801070348.TAA07093@boethius.eng.sun.com>
References: <003f01bd1a45$cd2a02b0$2ee044c6@donpark> <199801070348.TAA07093@boethius.eng.sun.com>
Message-ID: <199801071156.GAA00346@unready.microstar.com>

Jon Bosak writes:

> I reserved xml.org for exactly the kind of public use being discussed
> here and have no other plans for it. The main idea was to prevent
> someone from doing dumb things with the name; I had a vague idea that
> it might be useful for standard DTDs or a root URN domain for XLinks
> or something like that. The domain does not correspond to any real
> server at the moment; there isn't even a default email account. But
> if it's felt to be useful as a reserved piece of the DNS name space, I
> would be happy to dedicate it to the use of the XML developer
> community.

I'd like to take the opportunity to thank both Jon Bosak and Tim Bray publicly for generously offering the use of their xml.* domain names for the Java implementation of SAX.
Although in fact neither xml.org nor xml.com currently corresponds to any actual company or organisation, and both would have been appropriate choices, I think that on balance an *.org domain gives a greater _appearance_ of neutrality than a *.com domain, and even the mere appearance of neutrality is essential in a highly-politicised climate like the Java world: as a result, I propose that the Java implementation of SAX use the package "org.xml.sax":

  org.xml.sax.Parser
  org.xml.sax.Application
  org.xml.sax.ValueMap
  org.xml.sax.AppBase

or

  org.xml.sax.Parser
  org.xml.sax.EntityHandler
  org.xml.sax.DocumentHandler
  org.xml.sax.ErrorHandler
  org.xml.sax.AttributeMap
  org.xml.sax.HandlerBase

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Wed Jan 7 12:04:58 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:49 2004
Subject: anti-goals
In-Reply-To: <01bd1b52$e9894e80$a00b0ac0@pharcyde.poetsoftware.xo.com>
References: <01bd1b52$e9894e80$a00b0ac0@pharcyde.poetsoftware.xo.com>
Message-ID: <199801071205.HAA00382@unready.microstar.com>

Matthew Gertner writes:

> My point should have been that it will be easier to nip ongoing
> discussions about supported features in the bud if there is an
> "officially" stated intention to provide an advanced interface as
> well.
I would like to sit down and write down some of our
> requirements for a "repository loader" interface (no doubt very
> similar to a DOM builder), but I see the danger of losing focus at
> this point. I remember the date January 12th floating around as the
> deadline for some concrete SAX implementations. Would this be an
> appropriate time to make some more detailed comments about an
> extended interface?

The new architecture that I proposed yesterday would make this very simple. I proposed that we have three different Java interfaces for call-backs (I can now supply package names as well):

  org.xml.sax.EntityHandler
  org.xml.sax.DocumentHandler
  org.xml.sax.ErrorHandler

There are setters for all of these in the parser interface, org.xml.sax.Parser:

  package org.xml.sax;

  public interface Parser {
    public void setEntityHandler (EntityHandler handler);
    public void setDocumentHandler (DocumentHandler handler);
    public void setErrorHandler (ErrorHandler handler);
    public void parse (String publicID, String systemID)
      throws Exception;
  }

To create an interface that delivers more information, we could simply define a new interface, say, org.xml.FancyHandler, and then extend the Parser interface:

  package org.xml.sax;

  public interface FancyParser extends Parser {
    public void setFancyHandler (FancyHandler handler);
  }

I see no reason that we cannot turn our attention to this issue once the current SAX is implemented and relatively stable -- at least, we have not shut the door to future enhancement.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From mecom-gmbh at mixx.de Wed Jan 7 14:33:19 1998
From: mecom-gmbh at mixx.de (james anderson)
Date: Mon Jun 7 16:59:49 2004
Subject: Modules (was Re: namespaces )
References: <199801070608.RAA22070@jawa.chilli.net.au>
Message-ID: <34B391AB.C68D5599@mixx.de>

you wrote:

> You may also be interested in the "Module" proposals, which were suggested for
> inclusion into SGML. The idea started in Japan and has floated around getting
> firmer and simpler over the last year of XML discussions.
>
> The idea is simply to use any parameter entity name as a module prefix.
> So you could have, for example (though presumably it would use external
> parameter entities, not internal like in this example)

it looks intriguing.

i'm not certain that it is necessary to make the pseudo-hierarchical namespace automatic. if i understand the illustration correctly, the point is to be able to avoid conflicts among otherwise identical symbols. the same would be possible in a simpler two-level namespace (package::symbol) by appropriately selecting the package names.
granted, inclusion by entity reference would not guarantee uniqueness without this, but if, as you noted, the intended application is external entities, one would need only to implement an entity bound to the name of the current namespace in order to be able to construct hierarchical values for the "as" attribute dynamically (wrt the "microsoft" proposal) with a form like

on one hand achieving this with the "microsoft" proposal requires more attention, but, on the other, it allows you to choose to merge namespaces if so desired, by including an entity, or to separate them by specifying the hierarchy when including via a namespace.

do you have a source for more information?

thanks, bye

From mecom-gmbh at mixx.de Wed Jan 7 14:35:16 1998
From: mecom-gmbh at mixx.de (james anderson)
Date: Mon Jun 7 16:59:49 2004
Subject: Reporting empty elements
References: <3.0.1.16.19971211143739.37172be2@pop3.demon.co.uk> <3.0.1.16.19980107012940.565f9dc4@pop3.demon.co.uk> <199801070201.VAA00324@unready.microstar.com>
Message-ID: <34B39293.AC3A5D6A@mixx.de>

David Megginson wrote:

> One problem is that the PR does not fully define the information set
> that an XML parser is required to return to an application (there are a
> few scattered rules, such as the ignorable-whitespace rule).
...

in my, as yet, short presence in this group, quandaries of this sort have arisen with great regularity. that among a group of people who have no small amount of experience with the subject matter. at those times, i miss something as clear as a denotational definition for xml.
that is, something which expressed the equivalent dom instances / content for all legal xml forms. the reading i've done in the dom draft has been informative, but the relations are (too) often left as an "exercise for the reader".

in the long run, such a definition would (have already) save(d) a deal of time and effort. the behaviour of all parsers and processors (including sax) would be much easier to describe, since, even in cases where they don't handle the full "language", it would be clear that, in order to "conform", they would have to either produce at least a consistent dom-subset, or (if a parser) provide sufficient data to produce one from that xml subset which they do handle. given such a definition, even an api as reduced as the sax would be easier to specify.

i recognize, that the xml-standard should neither prescribe nor proscribe implementation techniques. i can understand this. i also know that a lot has been accomplished in a very short time. still, as an implementor, i often think that it would be very nice, to have (already often "had") a standard which were actually that of a "language" rather than that of a "notation".

has anyone addressed this task?

From peter at ursus.demon.co.uk Wed Jan 7 15:31:55 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:49 2004
Subject: Modules (was Re: namespaces )
In-Reply-To: <199801070608.RAA22070@jawa.chilli.net.au>
Message-ID: <3.0.1.16.19980107090339.358f5a28@pop3.demon.co.uk>

At 17:05 07/01/98 +1100, Rick Jelliffe wrote:
[...]
>You may also be interested in the "Module" proposals, which were suggested for
>inclusion into SGML. The idea started in Japan and has floated around getting
>firmer and simpler over the last year of XML discussions.
>
>The idea is simply to use any parameter entity name as a module prefix.
>So you could have, for example (though presumably it would use external
>parameter entities, not internal like in this example)
>
>
> ">
>
> %one;
> ">
> %one;
> %two;
>]>
>
> This is the element type declared in PE "one", as expected.
> This is also the element type declared in PE "one".
> This is the element type declared in PE "two".
> This is the element type declared in "one".
>

[Please shoot this down if I have missed something...]

I think that newcomers to XML may read more magic processing into this than exists. In the example given, an XML parser will see TWO declared elementTypes (x and y). There is no declaration of one::x, two::x, two::one::x. Indeed if the document were given to a validating parser this would presumably report that "one::x has not been declared".

An XML parser on seeing <one::x> simply sees it as a 6 character elementType. It has no mechanism for relating it to <x>. So IMO the second sentence is not correct, *for the XML parser* - they are two different beasts. Note in particular that XML does not allow substitution within the start tag in the document.

Unless the XML parser passes the PE information to the application, the app has no way of knowing that <x> and <one::x> are semantically linked. A common mechanism for this mapping - architectural forms (AFs) - is not currently part of the XML or X*L specs.

If I have this right, it's important to realise that parsers have nothing to do with namespaces, and that any namespace handling is application dependent. If the WG or others come up with namespace proposals, I would expect them to be implemented elsewhere than the parser.

P.
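
A minimal Java sketch of the point made in this message, under stated assumptions: a parser hands over a name such as one::x as an opaque string, so any interpretation of the module prefix has to live in the application. The ModuleName class and its parse method are hypothetical names invented for this illustration; they are not part of XML 1.0, SAX, or the Module proposal, and the "::" separator is simply taken from the example quoted in the thread.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical application-side helper (an assumption for illustration only).
final class ModuleName {
    final List<String> modules; // module prefixes, outermost first
    final String local;         // the element type name after the last "::"

    private ModuleName(List<String> modules, String local) {
        this.modules = modules;
        this.local = local;
    }

    // Split a name reported by the parser into module prefixes and a local name.
    static ModuleName parse(String elementType) {
        String[] parts = elementType.split("::");
        String local = parts[parts.length - 1];
        List<String> modules = Arrays.asList(Arrays.copyOf(parts, parts.length - 1));
        return new ModuleName(modules, local);
    }

    public static void main(String[] args) {
        // To the parser these are just two different strings.
        System.out.println("x".equals("one::x")); // prints false

        // Only the application relates them, by a convention of its own.
        ModuleName name = ModuleName.parse("two::one::x");
        System.out.println(name.modules + " / " + name.local); // prints [two, one] / x
    }
}
```

Nothing here validates that the prefixes correspond to declared parameter entities; as the message notes, that information would only be available if the parser chose to pass it on.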
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Wed Jan 7 15:33:04 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:49 2004
Subject: SAX: Naming and Packaging (question 10 of 10)
In-Reply-To: <199801070348.TAA07093@boethius.eng.sun.com>
References: <003f01bd1a45$cd2a02b0$2ee044c6@donpark>
Message-ID: <3.0.1.16.19980107084444.358f5aa0@pop3.demon.co.uk>

At 19:48 06/01/98 -0800, Jon Bosak wrote:
[...]
>
>I reserved xml.org for exactly the kind of public use being discussed
>here and have no other plans for it. The main idea was to prevent
>someone from doing dumb things with the name; I had a vague idea that
>it might be useful for standard DTDs or a root URN domain for XLinks
>or something like that. The domain does not correspond to any real
>server at the moment; there isn't even a default email account. But
>if it's felt to be useful as a reserved piece of the DNS name space, I
>would be happy to dedicate it to the use of the XML developer
>community.

This is a lovely offer, Jon - and it seems ideal and appropriate for the current problem. I am also really impressed by your foresight in thinking about XML 6 years ago and this would be a fitting tribute.

P.

>
>Jon
>
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From ricko at allette.com.au Wed Jan 7 15:57:56 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:59:49 2004
Subject: Modules (was Re: namespaces )
Message-ID: <199801071602.DAA32360@jawa.chilli.net.au>

> From: Peter Murray-Rust

> [Please shoot this down if I have missed something...]

No, you are entirely correct, as far as I understand how it fits together with XML 1.0.

Rick Jelliffe
From mecom-gmbh at mixx.de Wed Jan 7 16:03:20 1998
From: mecom-gmbh at mixx.de (james anderson)
Date: Mon Jun 7 16:59:49 2004
Subject: Modules (was Re: namespaces )
References: <3.0.1.16.19980107090339.358f5a28@pop3.demon.co.uk>
Message-ID: <34B3A732.49097444@mixx.de>

Peter Murray-Rust wrote:

> I think that newcomers to XML may read more magic processing into this than
> exists. In the example given, an XML parser will see TWO declared

you can count me among them...

> elementTypes (x and y). There is no declaration of one::x, two::x,
> two::one::x. Indeed if the document were given to a validating parser this
> would presumably report that "one::x has not been declared".
> ...
>
> If I have this right, it's important to realise that parsers have nothing
> to do with namespaces, and that any namespace handling is application
> dependent. If the WG or others come up with namespace proposals, I would
> expect them to be implemented elsewhere than the parser.

'cause i couldn't figure out how i was supposed to have the parser validate documents which incorporated different namespaces without supporting it identifying elements according to namespace-qualified names...
From peter at ursus.demon.co.uk Wed Jan 7 16:55:30 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:49 2004
Subject: SAX: Java Package org.xml.sax
In-Reply-To: <199801071156.GAA00346@unready.microstar.com>
References: <199801070348.TAA07093@boethius.eng.sun.com> <003f01bd1a45$cd2a02b0$2ee044c6@donpark> <199801070348.TAA07093@boethius.eng.sun.com>
Message-ID: <3.0.1.16.19980107170549.310f4088@pop3.demon.co.uk>

At 06:56 07/01/98 -0500, David Megginson wrote:
>Jon Bosak writes:
>
> > I reserved xml.org for exactly the kind of public use being discussed
> > here and have no other plans for it. The main idea was to prevent
> > someone from doing dumb things with the name; I had a vague idea that
> > it might be useful for standard DTDs or a root URN domain for XLinks

This would be extremely useful. In searching for appropriate namespaces for DTDs one has to latch onto some form of "organisation". I can see the attraction of this for - say - CML (Chemical Markup Language) until perhaps a learned society or other org takes over this role.

> > or something like that. The domain does not correspond to any real
> > server at the moment; there isn't even a default email account. But
> > if it's felt to be useful as a reserved piece of the DNS name space, I
> > would be happy to dedicate it to the use of the XML developer
> > community.
>
>I'd like to take the opportunity to thank both Jon Bosak and Tim Bray
>publicly for generously offering the use of their xml.* domain names
>for the Java implementation of SAX.

I support this.
There clearly needs to be some filtering of what can be put there and if XML-DEV can help with this we'd be very happy.

>Although in fact neither xml.org nor xml.com currently corresponds to
>any actual company or organisation, and both would have been
>appropriate choices, I think that on balance an *.org domain gives a
>greater _appearance_ of neutrality than a *.com domain, and even the
>mere appearance of neutrality is essential in a highly-politicised
>climate like the Java world: as a result, I propose that the Java
>implementation of SAX use the package "org.xml.sax":

I agree. There is an attraction in having something that is obviously non-aligned. And there is always the possibility of more formal support later.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From lex at www.copsol.com Wed Jan 7 22:45:43 1998
From: lex at www.copsol.com (Alex Milowski)
Date: Mon Jun 7 16:59:49 2004
Subject: SAX: Status Report
In-Reply-To: <199801050318.WAA01031@unready.microstar.com> from "David Megginson" at Jan 4, 98 10:18:48 pm
Message-ID: <199801072241.QAA03227@copsol.com>

> CORE EVENTS
> -----------
>
> So far, there seems to be general agreement on the following event
> callbacks for the XmlApplication interface:
>
>   public void startDocument ();
>   public void endDocument ();
>   public void endElement (String name);
>   public void processingInstruction (String name, String remainder);
>
> There is general agreement that the following two should be present,
> but still
discussion over their exact form (I'm still tweaking the
> names a bit):
>
>   public void characters (char ch[], int start, int length, ...?);
>   public void startElement (String name, ...?);
>
> (For the first, there is the question of a flag for ignorable
> whitespace, and for the second, the question of how to report
> attributes).

By "core events", do you mean that only a subset of the APIs would be available?

> ENTITIES
> --------
>
> There has been a lively and well-informed discussion on entity
> handling. Many participants are comfortable with something like the
> following for external entities (including the external DTD subset,
> which may contain processing instructions):
>
>   public void startEntity (String ename, String publicID, String systemID);
>   public void endEntity (String ename);
>
> (There is also a question about whether public IDs should be
> provided). Some others suggest that SAX should provide no information
> about external entities, while others suggest that the XmlParser
> interface should have a getLocation() method instead. The main
> motivation for providing external-entity information (aside from error
> reporting) is to resolve relative URIs in attribute values.

IMHO, parsers should know *nothing* about resolving entities. Resolving entities is an orthogonal problem.

>
> On the issue of entity resolution, there has been less feedback,
> probably because the topic is a little confusing. I have suggested
> something like this
>
>   public String resolveEntity (String ename, String publicID, String systemID);
>
> which would allow simple URI substitution and resolution of public
> identifiers, if desired (in most cases, you could simply return the
> systemID argument unmodified). Another suggestion is a separate
> EntityManager interface which would allow much more functionality.

The separate entity manager interface is how both SP and my dsssl.parser APIs in the DSSSL Developer's Toolkit work. I'd highly recommend this.
You can create "reference" entity managers that look up a URI and use this by default.

> ERROR REPORTING
> ---------------
>
> A majority of participants seem to support using callbacks for error
> reporting, partially to simplify cross-language support:
>
>   public void warning (String message, int line, int column);
>   public void fatal (String message, int line, int column);
>
> Note the addition of the 'column' argument -- it has rightly been
> pointed out that XML documents can consist of a single, long line, so
> the line number itself may be useless. If we do not have some general
> way to determine the current entity (i.e. startEntity and endEntity),
> we will also have to supply the URI of the current entity here.

In both SP and the DSSSL Developer Toolkit, there is some abstract object or interface that is the "Message Reporter". This object is given to the parser when the parse happens or the parser is created. Again, this is an issue of orthogonality.

> PROLOG
> ------
>
> No one sees a need for startProlog and endProlog events, but several
> people would like to see an event for the DOCTYPE, if present:
>
>   public void doctype (String name, String publicID, String systemID);
>
> where publicID and systemID refer to the external DTD subset, if any.
> This would help with autodetection of different document types.

I see a need for this. In a proper interpretation of the prolog of the document you have the following sequence:

  (stuff)
  start-doctype
  internal-subset
  end-doctype
  potentially process external-subset
  (stuff)
  document-element

How do you delimit the internal subset from the external subset without ending the document type declaration?

Remember: document type declaration != document type

The document type is defined by the combination of the internal and external subsets.

> COMMENTS
> --------
>
> Most people agree that there is no need for SAX to report comments.

Yes there is a need!
If you do not report about comments, how might one actually edit or process those comments? An event API should encompass in some way *all* the information in the document. We have a finite number of constructs in XML. Define an interface for all of these constructs and be done with it. If you don't do it now, it may never get done. Also, by saying "we don't need that information" and potentially *never* getting access to such information you beg the question why such a construct is in XML at all. Hence, if you see the necessity for comments to be in XML, the same necessity dictates that it must be in your API. > PARSER > ------ > > Everyone seems to like the idea of a common parser interface. Yes! ...I thought XAPI-J was about this. > ARTIST'S RENDITION > ------------------ > > Things are still up in the air, but here is some indication of what > SAX's central XmlApplication interface might look like in Java: > > > /* Beginning of XmlApplication.java */ > > public interface XmlApplication { > > // > // Entities > // > public String resolveEntity (String ename, String publicID, String systemID); > public void startEntity (String ename, String publicID, String systemID); > public void endEntity (String ename); > > // > // Document structure > // > public void startDocument (); > public void endDocument (); > public void doctype (String name, String publicID, String systemID); > public void startElement (String name /* and attributes, somehow */); > public void endElement (String name); > public void characters (char ch[], int start, int length, boolean ignorable); > public void processingInstruction (String name, String remainder); > > // > // Error reporting > // > public void warning (String message, int line, int column); > public void fatal (String message, int line, int column); > > } > > /* end of XmlApplication.java */ Why think of this as an XML Application? What we are talking about here are components--ones which might be part of a larger system. 
Hence, "application" is a misnomer.

...on a similar note, having been too busy lately to keep up.

What is the difference between SAX and XAPI-J? Where did SAX come from?
What are the requirements, design patterns, etc?

...a URL for the above, maybe?

Obviously, I'm confused! ;-)

==============================================================================
R. Alexander Milowski http://www.copsol.com/ alex@copsol.com
Copernican Solutions Incorporated (612) 379 - 3608

From peter at ursus.demon.co.uk Wed Jan 7 23:14:44 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:49 2004
Subject: SAX: Status Report
In-Reply-To: <199801072241.QAA03227@copsol.com>
References: <199801050318.WAA01031@unready.microstar.com>
Message-ID: <3.0.1.16.19980108000156.3dc77cc2@pop3.demon.co.uk>

Hi Alex,

At 16:41 07/01/98 -0600, Alex Milowski wrote:
>...on a similar note, having been too busy lately to keep up.
>
>What is the difference between SAX and XAPI-J? Where did SAX come from?
>What are the requirements, design patterns, etc?
>
>...a URL for the above, maybe?
>
>Obviously, I'm confused! ;-)
>

Like you, I was sad that XAPI-J didn't get adopted more widely, because a lot of hard thought had gone into it. The "general consensus" - and I am only guessing this through discussions on this list and a few RL encounters - seems to be that XAPI-J is too close to the DOM and perhaps too involved for many applications.

I therefore thought it was worth raising this again - about a month ago - and there was a lot of general enthusiasm.
David Megginson and Tim Bray offered to pursue a "simple API" - hence SAX, and we have generated a lot of valuable input on this list. In my opinion we have homed in on something valuable, but simpler than XAPI-J. DavidM and I are jointly producing some (slightly retrofitted) goals to measure our current position against. David devised 10 questions which he requested comments on, and we have been analysing these over the last few days. You are welcome to submit comments - you'll find the Q's on this list (http://www.lists.ic.ac.uk/hypermail/xml-dev).

Do not feel that the XAPI-J effort was 'wasted' - perhaps it wasn't exactly right for that particular time, but it had an important effect on the construction of interfaces.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From gfrer at luna.nl Wed Jan 7 23:25:10 1998
From: gfrer at luna.nl (Gerard Freriks)
Date: Mon Jun 7 16:59:49 2004
Subject: Fwd: XPublish beta (XML Web Publishing System)
Message-ID:

>From: terje@in-progress.com
>X-Sender: terje@cerfnet.com
>Mime-Version: 1.0
>Date: Mon, 5 Jan 1998 21:14:21 -0800
>To: Apple-Net-Announce@public.lists.apple.com
>Subject: XPublish beta (XML Web Publishing System)
>Sender: owner-apple-net-announce@public.lists.apple.com
>Precedence: bulk
>
>ANNOUNCING XPUBLISH: FIRST MAC XML WEB PUBLISHING SYSTEM
>
>San Francisco, CA, January 6, 1998 (MacWorld): Media Design in*Progress
>today announced XPublish, the first website publishing system for Mac that
>is based on the XML
standard. Webmasters can use XPublish to efficiently
>develop and maintain middle sized and large websites to deploy on any
>platform. The XPublish application can be downloaded from:
>
>
>
>XPublish generates HTML pages from XML documents at publishing time. Thus,
>web developers can benefit from the efficiency and flexibility of XML when
>creating sites for today's browsers. XPublish gives web developers the
>advantages of one-source publishing without the overhead and complexity of
>a full SGML publishing system.
>
>The integrated Cascading Style Sheets (CSS) editor simplifies designing and
>maintaining a consistent presentation style for the site. XPublish
>automatically adds a style sheet to all published pages, and can optionally
>emulate CSS to suggest an appearance also for older browsers that don't
>support style sheets.
>
>Media Design in*Progress will release XPublish 1.0 shortly after the World
>Wide Web Consortium has made XML an official recommendation. The XML
>standard is currently in final review, and is expected to become an
>official W3C recommendation by January 19.
>
>ABOUT XML
>
>Extensible Markup Language (XML) is an electronic publishing and data
>interchange format created and developed by the World Wide Web Consortium
>(W3C). It is technically a simplified dialect of the SGML text processing
>standard, of which HTML is an application. XML is intended for use on the
>World Wide Web, for vendor-neutral data interchange, media-independent
>(one-source) publishing, collaborative authoring, and processing of web
>documents by intelligent agents. A press release about XML is found at:
>
>http://www.w3.org/press/XML-PR
>
>ABOUT MEDIA DESIGN IN*PROGRESS
>
>Media Design in*Progress is a developer of professional web software based
>on open standards for webmasters, web designers and web publishers. Our
>flagship product is Interaction, the first application to use XML for
>dynamic social websites that adapt to the visitor.
Media Design in*Progress >also tailors web solutions to individual clients. The company is privately >owned. Originally founded in Norway, Media Design in*Progress today has its >headquarters in San Diego, California. > > > >-- Terje | Media Design in*Progress > > C a s c a d e... a comprehensive Cascading Style Sheets editor for Mac > XPublish - for efficient website publishing with XML > Make your Web Site a Social Place with Interaction! > > Check out our web tools at > > > > >=========================================================================== >Unsubscribe: > >Help: >or > Gerard Freriks,huisarts, MD C. Sterrenburgstr 54 3151JG Hoek van Holland the Netherlands Telephone: (+31) (0)174-384296/ Fax: -386249 Mobile : (+31) (0)6-54792800 ARS LONGA, VITA BREVIS xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Thu Jan 8 02:24:47 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:49 2004 Subject: SAX: Status Report In-Reply-To: <199801072241.QAA03227@copsol.com> References: <199801050318.WAA01031@unready.microstar.com> <199801072241.QAA03227@copsol.com> Message-ID: <199801080206.VAA00363@unready.microstar.com> Alex Milowski writes: > By "core events", do you mean that only a subset of the APIs would > be available? 
Actually, since that posting, I have pretty much decided that I agree that we should have three event-handler interfaces instead of one:

  org.xml.sax.EntityHandler
  org.xml.sax.DocumentHandler
  org.xml.sax.ErrorHandler

There is room for future interfaces to support authoring tools, repositories, and other tools that require access to non-structural lexical information like internal entity references and comments, but these are out of scope for the first round (right now we're looking, roughly, at an ESIS-level information set). We could also consider adding an interface for DTD events after we've finished this round.

I don't know what finally happened with XAPI-J, but this is an independent effort.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Thu Jan 8 02:38:38 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:49 2004
Subject: SAX: Exceptions in Java SAX Implementation
Message-ID: <199801080238.VAA00312@unready.microstar.com>

ISSUE: LANGUAGE-SPECIFIC EXCEPTION HANDLING

I think that it would be a bad idea for SAX to require the target language to support exceptions; as a result, I don't plan to require any of the SAX interfaces to throw specific exceptions under specific circumstances (though exceptions like java.io.IOException or java.net.MalformedURLException may be thrown by code outside of SAX).
I may, however, create an exception specifically for use with the error() implementation in the base class, for languages that support exceptions.

More generally, SAX users for the Java version will almost certainly want to be able to pass their own exceptions (such as database access exceptions) through from their event handlers to the code that started the parser, and that means that exceptions will have to flow through the event handlers and the SAX parser more-or-less transparently.

Does it make sense, then, simply to allow every method in the SAX/Java interfaces to throw java.lang.Exception?

  package org.xml.sax;

  public interface DocumentHandler {
    public void startDocument () throws java.lang.Exception;
    public void endDocument () throws java.lang.Exception;
    /* etc. */
  }

The parser itself could intercept any exceptions that it knows about specifically, and simply pass through the rest to the caller. An invocation of a SAX parser in Java, then, would probably look something like this:

  try {
    parser.parse(null, "file://localhost/home/david/foo.xml");
  } catch (IOException e) {
    /* do something */
  } catch (MyException e) {
    /* do something else */
  } catch (Exception e) {
    /* do yet another thing */
  }

We are simply passing through exceptions for languages that support them: languages without exceptions can simply omit this facility without damaging the functionality of SAX.

Thanks to James Clark for raising this point a week or two ago. I would appreciate in particular hearing from Java and other OO design specialists on the merits (or otherwise) of this approach.

Thanks, and all the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From donpark at quake.net Thu Jan 8 04:32:40 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 16:59:49 2004
Subject: Exceptions in Java SAX Implementation
Message-ID: <000d01bd1bed$e153da50$2ee044c6@donpark>

David,

>Does it make sense, then, simply to allow every method in the SAX/Java
>interfaces to throw java.lang.Exception?
>
>  package org.xml.sax;
>
>  public interface DocumentHandler {
>    public void startDocument () throws java.lang.Exception;
>    public void endDocument () throws java.lang.Exception;
>    /* etc. */
>  }

Allowing every method in SAX interfaces to throw java.lang.Exception basically disables Java compile-time exception checking. I do not think this is a good idea.

How about defining an ApplicationException class which is basically an exception container?

  public void startDocument () throws ApplicationException {
    try {
      // some IO
    } catch (Exception ex) {
      throw new ApplicationException(ex);
    }
  }

Don Park

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Thu Jan 8 04:54:14 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:59:49 2004 Subject: SAX: Naming and Packaging (question 10 of 10) In-Reply-To: <3.0.1.16.19980107084444.358f5aa0@pop3.demon.co.uk> (message from Peter Murray-Rust on Wed, 07 Jan 1998 08:44:44) Message-ID: <199801080452.UAA07330@boethius.eng.sun.com> [Peter Murray-Rust:] | I am also really impressed by you foresight in thinking | about XML 6 years ago and this would be a fitting tribute. Er... I was referring to the address bosak@netcom.com as six years old, not the xml.org name. I snaffled that one about a year ago, and it wasn't my idea; someone sent me mail asking why we hadn't grabbed it yet. I was about three days too late to get xml.net as well, which ended up with some company in Asia that I believe has no idea what XML is. For the record, "XML" as the name for a subset of SGML designed for the Web was suggested by James Clark in a message to the old SGML ERB dated August 19, 1996. Jon xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Thu Jan 8 08:55:14 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:50 2004 Subject: SAX: Status Report In-Reply-To: <199801080206.VAA00363@unready.microstar.com> References: <199801072241.QAA03227@copsol.com> <199801050318.WAA01031@unready.microstar.com> <199801072241.QAA03227@copsol.com> Message-ID: <3.0.1.16.19980108095320.392f6c46@pop3.demon.co.uk> At 21:06 07/01/98 -0500, David Megginson wrote: [...] > >There is room for future interfaces to support authoring tools, >repositories, and other tools that require access to non-structural >lexical information like internal entity references and comments, but >these are out of scope for the first round (right now we're looking, >roughly, at an ESIS-level information set). We could also consider >adding an interface for DTD events after we've finished this round. "ESIS-level" certainly appeals to me, and I like the way that DavidM has allowed for immediate and future extensibility. You can do a lot with ESIS as Joe English showed with CoST. And it's not too difficult to build trees from such a level - this is what JUMBO does. I think some of the current discussion reflects what many of us feel is missing in ESIS [otherwise we should simply build parsers directly with ESIS output :-)] Can I assume that it is possible to create a full ESIS stream using SAX? If not, I'd be slightly worried and would like to know what had been omitted. [You can see I'm not an expert :-)]. P. 
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Thu Jan 8 12:07:30 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:50 2004 Subject: SAX: Missing ESIS Information In-Reply-To: <3.0.1.16.19980108095320.392f6c46@pop3.demon.co.uk> References: <199801072241.QAA03227@copsol.com> <199801050318.WAA01031@unready.microstar.com> <199801080206.VAA00363@unready.microstar.com> <3.0.1.16.19980108095320.392f6c46@pop3.demon.co.uk> Message-ID: <199801081145.GAA00367@unready.microstar.com> Peter Murray-Rust writes: > Can I assume that it is possible to create a full ESIS stream using SAX? If > not, I'd be slightly worried and would like to know what had been omitted. > [You can see I'm not an expert :-)]. Here, off the top of my head, is the main XML information missing from SAX but available in the ESIS output of NSGMLS: - attribute types - public and system identifiers for notations - notation and public and system identifiers for external data entities There has been no agreement on whether this information belongs in SAX, and I welcome further discussion. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Thu Jan 8 13:22:03 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:50 2004 Subject: LISTRIVIA: multiple postings Message-ID: <3.0.1.16.19980108140129.2a0fcd1c@pop3.demon.co.uk> We believe that the XML-DEV mailing problem has been discovered - many thanks to the colleagues at IC who have fixed it. I am mailing this in the expectation that it is fixed, and if you do not get multiple postings, feel free to continue. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Thu Jan 8 14:05:55 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:50 2004 Subject: SAX: Missing ESIS Information In-Reply-To: <199801081145.GAA00367@unready.microstar.com> References: <3.0.1.16.19980108095320.392f6c46@pop3.demon.co.uk> <199801072241.QAA03227@copsol.com> <199801050318.WAA01031@unready.microstar.com> <199801080206.VAA00363@unready.microstar.com> <3.0.1.16.19980108095320.392f6c46@pop3.demon.co.uk> Message-ID: <3.0.1.16.19980108144755.2f77d736@pop3.demon.co.uk> Thanks David, At 06:45 08/01/98 -0500, David Megginson wrote: >Peter Murray-Rust writes: > > > Can I assume that it is possible to create a full ESIS stream using SAX? If > > not, I'd be slightly worried and would like to know what had been omitted. > > [You can see I'm not an expert :-)]. > >Here, off the top of my head, is the main XML information missing from >SAX but available in the ESIS output of NSGMLS: > >- attribute types Tricky one, this. I can see the logic in keeping these in, especially where entities are involved. OTOH they are the sort of information that distinguishes simple WF docs (i.e. no internal/external subsets) from those with subsets. [BTW we need a terminology for these various types of documents...]. The WG has already realised the problem of ID by (I think in XSL) treating *attributeNames* of ID as special (since it recognises that not all authors may declare the *attributeType*). >- public and system identifiers for notations I can live without this. 
But others may not be able to do so :-) >- notation and public and system identifiers for external data entities This is part of what we've been discussing, right? > >There has been no agreement on whether this information belongs in >SAX, and I welcome further discussion. Suggest that you rephrase it as questions. e.g. Q11 "Should SAX be able to recreate an ESIS stream?" I'd want good evidence why. [One reason could be that in that way one can easily compare the output of parsers, and also compare it with nsgmls and other ESIS-aware tools.] P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Thu Jan 8 15:07:45 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:50 2004 Subject: SAX: Missing ESIS Information In-Reply-To: <3.0.1.16.19980108144755.2f77d736@pop3.demon.co.uk> References: <3.0.1.16.19980108095320.392f6c46@pop3.demon.co.uk> <199801072241.QAA03227@copsol.com> <199801050318.WAA01031@unready.microstar.com> <199801080206.VAA00363@unready.microstar.com> <199801081145.GAA00367@unready.microstar.com> <3.0.1.16.19980108144755.2f77d736@pop3.demon.co.uk> Message-ID: <199801081507.KAA00275@unready.microstar.com> Peter Murray-Rust writes: > Suggest that you rephrase it as questions. e.g. Q11 "Should SAX be > able to recreate an ESIS stream?" I'd want good evidence why. 
[One
> reason could be that in that way one can easily compare the output
> of parsers, and also compare it with nsgmls and other ESIS-aware
> tools.]

You would find such a comparison surprisingly difficult, because the order of attributes may vary and because different parsers will divide character data into chunks differently. For example, if you have

  text ]]>

Ælfred will deliver one chunk of character data:

  "\n\n\n text\n\n\n"

while another parser might deliver three:

  "\n\n"
  "\n text\n"
  "\n\n"

The same applies to expanded entities, character references, etc.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From tyler at infinet.com Thu Jan 8 19:59:28 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
Message-ID: <3430FBD6.9E0E8471@infinet.com>

I am somewhat of a newbie to XML (aren't most people, I suppose), which I am using as an intermediary persistence framework for a distributed application I am writing, and I am currently in the process of writing several DTD's. I looked at Microsoft's Channel Definition Format as well as many other DTD's and noticed that many people seem to embed what seemingly should be element content as a REQUIRED element attribute. In the case of CDF, most of the elements are EMPTY with one attribute named VALUE that is REQUIRED CDATA.
I would think that in these cases an "author" tag should embed its content as follows <author>Mr John Smith</author>, rather than how Microsoft CDF embeds its content which is <author VALUE="Mr John Smith"/>.

Is this simply just a design preference, or else is there a concrete reason why what seemingly is content should be embedded as an attribute? If anyone could enlighten me as to what I should probably do here, then that would be greatly appreciated.

Thanx in advance,

Tyler

From jonathan at texcel.no Thu Jan 8 20:30:06 1998
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
In-Reply-To: <3430FBD6.9E0E8471@infinet.com>
Message-ID: <3.0.3.32.19980108152945.00cd7720@pop.mindspring.com>

At 09:17 AM 9/30/97 -0400, Tyler Baker wrote:

>I would think that in these cases an "author" tag should embed its
>content as follows <author>Mr John Smith</author>, rather than how
>Microsoft CDF embeds its content which is <author VALUE="Mr John Smith"/>.

I can't speak for Microsoft, but my guess is that they are simply using XML in the manner most analogous to objects in object oriented systems. In an object, the attributes are the data values:

  class Author
  {
    String name; // e.g. "Mr John Smith"
  };

Here they have used the name VALUE in a similar way:

  class Author
  {
    String VALUE; // e.g. "Mr John Smith"
  };

One very easy way to change objects into XML elements is to use one element for each object, and use attributes to model the data members:

  />

The reason attributes are better for this is that an object may have many data members, and these are distinguished by names.
The element content only has one place to put things, and there is no name associated with it.

>Is this simply just a design preference, or else is there a concrete
>reason why what seemingly is content should be embedded as an attribute.

To me, Microsoft's method makes sense if what you are doing is converting objects to XML, but your preferred method ("<author>Mr John Smith</author>") makes more sense if you are converting objects to XML and back.

Jonathan

jonathan@texcel.no
Texcel Research
http://www.texcel.no

From chris at surewould.com Thu Jan 8 20:55:15 1998
From: chris at surewould.com (Chris)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
References: <3.0.3.32.19980108152945.00cd7720@pop.mindspring.com>
Message-ID: <34B53D74.73F2@surewould.com>

Also, it seems that using an <author> tag without a VALUE attribute makes sense if Mr. Baker isn't converting back and forth to objects at all, but is just doing something simple with authors - which I have a sneaking suspicion is all he's trying to do.

Jonathan Robie wrote:
>
> At 09:17 AM 9/30/97 -0400, Tyler Baker wrote:
>
> >I would think that in these cases an "author" tag should embed its
> >content as follows <author>Mr John Smith</author>, rather than how
> >Microsoft CDF embeds its content which is <author VALUE="Mr John Smith"/>.
>
> I can't speak for Microsoft, but my guess is that they are simply using XML
> in the manner most analogous to objects in object oriented systems. In an
> object, the attributes are the data values:
>
> class Author
> {
> String name; // e.g.
"Mr John Smith" > }; > > Here they have used the name VALUE in a similar way: > > class Author > { > String VALUE; // e.g. "Mr John Smith" > }; > > One very easy way to change objects into XML elements is to use one element > for each object, and use attributes to model the data members: > > >/> > > The reason attributes are better for this is that an object may have many > data members, and these are distinguished by names. The element content > only has one place to put things, and there is no name associated with it. > > >Is this simply just a design preference, or else is there a concrete > >reason why what seemingly is content should be embedded as an attribute. > > To me, Microsoft's method makes sense if what you are doing is converting > objects to XML, but your preferred method ("Mr John > Smith") makes more sense if you are converting objects to XML and > back. > > Jonathan > > > jonathan@texcel.no > Texcel Research > http://www.texcel.no > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jeanpa at microsoft.com Thu Jan 8 21:21:36 1998 From: jeanpa at microsoft.com (Jean Paoli) Date: Mon Jun 7 16:59:50 2004 Subject: Technology preview of the Microsoft XSL Processor Message-ID: <78DFE33066ABD0118B9200805FD431BA01B3F4E6@red-16-msg.dns.microsoft.com> Dear all, I am pleased to announce a technology preview of the Microsoft XSL Processor based on the "Proposal for XSL" (jointly submitted to the W3C in September by Microsoft, Inso Corporation, and Arbortext). This early release is intended to allow for prototyping and validation of the ideas described in the Proposal for XSL, and to share our early implementation experience with the Web community. The Microsoft XSL Processor transforms XML-based data into HTML and CSS using an XSL stylesheet, and implements many of the features described in the "Proposal for XSL". The XSL Processor is available in two packages: * The Microsoft XSL ActiveX Control uses an XSL stylesheet to display XML data directly within web pages and applications. * The Microsoft XSL Command-line Utility produces an HTML document from an XML document and an XSL stylesheet. The Microsoft XSL Processor can be found at http://www.microsoft.com/xml/xsl. This newly launched XSL portion of the XML web site features: * Downloads and instructions for the two Microsoft XSL Processor packages. * Sample stylesheets and live demos. * An XSL Tutorial, describing the concepts behind XSL and the available features of the Microsoft XSL Processor. * Links to the Proposal for XSL, and other XSL information. 
Note: The above URL is located within the XML area of the Microsoft Web
site, which has also been expanded and moved to
http://www.microsoft.com/xml.

Jean Paoli

From mrc at allette.com.au Thu Jan 8 21:47:29 1998
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
References: <3430FBD6.9E0E8471@infinet.com>
Message-ID: <34B5494F.3E27B071@allette.com.au>

Tyler Baker wrote:

> I would think that in these cases an "author" tag should embed its content as
> follows Mr John Smith, rather than how Microsoft CDF embeds
> its content which is .
>
> Is this simply just a design preference, or else is there a concrete reason why
> what seemingly is content should be embedded as an attribute.

I regard that as design preference. An attribute obtained from a list of
valid tokens will provide some semantic control over the contents, but
in the case of "author" this would be unlikely as the maintenance on the
DTD would probably be prohibitive.

If you're interested in more information about philosophical approaches
to design, I would be inclined to check out some of the SGML
publications. I don't believe the statement "SGML people have longer
been involved with design, whereas XML-dev people are more involved with
data use and applications" will be deemed inflammatory. Just keep in
mind the differences between SGML and XML, the particular vagaries of
the author, the nature of the examples being discussed and then toss it
all and go with your instincts. That seems to be the prevailing
scientific approach.
-- 
Regards

Marcus Carr                      email:  mrc@allette.com.au
_______________________________________________________________
Allette Systems (Australia)      email:  info@allette.com.au
Level 10, 91 York Street         www:    http://www.allette.com.au
Sydney 2000 NSW Australia        phone:  +61 2 9262 4777
                                 fax:    +61 2 9262 4774
_______________________________________________________________

From elm at arbortext.com Thu Jan 8 21:48:02 1998
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
In-Reply-To: <98Jan8.155349est.18817@thicket.arbortext.com>
References: <3.0.3.32.19980108152945.00cd7720@pop.mindspring.com>
Message-ID: <3.0.5.32.19980108164847.00a55100@village.doctools.com>

I wonder if CDF is designed this way simply to make the channel metadata
be invisible in browsers. Existing browsers ignore unknown tags, display
element content, and don't display attribute values.

        Eve

At 03:56 PM 1/8/98 -0500, Chris wrote:
>Also, it seems that using an tag without a VALUE attribute
>makes sense if Mr. Baker isn't converting back and forth to objects at
>all, but is just doing something simple with authors - which I have a
>sneaking suspicion is all he's trying to do.
>
>Jonathan Robie wrote:
>>
>> At 09:17 AM 9/30/97 -0400, Tyler Baker wrote:
>>
>> >I would think that in these cases an "author" tag should embed its
>> >content as follows Mr John Smith, rather than how
>> >Microsoft CDF embeds its content which is
> >/>.
>>
>> I can't speak for Microsoft, but my guess is that they are simply using XML
>> in the manner most analogous to objects in object oriented systems. In an
>> object, the attributes are the data values:
>>
>> class Author
>> {
>>     String name; // e.g. "Mr John Smith"
>> };
>>
>> Here they have used the name VALUE in a similar way:
>>
>> class Author
>> {
>>     String VALUE; // e.g. "Mr John Smith"
>> };
>>
>> One very easy way to change objects into XML elements is to use one element
>> for each object, and use attributes to model the data members:
>>
> >/>
>>
>> The reason attributes are better for this is that an object may have many
>> data members, and these are distinguished by names. The element content
>> only has one place to put things, and there is no name associated with it.
>>
>> >Is this simply just a design preference, or else is there a concrete
>> >reason why what seemingly is content should be embedded as an attribute.
>>
>> To me, Microsoft's method makes sense if what you are doing is converting
>> objects to XML, but your preferred method ("Mr John
>> Smith") makes more sense if you are converting objects to XML and
>> back.
>>
>> Jonathan
>>
>>
>> jonathan@texcel.no
>> Texcel Research
>> http://www.texcel.no
From tyler at infinet.com Thu Jan 8 21:50:29 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
References: <3.0.3.32.19980108152945.00cd7720@pop.mindspring.com> <34B53D74.73F2@surewould.com>
Message-ID: <34311570.326F9D07@infinet.com>

Chris wrote:

> Also, it seems that using an tag without a VALUe attribute
> makes sense if Mr. Baker isn't converting back and forth to objects at
> all, but is just doing something simple with authors - which I have a
> sneaking suspicion is all he's trying to do.
>

Actually, for now I am testing two DTD's. One is just a simple container
for user info, namely

NickName
FirstName
LastName
Email
PostMail
Phone
HomePage

The other is a DTD for storing the host name and port number of a
currently running server (if the CORBA IOR is not available), in
addition to a CORBA IOR which points to the actual live server (which
for now is based on CORBA). It also has the host name and port number
for where the actual COSS Event Channel lives for the app as well as its
IOR too. Several different transport layers may be used in the future
(e.g. Voyager, RMI, DCOM, Sockets) so I plan on having a different DTD
for each transport layer.
Anyways, I will be using this second DTD in a manner which is similar to
CDF, where I may have a repository of these XML files which point to
these live applications. The repository more than likely will be run on
a web server.

After getting used to the syntax of XML for defining a DTD, I am a
little perplexed about an element declaration of the form:

<!ELEMENT Foo (bar1 | bar2 | bar3)*>

vs.

<!ELEMENT Foo (bar1, bar2, bar3)*>

Apparently in the first example element Foo can have 0 or more bar1,
bar2, or bar3 attributes but only one of these, excluding the other, and
the second example element says that you have 0 or more sequences of
bar1, bar2, and bar3 attributes. My confusion is that in the DTD's I
have seen so far, the first example element is used as if attributes
bar1, bar2, and bar3 can all exist together or else as a combination of
two, or else singularly. But in EBNF notation the ' | ' as far as I know
means one or the other and not both.

<!ELEMENT Foo (bar1?, bar2?, bar3?)>

Last but not least this example element seems to mean the same thing as
the second example element.

I am sorry to be posting this "please help me" post to an xml-dev list
(which I assume is mostly for parser writing discussion), but in the
short time I have been on this list I have tried to post pointers to
parser writers that help them in their quest for optimal importance so I
don't feel too guilty (-:

Any info on this would be greatly appreciated as I cannot find any FAQ
which explains this in detail other than the current XML spec which IMHO
has a lot of ambiguities that are not clearly explained and therefore
relatively confusing to anyone who does not have extensive experience in
the SGML camp.

Thanx in advance,

Tyler
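One rough way to compare the three content models asked about above is to
treat each as a regular expression over the sequence of child element
names. The sketch below is only an illustration: the class and method
names are made up, the three models are assumed to be
(bar1 | bar2 | bar3)*, (bar1, bar2, bar3)* and (bar1?, bar2?, bar3?), and
a regex is an approximation of what a validating parser does rather than
a replacement for one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any XML parser.
class ContentModelSketch {
    // (bar1 | bar2 | bar3)* : any number of children, each one of the three
    static final Pattern CHOICE_STAR = Pattern.compile("(bar1 |bar2 |bar3 )*");
    // (bar1, bar2, bar3)* : zero or more complete bar1, bar2, bar3 runs
    static final Pattern SEQ_STAR = Pattern.compile("(bar1 bar2 bar3 )*");
    // (bar1?, bar2?, bar3?) : at most one of each, in that order
    static final Pattern OPTIONAL_SEQ = Pattern.compile("(bar1 )?(bar2 )?(bar3 )?");

    // Write each child name followed by a space, e.g. "bar1 bar3 ".
    static String encode(List<String> children) {
        StringBuilder sb = new StringBuilder();
        for (String name : children) {
            sb.append(name).append(' ');
        }
        return sb.toString();
    }

    // True if the whole child sequence is accepted by the model.
    static boolean allows(Pattern model, List<String> children) {
        return model.matcher(encode(children)).matches();
    }

    public static void main(String[] args) {
        List<String> mixed = Arrays.asList("bar3", "bar1", "bar1");
        System.out.println(allows(CHOICE_STAR, mixed));   // accepted by the choice model
        System.out.println(allows(SEQ_STAR, mixed));      // rejected by the sequence model
        System.out.println(allows(OPTIONAL_SEQ, mixed));  // rejected: bar1 repeats, wrong order
    }
}
```

The point the sketch tries to make is that the '|' still means "exactly
one of these" for each child; it is the trailing '*' that lets different
choices follow one another.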
From tbray at textuality.com Thu Jan 8 21:53:02 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
Message-ID: <3.0.32.19980108135238.009d37c0@pop.intergate.bc.ca>

At 09:17 AM 30/09/97 -0400, Tyler Baker wrote:
>...many people seem to embed what seemingly should
>be element content as a REQUIRED element attribute.
...
>Is this simply just a design preference, or else is there a concrete
>reason why what seemingly is content should be embedded as an attribute.

There is no automated decision procedure as to what should be an
attribute and what an element. There are some things you can do with
attributes but not with elements, and vice versa. But there are lots of
places where either works. In these places, it is indeed, as you
hypothesize, a design preference. Human document designers empirically
seem to like having both elements & attributes available and find this
increases their expressive power.

There is a clear lesson; any software that needs to be able to fish data
out of an XML document had better have the capability of extracting it
either from an element or an attribute. -T.
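The closing point above, that software should be able to fish a value
out of either an element or an attribute, can be sketched with the DOM
classes that ship with current Java. This is only an illustration: the
class name is made up, and the element name AUTHOR and attribute name
VALUE are assumed spellings based on the thread, not taken from the CDF
specification.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration only.
class AuthorLookup {
    // Returns the author's name whether it was written as an attribute
    // (<AUTHOR VALUE="..."/>) or as element content (<AUTHOR>...</AUTHOR>).
    static String authorName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element author = doc.getDocumentElement();
            if (author.hasAttribute("VALUE")) {
                return author.getAttribute("VALUE");   // attribute style
            }
            return author.getTextContent().trim();     // element-content style
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(authorName("<AUTHOR VALUE=\"Mr John Smith\"/>"));
        System.out.println(authorName("<AUTHOR>Mr John Smith</AUTHOR>"));
    }
}
```

Checking the attribute first is an arbitrary choice in this sketch; an
application could just as well prefer element content when both are
present.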
From tbray at textuality.com Thu Jan 8 21:57:52 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
Message-ID: <3.0.32.19980108135910.009cf100@pop.intergate.bc.ca>

At 11:06 AM 30/09/97 -0400, Tyler Baker wrote:
> I am a little perplexed
>about an element declaration of the form:
>
><!ELEMENT Foo (bar1 | bar2 | bar3)*>

any number of bar1, bar2, and bar3 elements in any order

<!ELEMENT Foo (bar1 | bar2 | bar3)>

one child element, either bar1, bar2, or bar3

><!ELEMENT Foo (bar1, bar2, bar3)*>

0 or more bar1, bar2, bar3 sequences.

><!ELEMENT Foo (bar1?, bar2?, bar3?)>

any of:
  bar1
  bar2
  bar3
  bar1 bar2
  bar1 bar3
  bar2 bar3
  bar1 bar2 bar3

I agree that the semantic of '|' is not as it is in some other systems.
-T.

From crism at ora.com Thu Jan 8 22:08:08 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
In-Reply-To: <34311570.326F9D07@infinet.com> (message from Tyler Baker on Tue, 30 Sep 1997 11:06:24 -0400)
Message-ID: <199801082213.RAA15775@geode.ora.com>

[Tyler Baker]
> After getting used to the syntax of XML for defining a DTD, I am a
> little perplexed about an element declaration of the form:
>
> <!ELEMENT Foo (bar1 | bar2 | bar3)*>
>
> vs.
>
> <!ELEMENT Foo (bar1, bar2, bar3)*>
>
> Apparently in the first example element Foo can have 0 or more bar1,
> bar2, or bar3 attributes but only one of these, excluding the other,
> and the second example element says that you have 0 or more
> sequences of bar1, bar2, and bar3 attributes. My confusion is that
> in the DTD's I have seen so far, the first example element is used
> as if attributes bar1, bar2, and bar3 can all exist together or else
> as a combination of two, or else singularly. But in EBNF notation
> the ' | ' as far as I know means one or the other and not both.

You are confused about many things. The | in EBNF means what you think.
DTDs are not EBNF (note the lack of '::='). The first example means zero
or more of (bar1 | bar2 | bar3) - each of that "zero or more" must be
only one of those, but there is no restriction that it be the same bar
every time. The model you describe would be (bar1* | bar2* | bar3*).

> <!ELEMENT Foo (bar1?, bar2?, bar3?)>
>
> Last but not least this example element seems to mean the same thing
> as the second example element.

The asterisk in (bar1, bar2, bar3)* means zero or more sequences of
bar1, bar2, bar3. The third element declaration's (bar1?, bar2?, bar3?)
means an optional bar1 (zero or one), followed by an optional bar2,
followed by an optional bar3. There may not be another bar1 after bar3,
as there may be in the second element declaration.

> I am sorry to be posting this "please help me" post to an xml-dev
> list (which I assume is mostly for parser writing discussion), but
> in the short time I have been on this list I have tried
> to post pointers to parser writers that help them in their quest for
> optimal importance so I don't feel too guilty (-:

The best place for questions like this is probably comp.text.sgml. I
also recommend _Practical SGML_ by Erik van Herwijnen, Kluwer Academic,
ISBN 0792394348.
Of the XML books thus far published, _Presenting XML_ has some
discussion of content models that hasn't been obsoleted; _XML Complete_
has some discussion if you can pick amongst the Java programming
examples that comprise most of the book.

> Any info on this would be greatly appreciated as I cannot find any
> FAQ which explains this in detail other than the current XML spec
> which IMHO has a lot of ambiguities that are not clearly explained
> and therefore relatively confusing to anyone who does not have
> extensive experience in the SGML camp.

 is the XML FAQ, maintained by Peter Flynn.

Ambiguities should probably be pointed out to the editors. There are few
or no ambiguities in the Platonic spec, in the heads of the Working
Group, but that Platonic ideal is not captured perfectly in the
published spec.

-Chris
-- 
http://www.oreilly.com/people/staff/crism/ +1.617.499.7487
90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>

From cmanning at sultry.arts.usyd.edu.au Thu Jan 8 22:46:59 1998
From: cmanning at sultry.arts.usyd.edu.au (Christopher D. Manning)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
In-Reply-To: <3.0.32.19980108135910.009cf100@pop.intergate.bc.ca>
References: <3.0.32.19980108135910.009cf100@pop.intergate.bc.ca>
Message-ID: <199801082243.JAA23574@coogee.arts.usyd.edu.au>

On 8 January 1998, Tim Bray wrote:
> >
>
> any number of bar1, bar2, and bar3 elements in any order
>
> I agree that the semantic of '|' is not as it is in some other
> systems. -T.

I disagree.
The | _is_ meaning bar1 or bar2 or bar3 -- one only, but it scopes
inside the Kleene star operator, which means any number of repetitions
of the stuff before it, which therefore gives the semantics above.

Chris Manning

From msuzio at ford.com Thu Jan 8 23:03:06 1998
From: msuzio at ford.com (Michael J. Suzio)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
References: <3.0.3.32.19980108152945.00cd7720@pop.mindspring.com> <34B53D74.73F2@surewould.com> <34311570.326F9D07@infinet.com>
Message-ID: <199801082302.AA25426@mailfw1.ford.com>

Tyler Baker wrote:
> Any info on this would be greatly appreciated as I cannot find any
> FAQ which explains this in detail other than the current XML
> spec which IMHO has a lot of ambiguities that are not
> clearly explained and therefore relatively confusing to anyone
> who does not have extensive experience in the SGML camp.

I couldn't agree more. I just pored over the XML spec for two hours,
taking notes and trying like hell to understand this stuff. I poked
around trying to find a "Gentle Intro to XML" or "A Complete Idiot's
Guide to DTD Writing", but so far, no success. Any recommendations from
the list would be great.

So, is it acceptable to post here asking for clarifications on these
matters? I've read the spec, read the FAQ, I'm still somewhat lost on
these things (but learning!).

BTW, development of SAX would be great...
I'm writing XML application code right now, and I'd like to be able to
plug in any parser I want -- SAX would help a lot in letting me write to
an API, and not caring what's 'under the hood'. Let us lowly peons know
how we can help!

-- 
Michael J. Suzio
Web Technical Standards, WWW & Internet Applications
(313) 24-88120
msuzio@eccms1.dearborn.ford.com / msuzio@ford.com

From mrc at allette.com.au Fri Jan 9 00:46:40 1998
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
References: <3.0.3.32.19980108152945.00cd7720@pop.mindspring.com> <34B53D74.73F2@surewould.com> <34311570.326F9D07@infinet.com> <199801082302.AA25426@mailfw1.ford.com>
Message-ID: <34B5734F.19AF736@allette.com.au>

Michael J. Suzio wrote:

> I couldn't agree more, I just pored over the XML spec for two hours, taking
> notes and trying like hell to understand this stuff. I poked around trying to
> find a "Gentle Intro to XML" or "A Complete Idiot's Guide to DTD Writing", but
> so far, no success. Any recommendations from the list would be great.

Without wishing to sound like a broken record, I think that SGML
publications are the place to start. I personally believe that for
better or for worse, a good grounding in SGML gives you a very different
perspective of XML. The XML standard was purposefully kept brief, in
keeping with the intended spirit of implementations. This is a sensible
and realistic approach, though it may leave you feeling as though you're
eating soup with a fork.
:-)

> So, is it acceptable to post here asking for clarifications on these matters?
> I've read the spec, read the FAQ, I'm still somewhat lost on these things (but
> learning!).

It's difficult to say that this isn't the appropriate place, as even now
there are legitimate (albeit infrequent) points raised about one aspect
or another of XML, but if you strongly suspect that your question may be
the result of your lack of understanding rather than a technical hitch,
you would probably be better off either researching it further or taking
it to comp.text.sgml.

-- 
Regards

Marcus Carr                      email:  mrc@allette.com.au
_______________________________________________________________
Allette Systems (Australia)      email:  info@allette.com.au
Level 10, 91 York Street         www:    http://www.allette.com.au
Sydney 2000 NSW Australia        phone:  +61 2 9262 4777
                                 fax:    +61 2 9262 4774
_______________________________________________________________

From murata at apsdc.ksp.fujixerox.co.jp Fri Jan 9 01:43:29 1998
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 16:59:50 2004
Subject: A bug in MSXML in Java 1.8
Message-ID: <9801090143.AA03144@lute.apsdc.ksp.fujixerox.co.jp>

The following paragraph is quoted from "Microsoft XML Parser in Java
Release Notes for Version 1.8"
(http://www.microsoft.com/xml/parser/xmlchgs.htm)

>Section 2.12 adds a new xml:lang attribute. This means that
>any element can now have this attribute regardless of ATTLIST
>declaration. For example, the following is valid, even though
>the DTD says that the test element doesn't have any attributes.
>
>
> ]>
>
> The quick brown fox.
>
>

This is clearly incorrect. The XML PR very clearly requires that this
attribute must be declared for valid documents.

Makoto

Fuji Xerox Information Systems

Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

From ak117 at freenet.carleton.ca Fri Jan 9 01:45:51 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:50 2004
Subject: SAX in Java: Exceptions, again (please respond)
Message-ID: <199801090145.UAA00426@unready.microstar.com>

I'm still working to resolve the issue of exceptions in the Java
implementation of SAX, before I put out a full draft implementation for
discussion (the draft may be delayed a bit by the ice storms and power
outages here in Ottawa).

This is an important, but also an extremely technical issue, so perhaps
a simpler explanation of the problem will help bring more people into
the discussion (thanks to those who have responded already). A SAX
application will have a structure like this:

  User code (top-level) <==> SAX layer <==> User code (callbacks)

In other words, the top-level user code invokes the SAX-conformant
parser, which in turn invokes more user code through the callback
interface. The user code in the callbacks cannot throw general
exceptions to the user code at the top-level unless explicitly allowed
by the SAX layer.

There are three possible solutions:

1) Allow all callbacks to throw java.lang.Exception (i.e. any
   exception), and require SAX-conformant Java XML parsers to pass
   through any exceptions that are not specific to them.
2a) Allow all callbacks to throw only a special SAX exception, which can
    act as a container to carry other exceptions through to the top
    level.

2b) Don't explicitly allow any callbacks to throw exceptions; callbacks
    can get exceptions to the top level only by creating their own
    container based on java.lang.RuntimeException, and hiding other
    exceptions in it.


FIRST SOLUTION
--------------

The first solution is the most transparent, because the callbacks can
simply throw exceptions as usual, and the top-level code can catch them.
To throw an instance of MyException, for example, the startDocument
callback could simply use

  public void startDocument ()
    throws MyException
  {
    if (/* problem */) {
      throw new MyException("oh damn!");
    }
  }

At the top level, you could catch the exception simply, like this:

  try {
    parser.parse(null, "file://localhost/tmp/mydoc.xml");
  } catch (MyException e) {
    /* do something */
  } catch (Exception e) {
    /* any other exception */
  }

Note that even though the interface has

  public void startDocument () throws java.lang.Exception;

the implementation can be more specific, and limit the actual exceptions
thrown (or throw none at all, if desired).

This will cause no problems for application writers (since they can be
specific), but it will make life harder for parser writers, because
nearly every method in the parser will end up throwing
java.lang.Exception, and compile-time error checking will be much
weaker.


SECOND SOLUTIONS (A and B)
--------------------------

The other two solutions both require embedding, using either a regular
exception or a runtime exception.
First, we need a special exception:

  public class SAXException extends java.lang.Exception {

    private java.lang.Exception realException;

    public SAXException (java.lang.Exception e)
    {
      realException = e;
    }

    public java.lang.Exception getException ()
    {
      return realException;
    }
  }

Now, throwing an exception is a little more indirect:

  public void startDocument ()
    throws SAXException
  {
    if (/* problem */) {
      throw new SAXException(new MyException("oh damn!"));
    }
  }

Catching it is also a little trickier:

  try {
    parser.parse(null, "file://localhost/tmp/mydoc.xml");
  } catch (SAXException e) {
    java.lang.Exception realException = e.getException();
    if (realException instanceof MyException) {
      /* do something */
    } else {
      /* do something else */
    }
  }

You can also rethrow it at this point:

  try {
    parser.parse(null, "file://localhost/tmp/mydoc.xml");
  } catch (SAXException e) {
    throw e.getException();
  }

This is going to make life harder for application writers, but parser
writers will be able to rely on stronger compile-time error checking,
since their internal methods will have to throw only
org.sax.SAXException (or whatever) instead of java.lang.Exception.

Which do we choose? Someone's going to have a harder time, and we have
to choose between the parser writers and the application writers. I have
already written Ælfred so that all callbacks can throw
java.lang.Exception and have it passed transparently through to the
top-level code, but Ælfred will also allow the more specific solution if
we choose it.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/
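The trade-off described in the message above can be tried out in a small
self-contained sketch. It is only an illustration under stated
assumptions: every class and method name here (ExceptionTunnelSketch,
TunnelException, parseCore and so on) is hypothetical and not part of any
SAX draft. Callbacks are declared to throw java.lang.Exception, as in the
first solution, while the stand-in parser core hides whatever they throw
in a runtime container, in the spirit of 2b, and a thin front end unpacks
it again for the top-level code.

```java
// Hypothetical sketch; none of these names come from SAX itself.
class ExceptionTunnelSketch {

    /** Application-level exception thrown by a callback. */
    static class MyException extends Exception {
        MyException(String message) { super(message); }
    }

    /** Unchecked container used only between parser core and front end. */
    static class TunnelException extends RuntimeException {
        TunnelException(Exception cause) { super(cause); }
    }

    /** Callback interface: user code may throw any exception. */
    interface Handler {
        void startDocument() throws Exception;
    }

    /** Stand-in for a parser core whose methods declare no checked exceptions. */
    static void parseCore(Handler handler) {
        try {
            handler.startDocument();
        } catch (Exception e) {
            throw new TunnelException(e);        // pack the callback's exception
        }
    }

    /** Front end: unpacks the container so callers see the original exception. */
    static void parse(Handler handler) throws Exception {
        try {
            parseCore(handler);
        } catch (TunnelException e) {
            throw (Exception) e.getCause();      // cause is always an Exception here
        }
    }

    /** Top-level user code: catches its own exception type directly. */
    static String describe(Handler handler) {
        try {
            parse(handler);
            return "no exception";
        } catch (MyException e) {
            return "MyException: " + e.getMessage();
        } catch (Exception e) {
            return "other: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(() -> { throw new MyException("problem in callback"); }));
        System.out.println(describe(() -> { }));
    }
}
```

In this arrangement the application code never sees the container, and
only the two methods at the boundary need to know about it; whether that
is worth the extra layer is a design judgement, not something the sketch
settles.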
From ak117 at freenet.carleton.ca Fri Jan 9 02:21:07 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:50 2004
Subject: SAX/Java: Exceptions, Again
Message-ID: <199801090220.VAA01181@unready.microstar.com>

I think that I've just answered my own question about Exception handling
in the Java implementation of SAX. There is no reason that the SAX
frontend for each parser cannot pack exceptions from the callbacks into
a container and unpack them for the top-level transparently -- that way,
the parser can still have tight compile-time error checking, but
application writers won't have to jump through hoops to throw exceptions
to the top level.

Unless I read a good argument to the contrary, then, the interface will
look like this:

  public void startDocument ()
    throws java.lang.Exception;

  public void endDocument ()
    throws java.lang.Exception;

  public void characters (char ch[], int start, int end)
    throws java.lang.Exception;

  /* etc. */

Implementations can use much stricter type checking themselves, and are
not required to throw any exceptions at all -- the Java interface just
gives the boundary (in this case, any or no exceptions).

I still need to define what a SAX parser is allowed to catch and what it
may pass through. Here's a rough sketch:

1) A parser must catch all of its own, internal exceptions (i.e. no
   SAX parser should throw an exception that others do not -- this can
   be managed in the SAX frontend if necessary).
2) A parser may catch any exceptions derived from java.io.IOException
   (that includes the networking exceptions), but only if it is capable
   of resolving or working around the problem corresponding to the
   exception; otherwise, it must throw it on through to the top-level
   user code.

3) A parser must pass all other exceptions up to the user code.

What am I missing in this list?


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From smith at interlog.com Fri Jan 9 02:41:36 1998
From: smith at interlog.com (Chris Smith)
Date: Mon Jun 7 16:59:50 2004
Subject: XML and Using It With Whitespace
In-Reply-To: <3.0.32.19980106185404.008e2ad0@pop.intergate.bc.ca>
Message-ID:

On Tue, 6 Jan 1998, Tim Bray wrote:

> Date: Tue, 06 Jan 1998 18:56:52 -0800
> From: Tim Bray
>
> Hmm, I'm failing to get some aspect of your problem. Maybe I
> bypassed a message in which you explained it. It is clear that
> *any* conformant parser must give you all the whitespace in the
> message.

The difficulty was not with insignificant whitespace, it was with
unwanted whitespace. For example, (old ground, I know, I'm sorry) if a
long line of text gets broken by changing a space to a lineend, then
message authentication will fail.

XML has done an admirable job with whitespace in element content, but we
were looking at a different problem. We wanted to try (where possible)
to rescue transport mechanisms, such as email, that occasionally damage
their content.

> Probably I'm missing something... what is the missing piece from
> your point of view? -T.

We were trying to save the world :-). Notice I'm using the past tense?

In our conference call this morning, we finally decided we'd had enough.
Transport mechanisms that damage their content are broken. Period. Fix
the transport.

As a result, we have taken a different approach, one that draws rather
heavily on watching the parser work going on here. It essentially is:

 - parse the document, using the DTD.
 - generate a clean XML document (in UTF-16) from the parsed version
 - run the authentication check on that

I'll post the definitive section of our spec in my next message.

Because the authentication is carried *inside* the XML document, the
whole document doesn't actually get authenticated. We authenticate any
one element (except, I suppose, the document element) which includes an
authentication of all content. You can include multiple authentications,
which may nest. There are other complications, but I'd need to put in
the whole spec.

Considering that XML essentially rescued our group from an encoding
stalemate, I'll post the press release (which I understand is coming out
on Monday.)

---------------------------------------------------------------------------
Chris Smith

From smith at interlog.com Fri Jan 9 02:45:22 1998
From: smith at interlog.com (Chris Smith)
Date: Mon Jun 7 16:59:50 2004
Subject: Canonical Encoding for XML Elements
Message-ID:

Here, as mentioned, is our process for creating a canonical form of XML
elements. Comments are welcome. In particular, do parsers keep CDATA
sections distinct from character data?
-------------------------------------------

Canonical Encoding Format for XML

The canonical format of an XML element is created by firstly deriving
the logical content and structure of the underlying XML document by
parsing it, and then generating the canonical physical form of the
element based on the logical structure using the process defined below.

For the XML element being generated or any of its child elements:

 * convert all characters in the element to [UTF16] format.
 * apply all external entities and all character and entity references
   in the element so that they are completely resolved
 * exclude comments and processing instructions (PIs)
 * reduce all attributes to their canonical form using the attribute
   type in the DTD. Replace all single and double quotes present in
   attributes with &apos; and &quot; respectively so that attributes
   can be enclosed in double quotes
 * create attributes, using their default value, which are not present
   in the original but have default values in the DTD
 * sort the original and generated attributes in ascending attribute
   name order according to the UTF-16 encoding of the attribute name
   (i.e. not the native character ordering)
 * for whitespace inside markup but not inside attribute values,
   generate it as minimally as possible. Specifically:
    - remove non essential whitespace, and
    - represent required whitespace by a single space character
 * generate the content of all start tags using only the element name
   and the attributes as described above. If the element is an "empty"
   element then generate it using the single empty tag format, with a
   trailing slash. Generate end tags using only the element name, with
   no added whitespace.
 * remove all whitespace in the element content
 * keep CDATA sections as CDATA sections. Also:
    - do not convert CDATA sections to character data with character
      references
    - convert all occurrences of the right angle bracket ">" to &gt;
 * character data that is not in CDATA sections must have all
   occurrences of "<", ">", and "&" converted to &lt; &gt; and &amp;
   respectively.
 * start tags, end tags, empty tags, CDATA sections, and text sections
   are assembled in the same order as the original document.

---------------------------------------------------------------------------
Chris Smith

From davido at pragmaticainc.com Fri Jan 9 02:47:33 1998
From: davido at pragmaticainc.com (David Ornstein)
Date: Mon Jun 7 16:59:50 2004
Subject: SAX/Java: Exceptions, Again
In-Reply-To: <199801090220.VAA01181@unready.microstar.com>
Message-ID: <02451038020876@pragmaticainc.com>

David Megginson wrote [09:20 PM 1/8/98 -0500]:
>I think that I've just answered my own question about Exception
>handling in the Java implementation of SAX. There is no reason that
>the SAX frontend for each parser cannot pack exceptions from the
>callbacks into a container and unpack them for the top-level
>transparently -- that way, the parser can still have tight
>compile-time error checking, but application writers won't have to
>jump through hoops to throw exceptions to the top level.
[clip]
>I still need to define what a SAX parser is allowed to catch and what
>it may pass through. Here's a rough sketch:
>
>
>1) A parser must catch all of its own, internal exceptions (i.e. no
>   SAX parser should throw an exception that others do not -- this can
>   be managed in the SAX frontend if necessary).
>
>2) A parser may catch any exceptions derived from
>   java.io.IOException (that includes the networking exceptions), but
>   only if it is capable of resolving or working around the problem
>   corresponding to the exception; otherwise, it must throw it on
>   through to the top-level user code.
>
>3) A parser must pass all other exceptions up to the user code.

I like this a lot.

One question:

If the parser is unable to open a file (let's say) I'm assuming that this
will cause a java.io.IOException to be thrown and in a fully Java system,
the parser might well ignore this and allow the application to catch it.
This means that there's no need to have any kind of return code coming out
of the Parse() function in the parser interface. In thinking about this
for other languages, for languages that support exceptions, we're mandating
that the SAX implementations in those languages use exceptions also (since
there's nowhere for the return code and the return codes are not specified
as part of SAX). And in languages that don't support exceptions, I'm at a
loss to say what we'd do.

I have a funny feeling I'm missing something here, so please help me out
if you see what I don't.

Assuming that my reasoning is right, I'd propose that we agree on
return-codes for the most common situations (yeah, I know that may be hard
to nail down).

David

================================
David Ornstein
Pragmatica, Inc.
http://www.pragmaticainc.com

From rwaldin at pacbell.net Fri Jan 9 02:50:04 1998
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun 7 16:59:50 2004
Subject: Embedding Content as Element Content or As An Attribute Value
References: <3.0.3.32.19980108152945.00cd7720@pop.mindspring.com> <34B53D74.73F2@surewould.com> <34311570.326F9D07@infinet.com> <199801082302.AA25426@mailfw1.ford.com> <34B5734F.19AF736@allette.com.au>
Message-ID: <34B58FF8.4A1A4379@pacbell.net>

Tyler Baker:
> ...is there a concrete
> reason why what seemingly is content should be embedded as an attribute.

First, let me say thank you to all you folks for your help -- I had the very
same question only a few weeks ago and received quite a variety of responses.

Marcus Carr:
>
>Michael J. Suzio:
>> I poked around trying to
>> find a "Gentle Intro to XML" or "A Complete Idiot's Guide to DTD Writing", but
>> so far, no success. Any recommendations from the list would be great.
>
> Without wishing to sound like a broken record, I think that SGML publications are
> the place to start. I personally believe that for better or for worse, a good
> grounding in SGML gives you a very different perspective of XML. The XML standard
> was purposefully kept brief, in keeping with the intended spirit of
> implementations. This is a sensible and realistic approach, though it may leave
> you feeling as though you're eating soup with a fork. :-)

Of all the answers I've received, the ones referring me to SGML publications were
the most intimidating.
If the intention is to produce a "simple dialect of SGML" for its "ease of implementation", having to retrace the historical minutiae of SGML in order to create an XML DTD seems to defeat that purpose. It seems to me that referring XML newbies (like me) to SGML publications is like asking us to eat soup with a *BULLDOZER*. :) Starve or drown. What we need is an XML spoon to properly devour this feast. Anyhow, thanks for letting me slurp! :) -Ray (another hungry peon) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Fri Jan 9 04:08:38 1998 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 16:59:50 2004 Subject: Embedding Content as Element Content or As An Attribute Value References: <3.0.3.32.19980108152945.00cd7720@pop.mindspring.com> <34B53D74.73F2@surewould.com> <34311570.326F9D07@infinet.com> <199801082302.AA25426@mailfw1.ford.com> <34B5734F.19AF736@allette.com.au> <34B58FF8.4A1A4379@pacbell.net> Message-ID: <34B5A2A5.38E3530E@allette.com.au> Ray Waldin wrote: > Of all the answers I've received, the ones referring me to SGML publications were the > most intimidating. If the intention is to > produce a "simple dialect of SGML" for its "ease of implementation", having to > retrace the historical minutiae of SGML in order to create an XML DTD seems to defeat > that purpose. The question was related to a philosophical issue that has been hashed out by SGML people for ten years now, and that applies directly to XML as well. You need not know all of the details - some books, such as Chet Ensign's 'SGML: The Billion Dollar Secret' discusses SGML from a high-level perspective. 
Nobody's going to make you memorise the white-space rules around included elements.... :-) > It seems to me that referring XML newbies (like me) to SGML publications is like > asking us to eat soup with a *BULLDOZER*. :) Don't be scared off - not everything to do with SGML is rocket science. Much of it is just XML by another name. > Starve or drown. What we need is an XML spoon to properly devour this feast. > Anyhow, thanks for letting me slurp! :) Sorry Goldilocks, such a beast takes time to make. Until then, you'll either have to make do with Baby Bear's or Papa Bear's spoon. I suggest Papa Bear's, but only fill it up half way. -- Regards Marcus Carr email: mrc@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774 _______________________________________________________________ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Fri Jan 9 06:15:14 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:59:51 2004 Subject: SAX/Java: Exceptions, Again References: <199801090220.VAA01181@unready.microstar.com> Message-ID: <34B59C6F.C842EA94@jclark.com> David Megginson wrote: > I think that I've just answered my own question about Exception > handling in the Java implementation of SAX. 
> There is no reason that
> the SAX frontend for each parser cannot pack exceptions from the
> callbacks into a container and unpack them for the top-level
> transparently -- that way, the parser can still have tight
> compile-time error checking, but application writers won't have to
> jump through hoops to throw exceptions to the top level.

The trouble with that approach is that XmlProcessor.run will have
to be declared as throwing Exception, which is horrible and will make
things ugly when application writers call run. I want XmlProcessor.run
to be declared as throwing java.io.IOException (there's no way it can be
more restrictive than this).

There's no perfect solution, but I think it would be worth considering
restricting callbacks to throwing java.io.IOException. This would allow
run to be declared as throwing java.io.IOException. This isn't much of a
burden on application callbacks: IOException is fairly broad in Java, and
the callbacks are processing input so it isn't unreasonable to restrict
the exceptions they throw to IOException. Also this wouldn't require
parser writers to catch and repack exceptions thrown by application
callbacks.

James

From ak117 at freenet.carleton.ca Fri Jan 9 11:53:17 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:51 2004
Subject: Canonical Encoding for XML Elements
In-Reply-To:
References:
Message-ID: <199801091148.GAA00436@unready.microstar.com>

Chris Smith writes:

 > In particular, do parsers keep CDATA sections distinct from
 > character data?

CDATA sections are part of the document's physical representation
rather than of its logical structure, so they would likely be reported
only by a specialised parser designed for authoring tools or
repositories. For other purposes, it doesn't matter; the following
two are exactly equivalent:

  text ]]>

  <sample>text</sample>

Switching between the two should produce exactly the same rendered
output from a formatting engine, exactly the same entries in a
database, etc. etc. SAX would report both as

  start element: example
  characters: "text"
  end element: example

(some might break the second event into several smaller ones).


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Fri Jan 9 11:54:16 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:51 2004
Subject: SAX/Java: Exceptions, Again
In-Reply-To: <02451038020876@pragmaticainc.com>
References: <199801090220.VAA01181@unready.microstar.com> <02451038020876@pragmaticainc.com>
Message-ID: <199801091141.GAA00401@unready.microstar.com>

David Ornstein writes:

 > If the parser is unable to open a file (let's say) I'm assuming that this
 > will cause a java.io.IOException to be thrown and in a fully Java system,
 > the parser might well ignore this and allow the application to catch it.
 > This means that there's no need to have any kind of return code coming out
 > of the Parse() function in the parser interface.
 > In thinking about this
 > for other languages, for languages that support exceptions, we're mandating
 > that the SAX implementations in those languages use exceptions also (since
 > there's nowhere for the return code and the return codes are not specified
 > as part of SAX). And in languages that don't support exceptions, I'm at a
 > loss to say what we'd do.

For those languages, the parser would invoke the warning() or fatal()
callbacks to report an IO problem, and your handlers could set a
status variable. For all SAX implementations, any invocation of
fatal() means that your document is probably corrupt, is not
guaranteed complete, and should not be processed (except for
error-reporting purposes).


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Fri Jan 9 11:54:25 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:51 2004
Subject: SAX/Java: Exceptions, Again
In-Reply-To: <34B59C6F.C842EA94@jclark.com>
References: <199801090220.VAA01181@unready.microstar.com> <34B59C6F.C842EA94@jclark.com>
Message-ID: <199801091136.GAA00372@unready.microstar.com>

James Clark writes:

 > The trouble with that approach is that XmlProcessor.run will have
 > to be declared as throwing Exception, which is horrible and will make
 > things ugly when application writers call run. I want XmlProcessor.run
 > to be declared as throwing java.io.IOException (there's no way it can be
 > more restrictive than this).
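
For readers following the thread, here is a minimal sketch of the
container approach under discussion. It is illustration only: every name
in it (ExceptionTunnel, Handler, Tunnel, parserCore, AppException) is
hypothetical and none of it is SAX API. It assumes a parser core whose
signature admits only java.io.IOException, and a front end that packs any
other checked exception from a callback into an unchecked container and
unpacks it at the top level.

```java
import java.io.IOException;

// Hypothetical sketch only: none of these names belong to SAX or to any parser.
class ExceptionTunnel {

    /** Stand-in for an application exception thrown from a callback. */
    public static class AppException extends Exception {
        public AppException(String message) { super(message); }
    }

    /** Stand-in callback interface; the callback may throw any checked exception. */
    public interface Handler {
        void startElement(String name) throws Exception;
    }

    /** Unchecked container that carries a callback exception through the parser core. */
    static final class Tunnel extends RuntimeException {
        Tunnel(Exception cause) { super(cause); }
    }

    /** Stand-in parser core: its signature admits only IOException. */
    static void parserCore(String[] names, Handler handler) throws IOException {
        for (String name : names) {
            try {
                handler.startElement(name);
            } catch (IOException | RuntimeException e) {
                throw e;                 // already permitted by the signature
            } catch (Exception e) {
                throw new Tunnel(e);     // pack any other checked exception
            }
        }
    }

    /** Front end: unpacks the container and rethrows the original exception. */
    public static void parse(String[] names, Handler handler) throws Exception {
        try {
            parserCore(names, handler);
        } catch (Tunnel t) {
            throw (Exception) t.getCause();
        }
    }

    public static void main(String[] args) {
        try {
            parse(new String[] {"doc"}, name -> { throw new AppException("from callback"); });
        } catch (AppException e) {
            System.out.println("caught at top level: " + e.getMessage());
        } catch (Exception e) {
            throw new Error("Unexpected exception! " + e.getMessage());
        }
    }
}
```

Note that the front-end method in this sketch still has to be declared
`throws Exception`, which is the cost raised in the quoted objection.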

Thank you again for the comments, James, and thank you for raising the
issue of exception handling in SAX in the first place.

I agree that at least java.io.IOException has to be passed through,
but I think that there are some other ones, not derived from
java.io.IOException, that will be nearly as common. For example, an
application writer could reasonably want to catch exceptions derived
from any of the following at the top level rather than in the
callbacks:

  java.awt.AWTException
  java.beans.IntrospectionException
  java.sql.SQLException

The last will be especially common, since I expect that SAX and the
JDBC will become very intimate friends.

From a pragmatic perspective, the application writer will usually know
what exceptions (if any) her callbacks actually throw, and I am
allowing a parser (or at least, its SAX front end) to throw only
java.io.IOException, so the try-catch stuff around the parser
invocation should be fairly manageable -- in this example, the user
knows that the callbacks throw MyException1 and MyException2 (the
parser may throw IOException as well):

  try {
    parser.parse(null, docUrl);
  } catch (MyException1 e) {
    /* do something */
  } catch (MyException2 e) {
    /* do something else */
  } catch (java.io.IOException e) {
    /* general */
  } catch (java.lang.Exception e) {
    // should never happen
    throw new Error("Unexpected exception! " + e.getMessage());
  }

Finally, I don't want to rule out exceptions from important new Java
libraries over the next few years.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From M.H.Kay at eng.icl.co.uk Fri Jan 9 13:57:29 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 16:59:51 2004
Subject: Embedding Content as Element Content or As An Attribute Value
Message-ID: <01bd1d06$8b974d80$1e09e391@mhklaptop.bra01.icl.co.uk>

>Ray Waldin wrote:
>> Of all the answers I've received, the ones referring me to SGML publications were the
>> most intimidating

Marcus Carr replied:
>The question was related to a philosophical issue that has been hashed out by SGML
>people for ten years now, and that applies directly to XML as well.

I think Marcus is wrong. The domain of application of SGML is different from
the domain of application of XML, and the distinction between attributes and
content which made sense in the SGML world is extremely perplexing to those
with a background in data modelling and data structure design in other
domains, who are legitimate members of the XML community. Philosophically -
at least in terms of any ontological system I am aware of - it is a
nonsense, and can be justified only in terms of pragmatic assumptions about
information in the form of paper documents. We have a very non-orthogonal
design where (as someone pointed out) you need content for some things, you
need attributes for others, in some cases you can use either, and in some
cases neither does the job very well (e.g. storing a date).

I will resign myself to accepting XML as it is, but to suggest that its
deficiencies are there because SGML gurus decided ten years ago that they
were a good thing is unhelpful and not particularly flattering to the SGML
gurus, who designed it that way for a different purpose.

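
As a rough illustration of the "you can use either" case above, the
sketch below reads the same date modelled once as an attribute and once
as element content. The <event> vocabulary and the class name are
hypothetical, and the code assumes the JAXP DOM API, which postdates this
thread; it is a sketch, not a recommendation of either form.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Illustrative only: the <event> vocabulary and this class are hypothetical.
class DateAsAttributeOrContent {

    /** Parses an XML string and returns its document element. */
    static Element root(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    /** Date modelled as an attribute: <event date="..."/>. */
    static String fromAttribute(String xml) throws Exception {
        return root(xml).getAttribute("date");
    }

    /** Date modelled as element content: <event><date>...</date></event>. */
    static String fromContent(String xml) throws Exception {
        return root(xml).getElementsByTagName("date").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fromAttribute("<event date=\"1998-01-09\"/>"));
        System.out.println(fromContent("<event><date>1998-01-09</date></event>"));
    }
}
```

Either way the application receives a plain string; nothing in the
markup itself says the value is a date, which is the "neither does the
job very well" case.
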
In the DTD I've been designing, for what it's worth, I'm currently using content for nearly everything, with very little use of attributes. The main reason is for future extensibility; elements can always acquire a richer internal structure, while attributes can't. The drawbacks (e.g. inability to specify any constraints on values, default values, etc) don't actually lose me much, because the constraints available for attributes are very limited anyway. Mike Kay, ICL M.H.Kay@eng.icl.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at ora.com Fri Jan 9 15:06:56 1998 From: crism at ora.com (Chris Maden) Date: Mon Jun 7 16:59:51 2004 Subject: Embedding Content as Element Content or As An Attribute Value In-Reply-To: <34B58FF8.4A1A4379@pacbell.net> (message from Ray Waldin on Thu, 08 Jan 1998 18:48:24 -0800) Message-ID: <199801091511.KAA24346@geode.ora.com> [Ray Waldin] > Of all the answers I've received, the ones referring me to SGML > publications were the most intimidating. If the intention is to > produce a "simple dialect of SGML" for its "ease of implementation", > having to retrace the historical minutiae of SGML in order to create > an XML DTD seems to defeat that purpose. It seems to me that > referring XML newbies (like me) to SGML publications is like asking > us to eat soup with a *BULLDOZER*. :) If there were XML publications, we'd refer you to them. As far as I know, I have *all* of them. _Presenting XML_'s first half is a good philosophical overview of the "why"s and "how"s of XML, but its second half, as well as most of _XML Complete_, is already obsolete because the specification is still changing. 
(Yes, I'm biased. By choosing to wait until the specs have stabilized to publish my book, I'll probably lose a certain amount of market share. But I'll be able to refer people to my book for technical answers with a clear conscience.) -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at ora.com Fri Jan 9 15:09:44 1998 From: crism at ora.com (Chris Maden) Date: Mon Jun 7 16:59:51 2004 Subject: Embedding Content as Element Content or As An Attribute Value In-Reply-To: <01bd1d06$8b974d80$1e09e391@mhklaptop.bra01.icl.co.uk> (M.H.Kay@eng.icl.co.uk) Message-ID: <199801091514.KAA24361@geode.ora.com> [Michael Kay] > I think Marcus is wrong. The domain of application of SGML is > different from the domain of application of XML, Only accidentally. The charter of XML is to provide a way to communicate SGML over the Web. XML is designed for documents. That it is applicable to data modeling is a happy convenience, and should not be considered a restriction on the language. > and the distinction between attributes and content which made sense > in the SGML world is extremely perplexing to those with a background > in data modelling and data structure design in other domains, who > are legitimate members of the XML community. Only accidentally. XML is for *documents*, where the distinction makes a whole lot of sense. And it *does* make sense for some kinds of data modeling: if the datum has internal structure, use subelements or mixed content; if it's a quantum, use an attribute. 
If you're not sure what it will be, use mixed content for flexibility. -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Fri Jan 9 15:18:14 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:59:51 2004 Subject: msxsl Message-ID: <199801091518.PAA02828@mail.iol.ie> Has anyone had any luck with the command line version of msxsl? I am getting "not enough memory" errors on 95 and NT. If it working for others then I have a duff exe and will go get another one. Sean Sean Mc Grath sean at digitome dot com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From north at Synopsys.COM Fri Jan 9 15:42:48 1998 From: north at Synopsys.COM (Simon North) Date: Mon Jun 7 16:59:51 2004 Subject: XML Books In-Reply-To: <199801091511.KAA24346@geode.ora.com> References: <34B58FF8.4A1A4379@pacbell.net> (message from Ray Waldin on Thu, 08 Jan 1998 18:48:24 -0800) Message-ID: <199801091530.QAA00951@cadis.de> Chris Maden wrote: > If there were XML publications, we'd refer you to them. As far as I > know, I have *all* of them. 
_Presenting XML_'s first half is a good > philosophical overview of the "why"s and "how"s of XML, but its > second half, as well as most of _XML Complete_, is already obsolete > because the specification is still changing. That's a little unfair. I wrote much of the second part of Presenting XML, and all the material about SGML is still 99% accurate. The list of applications I talked about is 90% accurate and the bibliographies and WWW pointers are still 95% accurate. The only major change is XS instead of XS. FYI, we are working on an update of the book now. "Dynamic Web Publishing" has a single chapter on XML; I condensed everything down into 22 pages, this too only covers XSL though. "HTML4 Unleashed, Professional Reference Edition" has five chapters and an appendix on XML, and there I *do* cover XS. Advertising over, I thought "XML Complete" was quite good. Not so much for the XML material but it does contain some very useful pieces of Java code (I'm not a Java programmer), and the chapters on Jade and DSSSL are still pretty good for entry material. Simon North xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at ora.com Fri Jan 9 16:07:56 1998 From: crism at ora.com (Chris Maden) Date: Mon Jun 7 16:59:51 2004 Subject: XML Books In-Reply-To: <199801091530.QAA00951@cadis.de> (north@Synopsys.COM) Message-ID: <199801091612.LAA25172@geode.ora.com> [Simon North] > That's a little unfair. I wrote much of the second part of > Presenting XML, and all the material about SGML is still 99% > accurate. The list of applications I talked about is 90% accurate > and the bibliographies and WWW pointers are still 95% accurate. 
> The only major change is XS instead of XS. FYI, we are working on an
> update of the book now.

It's true that it's pretty good. It's coherent and accurate for when
it was written, and I think (as I said on c.t.s) that you did as good
a job as you could at the time. I'm glad you have on-line errata, and
am happy to hear of an update. I still think it was a mistake to
publish technical information so soon, and the average reader will not
be able to tell what's accurate and what's not without the Web site,
and most will accept what they read (despite Tim's notes in the
preface).

> "Dynamic Web Publishing" has a single chapter on XML; I condensed
> everything down into 22 pages, this too only covers XSL though.
> "HTML4 Unleashed, Professional Reference Edition" has five chapters
> and an appendix on XML, and there I *do* cover XS.

Interesting; I did not know that. (Now I suppose I need to go buy
those, or at least read them over a mocha at Buns & Noodle.)

> Advertising over, I thought "XML Complete" was quite good. Not so
> much for the XML material but it does contain some very useful
> pieces of Java code (I'm not a Java programmer), and the chapters on
> Jade and DSSSL are still pretty good for entry material.

_XML Complete_ appears to be written by someone who is quite good at
Java, and has read the XML spec. XML (the acronym) is expanded wrong
in the book (though the back cover is right). He defines the syntax
of XML, but has no discussion of *how* to write XML - for instance,
when to use attributes vs. subelements. He believes that a Java
applet is necessary to make any document usable. He uses doubled
slashes in URLs (like file:////c://xml//idlocator//idlocator2.xml) in
the XML-link [sic] section so that the URLs can be handed straight to
Java without processing. He discusses stylesheets, but introduces tag
omission at the start of the DSSSL section, since he believes Jade can
only use full SGML.
His terminology is sloppy (consistent use of "tag" for "element"). Bad processing has mangled comment syntax into . He includes a chapter on "XML Image Handling". His modus operandi in each chapter is to introduce productions from the specification, a quick sample document that uses them, and then forty or so pages of Java to process it. And I don't believe that the author posted a single time to this list to clear up any potential misunderstandings. If it were more accurate, this could be a good way to learn how to process XML with Java. It is *not* a good way to learn XML. [Repeat disclaimer: I am also writing a book. But I would still rather that the competitors' information were accurate.] -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Fri Jan 9 16:56:48 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:51 2004 Subject: Embedding Content as Element Content or As An Attribute Value In-Reply-To: <01bd1d06$8b974d80$1e09e391@mhklaptop.bra01.icl.co.uk> References: <01bd1d06$8b974d80$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <199801091656.LAA04378@unready.microstar.com> Michael Kay writes: > I think Marcus is wrong. 
The domain of application of SGML is > different from the domain of application of XML, and the > distinction between attributes and content which made sense in the > SGML world is extremely perplexing to those with a background in > data modelling and data structure design in other domains, who are > legitimate members of the XML community. You are absolutely right. The problem is that XML uses a fundamentally different approach to data modelling than that used in the relational world, and people accustomed to mapping their data onto two-dimensional tables may have trouble getting used to the idea that XML data structures (like those in object-oriented databases) can be hierarchical, repeatable, and even recursive or circular. > In the DTD I've been designing, for what it's worth, I'm currently > using content for nearly everything, with very little use of > attributes. The main reason is for future extensibility; elements > can always acquire a richer internal structure, while attributes > can't. The drawbacks (e.g. inability to specify any constraints on > values, default values, etc) don't actually lose me much, because > the constraints available for attributes are very limited anyway. This is a good approach. Even in SQL, the built-in constraints have little use: nearly everyone needs to write middleware to enforce complex business rules that cannot be captured by the standard types and constraints; likewise, XML implementations need middleware to enforce business rules for their content, while the XML parser may validate the structure. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at light.demon.co.uk Fri Jan 9 17:12:52 1998 From: richard at light.demon.co.uk (Richard Light) Date: Mon Jun 7 16:59:51 2004 Subject: msxsl In-Reply-To: <199801091518.PAA02828@mail.iol.ie> Message-ID: In message <199801091518.PAA02828@mail.iol.ie>, Sean Mc Grath writes >Has anyone had any luck with the command line version of msxsl? I am getting >"not enough memory" errors on 95 and NT. If it working for others then >I have a duff exe and will go get another one. Yes, I downloaded it today, and it works fine on a creaking P75 with a mere 16MB of RAM. I converted a 25K XML document containing museum catalogue records into 250K of HTML, which proceeded to 'break' (or at least freeze) the browser when I asked it to expand all the markup. (However, msxsl doesn't like '.' within element names! Also, like other parsers it still lets through in the XML declaration - shouldn't we be forced to mend our ways?!) Richard. Richard Light SGML/XML and Museum Information Consultancy richard@light.demon.co.uk xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Jan 9 17:23:30 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:51 2004 Subject: msxsl Message-ID: <3.0.32.19980109092219.0099ba60@pop.intergate.bc.ca> At 04:58 PM 09/01/98 +0000, Richard Light wrote: >Also, like other >parsers it still lets through in the XML declaration - >shouldn't we be forced to mend our ways?!) Not all of them, and yes. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From north at Synopsys.COM Fri Jan 9 19:03:26 1998 From: north at Synopsys.COM (Simon North) Date: Mon Jun 7 16:59:51 2004 Subject: XML Books In-Reply-To: <199801091612.LAA25172@geode.ora.com> References: <199801091530.QAA00951@cadis.de> (north@Synopsys.COM) Message-ID: <199801091621.RAA03210@cadis.de> Chris, re: XML Complete > If it were more accurate, this could be a good way to learn how to > process XML with Java. It is *not* a good way to learn XML. Agreed. I got to the 4th sentence where he said "HTML is a subset of SGML" before I decided that technically it wasn't going to be up to much .... I will look forward very much to seeing your book. Simon. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From Jon.Bosak at eng.Sun.COM Fri Jan 9 21:39:59 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:59:51 2004
Subject: Seybold editors honor XML
Message-ID: <199801092138.NAA07971@boethius.eng.sun.com>

XML has been recognized with a 1997 Seybold Editors Award:

http://www.seyboldpubs.com/News/Awards97/editors_awards.html

To everyone in the XML community who has contributed to this effort over the past year and a half, congratulations!

Jon

From patrik at allaire.com Fri Jan 9 22:40:49 1998
From: patrik at allaire.com (Patrik Muzila)
Date: Mon Jun 7 16:59:51 2004
Subject: Extracting error infromation using MSXML.DLL parser
Message-ID: <34B6A6EE.60BDCB05@allaire.com>

I am using the C++ based MSXML parser (MSXML.DLL) from Delphi via COM. I got the parser working; however, when an error occurs I am unable to extract the error information. In my Delphi code I am trying to mimic the C++ example from the Microsoft site:

...
{
    //
    // Failed to parse stream, output error information.
    //
    IXMLError *pXMLError = NULL ;
    XML_ERROR xmle;

    hr = pDoc->QueryInterface(IID_IXMLError, (void **)&pXMLError);
    CHECK_ERROR(SUCCEEDED(hr), "Couldn't get IXMLError");
    ASSERT(pXMLError);

    hr = pXMLError->GetErrorInfo(&xmle);
    SAFERELEASE(pXMLError);
    CHECK_ERROR(SUCCEEDED(hr), "GetErrorInfo Failed");

    printf("%s: Error on line %d. Found %S while expecting %S\r\n",
           argv[0], xmle._nLine, xmle._pszFound, xmle._pszExpected);

    SysFreeString(xmle._pszFound);
    SysFreeString(xmle._pszExpected);
    SysFreeString(xmle._pchBuf);
}
...

I get stuck when trying to call the QueryInterface method, as I cannot get the IID_IXMLError GUID. Does anyone have an idea how this could be done from Delphi?

Also, could somebody from Microsoft comment on why extracting the error information is made this complicated?

Patrik Muzila
Allaire Corp.

From donpark at quake.net Sat Jan 10 03:29:19 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 16:59:51 2004
Subject: SAX/Java: Exceptions -- Lets take it out
Message-ID: <00a701bd1d77$5d2484b0$2ee044c6@donpark>

Let's remove exceptions from the picture. Exceptions are nice, but most folks don't know what to do with them other than dump them to stderr or throw up a dialog. Since there is no 'retry' in Java, exceptions are of minimal use other than being a barking dog before an earthquake. They are also causing us a headache with no pleasant solution in sight, and holding up the schedule.

I propose that:

1. SAX applications not be allowed to 'throw' exceptions. Instead, exceptions must be caught and handled within each method call (e.g. startElement).
Exceptions can be ignored if they are non-fatal, or else reported to the parser.

2. The parser can rethrow the reported exception once the application returns.

Don

-----Original Message-----
From: David Megginson
To: David Ornstein
Cc: xml-dev Mailing List
Date: Friday, January 09, 1998 4:03 AM
Subject: Re: SAX/Java: Exceptions, Again

>David Ornstein writes:
>
> > If the parser is unable to open a file (let's say) I'm assuming that this
> > will cause a java.io.IOException to be thrown and in a fully Java system,
> > the parser might well ignore this and allow the application to catch it.
> > This means that there's no need to have any kind of return code coming out
> > of the Parse() function in the parser interface. In thinking about this
> > for other languages, for languages that support exceptions, we're mandating
> > that the SAX implementations in those languages use exceptions also (since
> > there's nowhere for the return code and the return codes are not specified
> > as part of SAX). And in languages that don't support exceptions, I'm at a
> > loss to say what we'd do.
>
>For those languages, the parser would invoke the warning() or fatal()
>callbacks to report an IO problem, and your handlers could set a
>status variable. For all SAX implementations, any invocation of
>fatal() means that your document is probably corrupt, is not
>guaranteed complete, and should not be processed (except for
>error-reporting purposes).
>
>
>All the best,
>
>
>David
>
>--
>David Megginson ak117@freenet.carleton.ca
>Microstar Software Ltd. dmeggins@microstar.com
> http://home.sprynet.com/sprynet/dmeggins/
>
>xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Sat Jan 10 04:02:51 1998 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 16:59:51 2004 Subject: Embedding Content as Element Content or As An Attribute Value References: <01bd1d06$8b974d80$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <34B6F2BD.B0CB9078@allette.com.au> Michael Kay wrote: > I think Marcus is wrong. The domain of application of SGML is different from > the domain of application of XML, and the distinction between attributes and > content which made sense in the SGML world is extremely perplexing to those > with a background in data modelling and data structure design in other domains, > who are legitimate members of the XML community. I think you're making a few assumptions about how XML is going to be used - I (and probably many others in the SGML community) have every intention of re-badging SGML datasets as XML. The goal when designing an SGML dataset has always been to create something application and system independent. The fact that this can also be friendly to the web doesn't make me feel even a bit compromised. 
I will take your word for it that this is all very perplexing to those with different backgrounds, but those with their roots in SGML may also be "legitimate members of the XML community". > Philosophically - at least in terms of any ontological system I am aware of - > it is a nonsense, and can be justified only in terms of pragmatic assumptions > about information in the form of paper documents. If you're suggesting that SGML is only for paper, you're not even close. We have a number of sites where we have integrated databases with other sources and dynamically generated HTML. We may prefer to generate XML, but that has no impact on what format the source data takes. > I will resign myself to accepting XML as it is, but to suggest that its > deficiencies are there because SGML gurus decided ten years ago that they were > a good thing is unhelpful and not particularly flattering to the SGML gurus, > who designed it that way for a different purpose. I don't think you read what I said. I didn't say that that anyone decided anything was a good thing. In fact, what was being discussed has nothing to do with the standard - there is no definitive correct approach, however you may find the opinions of others who have faced similar situations useful. > In the DTD I've been designing, for what it's worth, I'm currently using > content for nearly everything, with very little use of attributes. The main > reason is for future extensibility; elements can always acquire a richer > internal structure, while attributes can't. The drawbacks (e.g. inability to > specify any constraints on values, default values, etc) don't actually lose me > much, because the constraints available for attributes are very limited anyway. The above holds perfectly true for an XML or SGML DTD. Why is it so much more relevant when you say it about XML than when someone might say it in an SGML publication? 
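The extensibility argument quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: the element and attribute names (book, author, given, family) are invented for the example and come from nobody's DTD in this thread, and Python's standard xml.etree.ElementTree module stands in for whatever processing software is actually in use.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents; every name here is invented for illustration.
FLAT = "<book><author>Jane Doe</author></book>"
RICH = "<book><author><given>Jane</given> <family>Doe</family></author></book>"
AS_ATTRIBUTE = '<book author="Jane Doe"/>'


def author_from_content(xml_text):
    """Return the author's name by joining all text found inside <author>."""
    author = ET.fromstring(xml_text).find("author")
    return "".join(author.itertext())


def author_from_attribute(xml_text):
    """Return the author's name from an attribute, which is always a flat string."""
    return ET.fromstring(xml_text).get("author")


print(author_from_content(FLAT))
print(author_from_content(RICH))
print(author_from_attribute(AS_ATTRIBUTE))
```

The reader that joins the text content keeps working when author later gains given and family children, whereas the attribute value has no internal structure to grow into. The same would hold whether the document is labelled SGML or XML.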
-- Regards Marcus Carr email: mrc@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774 _______________________________________________________________ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Sat Jan 10 11:19:24 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 16:59:51 2004 Subject: Off topic (Re: Embedding Content as Element Content or As An Attribute Value) Message-ID: <199801101123.WAA30798@jawa.chilli.net.au> > From: Michael Kay > I think Marcus is wrong. The domain of application of SGML is different from > the domain of application of XML, and the distinction between attributes > and content which made sense in the SGML world is extremely perplexing to > those with a background in data modelling and data structure design in other > domains, who are legitimate members of the XML community. Philosophically - at > least in terms of any ontological system I am aware of - it is a nonsense, > and can be justified only in terms of pragmatic assumptions about > information in the form of paper documents. An efficient language will provide not only a modeling syntax but also contractions. For example, most character sets have precomposed characters (i.e. the base character with its accent) for the most common accented characters of the locale, as well as or rather than factoring all the accents our into separate non-spacing character codes. 
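A minimal sketch of the precomposed-versus-combining point, assuming Unicode as the character set (the message does not name one) and using Python's standard unicodedata module:

```python
import unicodedata

# Two spellings of the same accented letter.
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT (a non-spacing code)
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE (the contraction)

# NFC composes base + accent into the precomposed form where one exists;
# NFD factors the precomposed form back into base + accent.
composed = unicodedata.normalize("NFC", decomposed)
expanded = unicodedata.normalize("NFD", precomposed)

print(len(decomposed), len(precomposed))
print(composed == precomposed, expanded == decomposed)
```

Both spellings carry the same information; the precomposed one is simply the shorter form provided for a common case.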
Attributes in the main represent such a contraction mechanism. The designer of a document type may also decide to use elements for the most interesting structure (as far as that designer is concerned) and use attributes for other decorations. Attributes can have guaranteed unique names as their values (IDs), which is not available for elements. None of these three things (contraction, decoration, uniqueness) have much if anything to do with the metaphysics of the thing modeled or the nature of paper publications. I doubt any of these things will actually perplex good philosophers or language designers. > In the DTD I've been designing, for what it's worth, I'm currently using > content for nearly everything, with very little use of attributes. > The main reason is for future extensibility; elements can always acquire > a richer internal structure, while attributes can't. This is a good reason. But not everyone has your documents or needs or bent of mind. Let 1000 flowers blossom. The only time I investigated DTDs which did not use any attributes, it turned out to be because the processing software of the company involved did not support them easily: it was a (slack) pragmatic reason. > The drawbacks (e.g. inability to specify any constraints on values, default > values, etc) don't actually lose me much, because the constraints available > for attributes are very limited anyway. SGML was revised last year to allow explicit arbitrary typing of attributes. I hope XML will get this too sometime. Rick Jelliffe xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From digitome at iol.ie Sat Jan 10 13:09:18 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:59:51 2004
Subject: Off topic (Re: Embedding Content as Element Content or As An Attribute Value)
Message-ID: <199801101309.NAA19357@GPO.iol.ie>

>An efficient language will provide not only a modeling syntax but also contractions.
...

In the February issue of Dr. Dobb's there is a fascinating interview with Larry Wall, creator of Perl. Being half linguist, half software genius, he has some interesting things to say about the way languages evolve and the tension between ease of expression and language syntax. I found a resonance between what he was saying and the attribute/element content chestnut in SGML/XML.

BTW, there is an article on XML by yours truly in the same issue.

Sean

Sean Mc Grath sean@digitome.com
Digitome Electronic Publishing http://www.digitome.com

From neil at bradley.co.uk Sat Jan 10 18:28:42 1998
From: neil at bradley.co.uk (Neil Bradley)
Date: Mon Jun 7 16:59:51 2004
Subject: XML Link Questions (again)
Message-ID: <199801101828.SAA18982@andromeda.ndirect.co.uk>

I posted the following before, but got no response. Perhaps now that the XML syntax spec is finished, linking issues will get more attention.

..............................
Having read the XML Link specification, I have a few questions. In both the XML Language and XML Link schemes, a target element can be identified by its ID value. Is it possible using XML Link to target an element in a document that does not have a DTD, and if so, how is the target attribute identified, by a fixed attribute name of 'Id'? Does STRING(1,'testing',0) only select the first character of 'testing', or the whole word. If the first character, can DITTO() be used to specify a range from 't' to 'g', and is DITTO() assumed to start from the enclosing element or from the first character of testing, in which case can DITTO() actually find it? What impact does case-sensitivity have on the default and replacement attribute names and values. Is 'HREF' or 'href' the default resource locator name, and must 'XML-ATTRIBUTES' (or 'xml-attributes') contain case-sensitive values ('HREF TARGET TITLE REFTITLE' or 'href target title RefTitle'). ................................ Neil. ----------------------------------------------- Neil Bradley - Author of The Concise SGML Companion. neil@bradley.co.uk www.bradley.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sat Jan 10 18:49:33 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:51 2004 Subject: XML Link Questions (again) Message-ID: <3.0.32.19980110104626.0099b900@pop.intergate.bc.ca> At 02:52 PM 10/01/98 +0000, Neil Bradley wrote: >In both the XML Language and XML Link schemes, a target element can be >identified by its ID value. 
Is it possible using XML Link to target an >element in a document that does not have a DTD, and if so, how is the >target attribute identified, by a fixed attribute name of 'Id'? There's no way to know something is an ID element without access to a DTD or 'inside knowledge' about that particular document type. This is a problem. >Does STRING(1,'testing',0) only select the first character of >'testing', or the whole word. If the first character, can DITTO() be >used to specify a range from 't' to 'g', and is DITTO() assumed to >start from the enclosing element or from the first character of >testing, in which case can DITTO() actually find it? It indicates the *point* in the document where the 'T' starts; it does not select any characters. >What impact does case-sensitivity have on the default and replacement >attribute names and values. Is 'HREF' or 'href' the default resource >locator name, and must 'XML-ATTRIBUTES' (or 'xml-attributes') contain >case-sensitive values ('HREF TARGET TITLE REFTITLE' or 'href target >title RefTitle'). XML is case-sensitive. The most recent Link draft doesn't reflect this fact. It is highly probable that we'll end up with lower-case throughout. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tadmc at metronet.com Sun Jan 11 02:26:19 1998 From: tadmc at metronet.com (Tad McClellan) Date: Mon Jun 7 16:59:51 2004 Subject: Off topic (Re: Embedding Content as Element Content or As An Attribute Value) In-Reply-To: <199801101309.NAA19357@GPO.iol.ie> from "Sean Mc Grath" at Jan 10, 98 01:38:21 pm Message-ID: <199801110158.TAA01548@metronet.com> A non-text attachment was scrubbed... 
Name: not available Type: text Size: 1161 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980111/3478565c/attachment.bat From Jon.Bosak at eng.Sun.COM Sun Jan 11 05:58:08 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:59:51 2004 Subject: XML conference deadline extended Message-ID: <199801110556.VAA08719@boethius.eng.sun.com> I've been asked to inform this group that the deadline previously set for papers to be presented at the March XML Conference in Seattle has been extended to this Thursday, January 15. See http://www.gca.org/conf/xmlcon97/ for more information about the conference. Note that this refers to presentations during the main part of the conference, which runs from March 24-26. There is also an XML Developers' Day scheduled for Friday, March 27, which I will be chairing as a separate track of its own for hard-core technical demonstrations of work in progress, especially projects intended for the public domain. I will be putting out a separate call for Dev Day contributions much later -- probably around the middle of February. Jon xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dvp4c at jefferson.village.virginia.edu Sun Jan 11 14:46:42 1998 From: dvp4c at jefferson.village.virginia.edu (Daniel Pitti) Date: Mon Jun 7 16:59:51 2004 Subject: MSXML 1.8 December Release question Message-ID: <3.0.1.32.19980111094444.00703190@jefferson.village.virginia.edu> I have installed MSXML 1.8 over an earlier working version and am getting the following error message when running it from command line: "ERROR: java.lang.NoSuchMethodError: com/ms/xml/om/Document: method setLoadExternal(Z)V not found" Has anyone else encountered this error and successfully dealt with it? I am running IE4 on a Windows95 platform. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sun Jan 11 15:33:40 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:51 2004 Subject: Embedding Content as Element Content or As An Attribute Value References: <01bd1d06$8b974d80$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <34B85B31.AE852BB6@technologist.com> Michael Kay wrote: > > I think Marcus is wrong. 
The domain of application of SGML is different > from the domain of application of XML, and the distinction between > attributes and content which made sense in the SGML world is extremely > perplexing to those with a background in data modelling and data > structure design in other domains, who are legitimate members of > the XML community. Philosophically - at least in terms of any > ontological system I am aware of it is a nonsense, and can be > justified only in terms of pragmatic assumptions about information in > the form of paper documents. The elements and attributes distinction is indeed driven by pragmatics, but those pragmatics have nothing to do with print. Members of the database community have asked us to strengthen the expressive power of attributes so that they can be used to more accurately model attribute of an object in an OOP system. I'm surprised, though that you would claim that the distinction between "hasproperty" and "containsobject" is missing in every ontological system you are aware of. Clearly attributes can be used to model that distinction (for applications where it is relevant). Anyhow, I'm curious why you think that the domain of XML is that different from SGML. XML is SGML for the Web. > In the DTD I've been designing, for what it's worth, I'm currently > using content for nearly everything, with very little use of > attributes. The main reason is for future extensibility; > elements can always acquire a richer internal structure, while > attributes can't. The drawbacks (e.g. inability to specify any > constraints on values, default values, etc) don't actually lose me much, > because the constraints available for attributes are very limited anyway. You've missed the point that attributes have an important feature w.r.t extensibility. An unknown attribute just disappears in processing, since they are named by role. 
In every XML/SGML processing system I am aware of (including XML DTDs, DSSSL, XSL and probably SAX), it is harder to handle unknown elements because their GI could represent either their role or their object type. If it is an unknown role, it is probably safe to ignore them, but if it is an unknown object type "filling in for" another object type in the same role, then you should flag an error or lookup a handler or do something else. ... ... I can add a role without harming anything: ... ... ... Older processing software can safely ignore URLs. But if I change an object type things break: ... ... ... How do I now process this? The old software depended on the DATE being available. What the database people have asked for (and what I also want) is the ability to have attributes which retain their distinction between role and type, but also can have tree internal structure. This gives us the best of both worlds, as well as the ability to choose elements or attributes based on ontological systems and not pragmatics. Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Sun Jan 11 19:54:59 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:59:51 2004 Subject: Embedding Content as Element Content or As An Attribute Value Message-ID: <199801111954.TAA32451@GPO.iol.ie> >Michael Kay wrote: >> the XML community. 
Philosophically - at least in terms of any
>> ontological system I am aware of it is a nonsense....

[Paul Prescod]
>
>You've missed the point that attributes have an important feature w.r.t
>extensibility. An unknown attribute just disappears in processing, since
>they are named by role. In every XML/SGML processing system I am aware
>of (including XML DTDs, DSSSL, XSL and probably SAX), it is harder to
>handle unknown elements because their GI could represent either their
>role or their object type.

Attributes have a couple of other "attributes" that feature in the list of pragmatics.

1) An attribute is associated with an element type and typically communicated as such by parsers. Thus you do not need to establish a context in processing software: one less piece of state-space. Say element types foo and bar have an attribute "n" and we wish to print out its value for foo elements only. We have this (in pseudo-something):

    element foo {
        print attributes["n"]
    }

Rather than this:

    element n {
        if parent == "foo"
            print GetDataDescendants()
    }

2) Attributes can be added without changing the instance. This is a sort of "out of line linking" that allows new layers of meaning to be added to elements, perhaps years after the content has been created. Although this approach has its limits, it can be hugely useful, as Eliot Kimber et al. have demonstrated on this list many times. It will be possible, I imagine, to do this sort of on-the-fly layering with sub-elements rather than attributes in, say, XMLData. But the attribute approach just feels right somehow :-)

Final comment: I think the most exasperating thing for classically trained data modellers approaching SGML/XML is that a lot of the established terminology of the field pops up meaning completely different things, viz. Entity/Attribute modelling. Argh!

Sean Mc Grath sean@digitome.com
Digitome Electronic Publishing http://www.digitome.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elm at arbortext.com Sun Jan 11 22:09:56 1998 From: elm at arbortext.com (Eve L. Maler) Date: Mon Jun 7 16:59:51 2004 Subject: XML Link Questions (again) Message-ID: <3.0.5.32.19980111171049.0095fb30@village.doctools.com> (I apologize if this is a re-send. I keep getting bounce messages from xml-dev, so I'm not sure if this has gone through.) >At 02:52 PM 10/01/98 +0000, Neil Bradley wrote: >>In both the XML Language and XML Link schemes, a target element can be >>identified by its ID value. Is it possible using XML Link to target an >>element in a document that does not have a DTD, and if so, how is the >>target attribute identified, by a fixed attribute name of 'Id'? > >There's no way to know something is an ID element without access to >a DTD or 'inside knowledge' about that particular document type. >This is a problem. There is an issue in the XLL Issues List (at , which requires a W3C member password) on this topic. We'll be returning to XLL work Real Soon Now. >>Does STRING(1,'testing',0) only select the first character of >>'testing', or the whole word. If the first character, can DITTO() be >>used to specify a range from 't' to 'g', and is DITTO() assumed to >>start from the enclosing element or from the first character of >>testing, in which case can DITTO() actually find it? > >It indicates the *point* in the document where the 'T' starts; it >does not select any characters. Actually, the current understanding of the editors is that STRING selects a single character, and that DITTO cannot be used usefully to select a multiple-character string. The XLL Issues List also has an issue about this, with some suggestions. 
Basically, STRING currently isn't optimized to select a whole string, which is a likely thing to want to do. My XML tutorial at SGML/XML '97 covered the state of the XLL and XPointer art in some detail (though it will no doubt be obsoleted soon). If you want to check out my slides (currently PowerPoint '97 is what's available), you can get them from the XML Resources area of . >>What impact does case-sensitivity have on the default and replacement >>attribute names and values. Is 'HREF' or 'href' the default resource >>locator name, and must 'XML-ATTRIBUTES' (or 'xml-attributes') contain >>case-sensitive values ('HREF TARGET TITLE REFTITLE' or 'href target >>title RefTitle'). > >XML is case-sensitive. The most recent Link draft doesn't reflect this >fact. It is highly probable that we'll end up with lower-case throughout. I believe that, XLL being an *application* of XML (that is, an XML-based markup language), it can make its own application conventions about what must be case-sensitive and what can be case-insensitive. We do need to decide this for XLL, and I agree with Tim that we're likely to end up with case-sensitive lowercase. Eve xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Mon Jan 12 09:08:13 1998 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 16:59:51 2004 Subject: XML Link Questions (again) In-Reply-To: "Eve L. Maler"'s message of Sun, 11 Jan 1998 17:10:49 -0500 References: <3.0.5.32.19980111171049.0095fb30@village.doctools.com> Message-ID: "Eve L. 
Maler" writes: > >At 02:52 PM 10/01/98 +0000, Neil Bradley wrote: > >>In both the XML Language and XML Link schemes, a target element can be > >>identified by its ID value. Is it possible using XML Link to target an > >>element in a document that does not have a DTD, and if so, how is the > >>target attribute identified, by a fixed attribute name of 'Id'? > > > Tim Bray answered: > >There's no way to know something is an ID element without access to > >a DTD or 'inside knowledge' about that particular document type. > >This is a problem. > > There is an issue in the XLL Issues List (at > , which > requires a W3C member password) on this topic. We'll be returning > to XLL work Real Soon Now. Note that the XSL proposal addressed this issue by providing a means in style sheets for identifying ID attributes in target documents. A similar mechanism might make sense for XLL. ht xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Mon Jan 12 11:34:38 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 16:59:51 2004 Subject: Embedding Content as Element Content or As An Attribute Value Message-ID: <01bd1f4e$18355360$1e09e391@mhklaptop.bra01.icl.co.uk> -----Original Message----- From: Paul Prescod To: xml-dev@ic.ac.uk Date: 11 January 1998 15:50 Subject: Re: Embedding Content as Element Content or As An Attribute Value Paul Prescod wrote, inter alia: >I'm surprised, though that you would claim that the distinction between >"hasproperty" and "containsobject" is missing in every ontological >system you are aware of. 
That distinction is certainly one of the traditional difficulties of all data modelling, and thanks, yes, it's helpful to see it in those terms. But I don't think the XML/SGML distinction between attributes and content is quite the same. If we consider the difference between: versus Ham and Mushroom then the distinction is not between an attribute and a contained object, but between "ordinary" attributes and one "special" attribute which we call the content of the object. The ontological systems I am aware of do not treat one of the attributes of an object as being special in this way. In fact the representation I am using is more like: $12.00 Ham and mushroom In other words, I am treating all the attributes as "contained objects", and the reason I am doing this is that in my particular domain, some of the attributes may carry additional information (sometimes called "facets") e.g. where did this information come from, how reliable is it, and when did it last change. I apologize for my faux-pas in the suggestion that the scope of application of XML was wider than that of SGML. I had forgotten, of course, that to SGML insiders SGML is applicable to everything - it is only outsiders who draw boundaries around it :-) PS: can I ask for some advice? In my DTD for the above, I want to say that the element PIZZA has no immediate character content other than ignorable white space. I feel sure it must be possible to say this, but I haven't found out how. The alternative seems to be disallow character content entirely, and generate documents using a layout such as: $12.00Ham and mushroom but I find it hard to believe this is what the designers intended. Mike Kay xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Jan 12 12:00:54 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:51 2004 Subject: Embedding Content as Element Content or As An Attribute Value In-Reply-To: <01bd1f4e$18355360$1e09e391@mhklaptop.bra01.icl.co.uk> References: <01bd1f4e$18355360$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <199801121200.HAA00278@unready.microstar.com> Michael Kay writes: > In fact the representation I am using is more like: > > $12.00 > Ham and mushroom > [...] > PS: can I ask for some advice? In my DTD for the above, I want > to say that the element PIZZA has no immediate character > content other than ignorable white space. I feel sure it must > be possible to say this, but I haven't found out how. Ignorable whitespace is always allowed -- that's why it's useful for an API to distinguish it from regular character content when there's a DTD available. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Mon Jan 12 13:59:25 1998 From: ht at cogsci.ed.ac.uk (Henry S. 
Thompson) Date: Mon Jun 7 16:59:52 2004 Subject: New release of xslj: XSL to augmented DSSSL translator Message-ID: <10726.199801121359@naomi.cogsci.ed.ac.uk> What I hope will be the final beta release of XSLJ is now available. XSLJ is my translator from the XML style language proposed in 'A Proposal for XSL' to that augmented version of DSSSL which is supported by the test release of JADE. This release incorporates a number of minor bug fixes and a small increase in conformance to the proposal (mixed content is now allowed in style sheet 'actions'). See http://www.ltg.ed.ac.uk/~ht/xslj.html for more information and downloading. ht -- Henry S. Thompson, Human Communication Research Centre, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440 Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk URL: http://www.cogsci.ed.ac.uk/~ht/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Jan 12 14:53:14 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:52 2004 Subject: Announcement: SAX 1998-01-12 Draft Message-ID: <199801121452.JAA00397@unready.microstar.com> I am happy to announce the first draft of SAX, the Simple API for XML, together with a Java reference implementation and drivers for the major Java-based XML parsers. SAX is a simple, common, event-based API for XML parsers written in object-oriented languages like Java, C++, or Perl5 (the reference implementation is in Java). 
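As a rough illustration of the event-based style described here, the sketch below uses the SAX binding that later shipped in Python's standard library (xml.sax), not the 1998 Java draft, so the handler and method names belong to that later API and are not taken from the draft. The ElementCounter class, the count_elements helper and the sample document are made up for the example.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts start-element events and collects text."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.text = []

    def startElement(self, name, attrs):
        # Called once per start tag; no tree is built or kept in memory.
        self.counts[name] = self.counts.get(name, 0) + 1

    def characters(self, content):
        # Character data may arrive in several chunks, so accumulate it.
        self.text.append(content)


def count_elements(xml_bytes):
    """Parse xml_bytes and return (element counts, concatenated text)."""
    handler = ElementCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.counts, "".join(handler.text)


# Made-up sample document for the sketch.
SAMPLE = b"<menu><pizza>Ham</pizza><pizza>Mushroom</pizza></menu>"

if __name__ == "__main__":
    counts, text = count_elements(SAMPLE)
    print(counts)
    print(text)
```

The application only implements callbacks; which parser sits behind them is a separate choice, which is the property the announcement is describing.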
SAX is similar in philosophy to JavaSoft's JDBC -- it allows you to write an application once, then plug in any XML parser that has a SAX driver, just as the JDBC allows you to plug in any SQL database that has a JDBC driver. The SAX API was developed collaboratively during a month of discussion on the XML-DEV mailing list. As an event-based interface, SAX is complementary to the proposed (tree-based) Document Object Model interface; in fact, it should be possible to implement a basic DOM interface on top of SAX, or a basic SAX interface on top of DOM. Event-based interfaces provide very simple, low-level access to parsing events, without straining system resources. For SAX documentation, a draft spec, a reference implementation of the SAX interfaces in Java, SAX front-end drivers for the major Java XML parsers (NXP, Lark, MSXML, and Ælfred), and a sample SAX application, please see http://www.microstar.com/XML/SAX/ I would like people to play with this for a month or two, during which time I'll collect suggestions and bug reports; after that, with luck, we can come up with a final draft. I may continue to work on the SAX drivers during that time, but I want to leave the rest alone for a while. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From M.H.Kay at eng.icl.co.uk Mon Jan 12 17:21:46 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 16:59:52 2004
Subject: Announcement: SAX 1998-01-12 Draft
Message-ID: <01bd1f7e$91914900$1e09e391@mhklaptop.bra01.icl.co.uk>

David Megginson wrote:
>I am happy to announce the first draft of SAX, the Simple API for XML,
>together with a Java reference implementation and drivers for the
>major Java-based XML parsers.

Looks good. My first attempts to get the demo app to run failed saying:

SAX parser class com.microstar.sax.AElfredDrivercannot be loaded.

I fixed the problem by putting the relevant directory on DevClassPath as well as ClassPath. This is using the IE4 Java VM under Win95. I extracted four downloads (AElfred beta 0.5, SAX itself, the SAX drivers, and the demo app) into the same directory c:\aelfred5 and put that on the ClassPath and DevClassPath using regedit.

My next problem was in the filename handling. This is what happens:

C:\aelfred5>jview SAXdemo com.microstar.sax.AElfredDriver mydoc.xml
Start document
Resolving external entity: file://localhostC:\aelfred5/mydoc.xml
ERROR: java.io.FileNotFoundException: ftp://localhostC/mydoc.xml

I fixed this by changing "localhost" to "localhost/" in the makeAbsoluteURL method, but I don't expect this fix is portable.

I don't expect much difficulty converting my two AElfred apps. I had used the "isSpecified()" functionality to distinguish between specified and defaulted attributes, but I can live without it if I have to.

Regards, and congratulations on more excellent software,
Mike Kay

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From davidsch at microsoft.com Mon Jan 12 17:27:27 1998 From: davidsch at microsoft.com (David Schach) Date: Mon Jun 7 16:59:52 2004 Subject: Extracting error infromation using MSXML.DLL parser Message-ID: <5CEA8663F24DD111A96100805FFE658702B89CEB@red-msg-51.dns.microsoft.com> The IID is in the SDK. It's defined in the same file as IID_IXMLDocument. > -----Original Message----- > From: Patrik Muzila [SMTP:patrik@allaire.com] > Sent: Friday, January 09, 1998 2:39 PM > To: xml-dev@ic.ac.uk > Subject: Extracting error infromation using MSXML.DLL parser > > I am using the C++ based MSXML parser (MSXML.DLL) from Delphi using > COM. I got the parser working, however when an error occurs I am > unable to extract the error infromation. In my Delphi code I am trying > to mimic the C++ example from the Microsoft site : > > ... > > { > // > // Failed to parse stream, output error information. > // > IXMLError *pXMLError = NULL ; > XML_ERROR xmle; > > hr = pDoc->QueryInterface(IID_IXMLError, (void **)&pXMLError); > CHECK_ERROR(SUCCEEDED(hr), "Couldn't get IXMLError"); > > ASSERT(pXMLError); > > hr = pXMLError->GetErrorInfo(&xmle); > SAFERELEASE(pXMLError); > CHECK_ERROR(SUCCEEDED(hr), "GetErrorInfo Failed"); > > printf("%s: Error on line %d. Found %S while expecting %S\r\n", > argv[0], > xmle._nLine, > xmle._pszFound, > xmle._pszExpected); > > SysFreeString(xmle._pszFound); > SysFreeString(xmle._pszExpected); > SysFreeString(xmle._pchBuf); > } > > ... > > I get stuck when trying to call the QueryInteface method as I cannot get > the IID_IXMLError GUID. Is there anyone out there who has an idea how > could it be done from Delphi? 
Also, could somebody from Microsoft > comment on why is the extraction of the error informaton made this > comlicated ? > > Patrik Muzila > Allaire Corp. > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Mon Jan 12 19:22:55 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 16:59:52 2004 Subject: Announcement: SAX 1998-01-12 Draft References: <199801121452.JAA00397@unready.microstar.com> Message-ID: <34332965.4A923755@infinet.com> David Megginson wrote: > I am happy to announce the first draft of SAX, the Simple API for XML, > together with a Java reference implementation and drivers for the > major Java-based XML parsers. > > SAX is a simple, common, event-based API for XML parsers written in > object-oriented languages like Java, C++, or Perl5 (the reference > implementation is in Java). SAX is similar in philosophy to > JavaSoft's JDBC -- it allows you to write an application once, then > plug in any XML parser that has a SAX driver, just as the JDBC allows > you to plug in any SQL database that has a JDBC driver. The SAX API > was developed collaboratively during a month of discussion on the > XML-DEV mailing list. 
>
> As an event-based interface, SAX is complementary to the proposed
> (tree-based) Document Object Model interface; in fact, it should be
> possible to implement a basic DOM interface on top of SAX, or a basic
> SAX interface on top of DOM. Event-based interfaces provide very
> simple, low-level access to parsing events, without straining system
> resources.
>
> For SAX documentation, a draft spec, a reference implementation of the
> SAX interfaces in Java, SAX front-end drivers for the major Java XML
> parsers (NXP, Lark, MSXML, and Ælfred), and a sample SAX application,
> please see
>
> http://www.microstar.com/XML/SAX/
>
> I would like people to play with this for a month or two, during which
> time I'll collect suggestions and bug reports; after that, with luck,
> we can come up with a final draft. I may continue to work on the SAX
> drivers during that time, but I want to leave the rest alone for a
> while.
>
> All the best,
>
> David

In an hour I quickly did my best to map the initial SAX draft to CORBA 2.0 IDL, as past discussion on an IDL form of SAX on this mailing list seemed to generate interest in the idea. The mapping is not exact, and it also has some serious design flaws as far as distributed computing is concerned, especially since it is event based and may generate a lot of remote invocations via the callbacks to the client application where the server is the XMLProcessor. Returning some sort of tree based "struct" structure of the XML probably would be a much more scalable solution for large documents.

Nonetheless, this is a simple attempt at mapping SAX-J to IDL. Any comments would be greatly appreciated.

Thanx,

Tyler

// This is an initial attempt to map the current SAX-J draft to CORBA 2.0
// IDL. The motivation for this is that many people may want to do
// their XML processing on a remote server, rather than with the client,
// especially if the client is a thin NC or some other computing device.
// Most of the mappings are essentially identical; the only real changes
// are that java.lang.Exception is mapped to an exception called
// XMLException and that AttributeMap is mapped to a sequence of structs
// called Attributes. The reason for this is that in CORBA the only way
// you can pass things by value is using structs. I think it is a good
// idea to have this information returned in a struct rather than a
// CORBA Object. I used Visigenic's idl2java compiler to check that the
// IDL was syntactically correct. You can also use SUN's IDL2Java
// compiler, which will generate identical java interfaces, but
// different stub classes as well as helper and holder classes.

module org {
  module xml {
    module sax {

      typedef sequence<wchar> Chars;

      exception XMLException {};

      struct Attribute {
        wstring name;
        wstring value;
        boolean entity;
        boolean notation;
        boolean id;
        boolean idRef;
        wstring entityPublicID;
        wstring entitySystemID;
        wstring notationNameID;
        wstring notationPublicID;
        wstring notationSystemID;
      };

      typedef sequence<Attribute> Attributes;

      interface EntityHandler {
        wstring resolveEntity(in wstring ename, in wstring publicID, in wstring systemID) raises(XMLException);
        void changeEntity(in wstring systemID) raises(XMLException);
      };

      interface DocumentHandler {
        void startDocument() raises(XMLException);
        void endDocument() raises(XMLException);
        void docType(in wstring name, in wstring publicID, in wstring systemID) raises(XMLException);
        void startElement(in wstring name, in Attributes attributes) raises(XMLException);
        void endElement(in wstring name) raises(XMLException);
        // It would be more straightforward if "char[] ch" were instead "String s"
        void characters(in Chars ch, in long start, in long length) raises(XMLException);
        // It would be more straightforward if "char[] ch" were instead "String s"
        void ignorable(in Chars ch, in long start, in long length) raises(XMLException);
        void processingInstruction(in wstring name, in wstring remainder) raises(XMLException);
      };

      interface ErrorHandler {
        void warning(in wstring message, in wstring systemID, in long line, in long column) raises(XMLException);
        void fatal(in wstring message, in wstring systemID, in long line, in long column) raises(XMLException);
      };

      interface Parser {
        void setEntityHandler(in EntityHandler handler);
        void setDocumentHandler(in DocumentHandler handler);
        void setErrorHandler(in ErrorHandler handler);
        void parse(in wstring publicID, in wstring systemID) raises(XMLException);
      };

    };
  };
};

From peter at ursus.demon.co.uk Mon Jan 12 20:30:48 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:52 2004
Subject: Announcement: SAX 1998-01-12 Draft
In-Reply-To: <199801121452.JAA00397@unready.microstar.com>
Message-ID: <3.0.1.16.19980112202346.3847a980@pop3.demon.co.uk>

At 09:52 12/01/98 -0500, David Megginson wrote:
>I am happy to announce the first draft of SAX, the Simple API for XML,
>together with a Java reference implementation and drivers for the
>major Java-based XML parsers.

I am writing this before the download - for which I am prepared to forgo my tea - it sounds so exciting. Many thanks to David for this work.

NOW... The onus is on us to make sure that we capitalise on this. PLEASE use it. Personally I would much rather have ANY feedback rather than none. We've had contributions to the list stressing how valuable this parser work has been - let's show this in practice.

P.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Mon Jan 12 20:48:29 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:52 2004 Subject: Embedding Content as Element Content or As An Attribute Value In-Reply-To: <199801121200.HAA00278@unready.microstar.com> References: <01bd1f4e$18355360$1e09e391@mhklaptop.bra01.icl.co.uk> <01bd1f4e$18355360$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <3.0.1.16.19980112204127.3527e866@pop3.demon.co.uk> At 07:00 12/01/98 -0500, [many people] wrote: [... extended discussion of the attribute vs. content saga... a frequent theme on *ML discussion groups :-)] My reading of the three X*L specs is that - on the whole - there is no syntactic reasons why one approach should be followed rather than another. I approve of this. However there appear to be the following cases where the X*L *syntax* favours developers who use one method rather than another. In XML the attributes xml:lang and xml:space are *inherited*. The intention is that all DESCENDANTs of ELEMENTs with such attributes behave as if those attributes were present with the same name and value. The spec is deliciously Delphic on the *mechanism* of inheritance. However, since the same approach is required for XLL implementers to follow, there is (IMO) considerable pressure for *application* authors (not parser authors) to develop an inheritance tool. Once developed, this tool is then available for other attributes not hardcoded into the spec. Personally I would very much favour guidance from the WG as to how they see such inheritance being implemented, as it could perhaps unintentionally lead to somewhat different semantics. 
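One way an application might resolve an inherited attribute without copying it into every descendant is to walk up the ancestors at lookup time. The sketch below does this with Python's xml.dom.minidom; it is only one possible reading of "inheritance", the helper name inherited_attribute and the sample document are invented for the example, and nothing here comes from the XML or XLL drafts themselves.

```python
from xml.dom import minidom


def inherited_attribute(element, name):
    """Return the nearest value of `name` on element or an ancestor, else None."""
    node = element
    # Stop when we run off the top of the element tree (the Document node).
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute(name):
            return node.getAttribute(name)
        node = node.parentNode
    return None


# Made-up sample: the second <p> overrides the language set on <doc>.
SAMPLE = (
    '<doc xml:lang="en">'
    '<sec><p>hello</p><p xml:lang="fr">salut</p></sec>'
    "</doc>"
)

if __name__ == "__main__":
    doc = minidom.parseString(SAMPLE)
    for p in doc.getElementsByTagName("p"):
        print(p.firstChild.data, inherited_attribute(p, "xml:lang"))
```

Because the value is computed on demand, a fragment that is cut and pasted elsewhere silently picks up the language of its new ancestors, which is exactly the kind of semantic difference worth pinning down.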
My current assumption is that the attributes are not cloned into the DESCENDANTs, but that each descendant may have to be smart enough to ask its grandma whether she has got any exciting attributes. This may be non-trivial, especially where documents are being cut-and-pasted... [Personally I hate things I can't see - and have said so :-)].

The other main area is that XLL-TEI defines substring analysis of mixed content but NOT attribute values. Thus if you know you may want to search for a substring in a document it may be valuable to use a mechanism that casts it into content rather than an attribute value. Example: XML books allows one to search for 'book' in "XML books" but not in "bookshop". [For newcomers, you can search for an attribute with the value "bookshop", and all searches are case-sensitive.]

The XSL spec seems to be very powerful for both content and attributes, and I think both are well catered for. [Of course it also defines a certain number of hardcodable attributes such as ID, CLASS, etc.]

Remember that XLL and XML are still fluidish. If I have missed anything else in the specs I'd be grateful...

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 12 20:50:45 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:52 2004 Subject: Embedding Content as Element Content or As An Attribute Value References: <01bd1f4e$18355360$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <34BA36FA.4B836102@technologist.com> Michael Kay wrote: > Paul Prescod wrote, inter alia: > >I'm surprised, though that you would claim that the distinction between > >"hasproperty" and "containsobject" is missing in every ontological > >system you are aware of. > > That distinction is certainly one of the traditional difficulties of all > data modelling, and thanks, yes, it's helpful to see it in those terms. > But I don't think the XML/SGML distinction between attributes and > content is quite the same. If we consider the difference between: > > > versus > Ham and Mushroom True enough, SGML does not require you to use its features to model this (or any) ontological distinction. It does not even encourage you to do so. My point was just that attributes often allow you to do so, if it makes sense in your problem domain. Unfortunately the fact that they cannot have sub-structure often constrains their use in this way. > The alternative seems to be disallow character content entirely, As David M. pointed out, XML does have the concept of ignorable whitespace and it was intended to solve precisely this problem. 
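For readers who want to see the effect in practice, here is a small sketch of dropping whitespace that is ignorable because the parent is declared with element-only content. It assumes the application already knows, for example from the DTD, which element types have element content; the sketch does not rely on the parser to flag ignorable whitespace. PIZZA is the element name from the question; the child names PRICE and TOPPING, the ELEMENT_CONTENT_TYPES set and the helper are hypothetical.

```python
from xml.dom import minidom

# Assumption: element types known (e.g. from the DTD) to have element-only
# content. PRICE and TOPPING below are hypothetical child element names.
ELEMENT_CONTENT_TYPES = {"PIZZA"}


def strip_ignorable_whitespace(element):
    """Remove whitespace-only text children of element-content elements."""
    for child in list(element.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if element.tagName in ELEMENT_CONTENT_TYPES and not child.data.strip():
                element.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_ignorable_whitespace(child)


# Pretty-printed layout: the line breaks and indentation are not data.
SAMPLE = """<PIZZA>
  <PRICE>$12.00</PRICE>
  <TOPPING>Ham and mushroom</TOPPING>
</PIZZA>"""

if __name__ == "__main__":
    doc = minidom.parseString(SAMPLE)
    root = doc.documentElement
    strip_ignorable_whitespace(root)
    print([child.tagName for child in root.childNodes])
```

Text inside PRICE and TOPPING is left alone, since those elements are not listed as having element-only content.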
Paul Prescod -- http://itrc.uwaterloo.ca/~papresco Art is always at peril in universities, where there are so many people, young and old, who love art less than argument, and dote upon a text that provides the nutritious pemmican on which scholars love to chew. -- Robertson Davies in "The Cunning Man" xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From smith at interlog.com Tue Jan 13 03:50:22 1998 From: smith at interlog.com (Chris Smith) Date: Mon Jun 7 16:59:52 2004 Subject: Open Trading Protocol Standard For E-Commerce Now Available Message-ID: Included below is today's press release on the Open Trading Protocol (OTP). A lot of the background work has been done over the last year and a half. XML came along at just the right time! The fit is, I think, very good. There is input from this group, as we were trying to ensure that we didn't create something too difficult to work with. The web site is up and running. Feel free to browse, and if you have questions, I will either try to field them, or, with a longer turn around, run them past the working group. The release is version 0.9 - we felt it was appropriate to get public feedback, and, if possible, a round of first implementations before we could confidently issue version 1. 
------------------------------------------------------- FOR IMMEDIATE RELEASE Release Date: January 12, 1998 Contact: Edward Dixon, MasterCard International, 914/249-5028 or email edward_dixon@mastercard.com Contact: Robin O'Kelly, Mondex International Limited, +44 (0) 171 557 5036 or email Robin.O'Kelly@mondex.com Open Trading Protocol Standards for Internet Commerce Now Available For Comment and Implementation --------------------------------------------------------- OTP Consortium continues to grow with over 30 companies now propelling the development of this standard for retail trading over the Internet PURCHASE, New York, Jan. 12, 1998 For the first time, a global standard for retail trade on the Internet has been published and is available now on the Internet. The complete specification for the Open Trading Protocol (OTP) developed by the leaders in Internet commerce has been posted for public comment, and pilot implementation and trials. The OTP website (http://www.otp.org) gives access to the OTP specification, informational material as well as an email forum to facilitate the exchange of comment and recommendations by merchants, vendors and financial institutions. The website will also provide any updates to the specification that come as a result of this exchange. The OTP standards enable a consistent framework for multiple forms of electronic commerce, ensuring an easy-to-use and consistent consumer purchasing experience regardless of the payment instrument or software and hardware product used. The protocol is freely available to developers and users, and builds on XML, an emerging standard for information exchange on the Internet. As a set of truly open standards, the protocol is not "owned" by any one company, and its development will be managed by an appropriate independent organization. 
"The publishing of the OTP standards represents an important milestone in the development of electronic commerce," said Michael Keegan, CEO, Mondex International, a consortium member. "If the potential of electronic commerce is to be fully realized, it will flourish only in a truly open and interoperable environment, an environment that the OTP standards provide." "Electronic commerce is really starting to click. The OTP specification has been designed to support and complement other specifications like SETtm (Secure Electronic Transactions) and the EMV (Europay, MasterCard and Visa) chip card specification to offer a consistent online interaction for consumers, merchants and banks using any number of payment options," said Steve Mott, Senior Vice President, Electronic Commerce/New Ventures, MasterCard International. "Standards are the key to the Internet, and they're the key to making it easy for our customers to become e-businesses," said Mark Greene, vice president, IBM Internet payments and certification. "OTP has done a significant service to the industry by making this standard available for all of us." "Sun has been involved in the development of the Open Trading Protocol specification and we're delighted to see it released to the public," said Patrice Peyret, director of the Java Commerce Group for JavaSoft, a business unit of Sun Microsystems. "The industry needs concrete technologies like OTP that will enable electronic commerce. Like the Java Card API, we expect the Open Trading Protocol to provide a reliable foundation for transmitting a high volume of electronic transactions across almost any network, including the Internet, and we look forward to its future development." The OTP initiative, established in anticipation of what is expected to be a multi-billion dollar industry by the turn of the century, now gains the support of DigiCash and SIZ, two leaders in retail electronic payment systems (see Notes to Editors). 
Today's announcement that DigiCash, a pioneer in the development of electronic payment systems, and SIZ Computer Science Center of the German Savings Banks, a German leader in smart card introductions, especially GeldKarte, will both support OTP as a global standard for retail trade on the Internet demonstrates the growing momentum for the standard. Thirty other members of the OTP consortium have brought OTP from a concept to a significant standard in the Internet marketplace in just nine months.

"The OTP will ensure that traders, retailers and shoppers will all speak the same electronic language," said Michael C. Nash, president and CEO, DigiCash. "We fully anticipate that Internet shopping will quickly become more popular in the next few years, and the OTP standards will help that process."

Alexander von Stülpnagel, CEO of SIZ, agreed. "The most important need for the success of Internet commerce is the existence of a framework which handles the whole business transaction and where different payment systems can fit in. OTP is this framework which can ensure that all business partners are provided with systems for global and local Internet commerce."

The OTP standards were pioneered by AT&T, Hewlett-Packard Company, MasterCard International, Mondex International, and Open Market Inc., as well as all of Mondex International's shareholding banks. Others who have combined with the effort to develop the OTP include Hitachi, Royal Bank of Canada, BT, Canadian Imperial Bank of Commerce, CyberCash, Dot Matrix, First Data Corp., Fujitsu, GIS, Hyperion, IBM, Information & Database Network, Intertrader, JCP Ltd., MPACT Immedia, Inc., Mercantec, Netscape Communications Corporation, Nokia, Oracle, Smart Card Integrations Ltd., Spyrus, Sun Microsystems, Unisource, VeriFone and Wells Fargo.

The Internet marketplace is growing rapidly and is expected to be a $200 billion industry by 2000, according to Forrester Research analysts.
However, this marketplace is critically dependent upon commonly accepted universal standards for trading, security and commercial operations. This protocol will reduce the merchant cost of setting up and doing business on the Internet, while retaining the flexibility to offer products and services in differentiated and innovative ways.

With the number of people who have purchased online currently reaching 10 million, according to Nielsen Media Research, the OTP standards will play a key role in rapidly increasing that number by providing greater confidence in the efficiency and reliability of the Internet marketplace. With the benefits of online shopping (greater access to information and increased control in searching for goods and services) fast becoming evident, consumers, operating within the OTP standards, are expected to shop online with confidence and ease, regardless of the payment method, instrument, or software/hardware components they use.

The OTP standards specify how Internet trading transactions can occur easily, safely and efficiently for all parties, independent of the method of payment, very similar to the trading environment in the physical world. Many existing protocols such as Secure Electronic Transaction (SET), a global industry standard for secure credit card payments over the Internet (see Notes to Editors), focus on making a payment. The OTP standards complement but don't replace these protocols by providing a clearly understood set of rules that cover the following:

* offers for sale;
* agreements to purchase;
* payment (by using existing payment products, such as SET, Mondex, CyberCash, GeldKarte, etc.);
* the transfer of goods and services;
* delivery;
* receipts for purchases;
* multiple methods of payment;
* support for problem resolution;
* payment brand and protocol selection.
In addition to providing consumers with a consistent approach to trading on the Internet, consumers will also have records of purchases which could be used for tax purposes, making expense claims, feeding into financial management software or sending a claim back to a merchant to solve a problem. Notes to Editors: SIZ focuses on setting standards for the German Savings Bank Organization (GSBO) in terms of architecture, methodology and products, providing consulting services and coordinating joint application development of IT centers (but not developing applications on its own). This is done in close cooperation with the IT centers and the Deutscher Sparkassen- und Giroverband (DSGV). According to its mission, SIZ is basically covering all of IT, with special emphasis on technology (systems, telecommunications, office), security, application coordination and application provision. One of SIZ's tasks was the introduction of Germany's electronic purse, the GeldKarte. Now, 30 million GeldKartes are issued by the GSBO. Founded in 1990, DigiCash is a pioneer in the development of electronic payment systems that provide security and privacy. Available for open and closed systems and network use, DigiCash's products are based on patented developments in public key cryptography devised by Dr. David Chaum. DigiCash's first product was a road-toll system developed for the Dutch government. DigiCash's cryptographic technology has also enabled the company to develop smart cards for a diverse range of applications including CAFE, the smart card-based payment system operated by the Headquarters of the European Union in Brussels. The CAFE project was funded by the European Union, just one of the several European Union technology projects with which DigiCash has been involved, designing cards that feature pre-paid cash replacement functions, loyalty schemes and access control. 
SET (Secure Electronic Transaction), a global standard, secures credit card payments over the Internet by utilizing digital certificates, which validate the genuine identities of both cardholders and merchants participating in transactions via the Web, combined with the encryption of individual card numbers.

A reprint of David Birch's (Hyperion) recent Financial Times Virtual Finance Report article on the concepts behind shopping protocols such as OTP is available from David in electronic form (PDF). E-mail directly to daveb@hyperion.co.uk to request a copy.

# # #

Copyright 1998, The Open Trading Protocol Consortium
---------------------------------------------------------------------------
Chris Smith

From jjc at jclark.com Tue Jan 13 04:25:09 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:52 2004
Subject: Announcement: SAX 1998-01-12 Draft
References: <199801121452.JAA00397@unready.microstar.com>
Message-ID: <34BAEAEC.79411D0E@jclark.com>

Some brief comments on the draft:

EntityHandler.changeEntity belongs in DocumentHandler: it's giving you information that you need to interpret the information that DocumentHandler provides. It has nothing to do with resolveEntity.

I don't think the doctype() method is a good idea. An external DTD subset is just a convenient shorthand for declaring and referencing an external parameter entity. It shouldn't be singled out for special treatment. Knowing the external doctype isn't very useful without knowing anything about the internal subset, since the internal subset can completely change the effect of an external DTD.
I think this method should be dropped. If you want to provide DTD information, do it properly with a separate DtdHandler interface.

ErrorHandler needs an error method in addition to fatal and warning. It would be better if all methods had a single argument of XmlException.

AttributeMap seems way too complicated.

I don't think using Enumeration to get all the attributes is a good idea. JDK 1.2 replaces Enumeration by Iterator. The method names in Enumeration are a real disaster in the context of XML: nextElement returns the name of the next attribute! This is not going to be an efficient way to get at all the attributes (which is a common application need). To get at all the attributes, I have first to create an Enumeration (an unnecessary allocation). Then for each attribute name: I have to make two non-final method calls (nextElement and hasMoreElements); I then have a cast (which must be checked) from Object to String; I then have to look the attribute up using getValue. Compare this to the simple interface I suggested:

  void startElement(String elementName, String[] attributeNames,
                    String[] attributeValues, int nAttributes)

Is AttributeMap required to implement clone? I think it probably ought to be.

How does AttributeMap deal with implied attributes that are not specified? I think they shouldn't appear at all.

Does isEntity() return true for ENTITIES attributes? Does isIdref() return true for IDREFS attributes?

AttributeMap.getNotationNameID is not a good name: getEntityNotationName or getEntityNotation would be better.

PublicID and SystemID in method names should be PublicId and SystemId (Identifier is one word not two).

In XmlException, getLine should be getLineNumber for consistency with java.io.LineNumberInputStream. Is the first column number 0 or 1? (Emacs thinks it's 0, but the first line is clearly 1.)

James

From tyler at infinet.com Tue Jan 13 04:45:24 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 16:59:52 2004
Subject: Announcement: SAX 1998-01-12 Draft
References: <199801121452.JAA00397@unready.microstar.com> <34BAEAEC.79411D0E@jclark.com>
Message-ID: <34BA345E.CA87F178@infinet.com>

James Clark wrote:

> I don't think the doctype() method is a good idea. An external DTD
> subset is just a convenient shorthand for declaring and referencing an
> external parameter entity. It shouldn't be singled out for special
> treatment. Knowing the external doctype isn't very useful without
> knowing anything about the internal subset, since the internal subset
> can completely change the effect of an external DTD. I think this
> method should be dropped. If you want to provide DTD information, do it
> properly with a separate DtdHandler interface.
>

I was sorta confused by this method as well. I think that a DTD handler interface would be nice, since it is a totally separate beast from the document itself. In this case, you might have an internal as well as external DTD handler interface so that the appropriate overrides can be made by the internal DTD information after the external DTD is handled.

> ErrorHandler needs an error method in addition to fatal and warning. It
> would be better if all methods had a single argument of XmlException.
>

For callbacks I am not so sure that you need a new Exception to encapsulate data that could be passed as arguments.

> AttributeMap seems way too complicated.
>
> I don't think using Enumeration to get all the attributes is a good
> idea. JDK 1.2 replaces Enumeration by Iterator.
> The method names in
> Enumeration are a real disaster in the context of XML: nextElement
> returns the name of the next attribute! This is not going to be an
> efficient way to get at all the attributes (which is a common
> application need). To get at all the attributes, I have first to create
> an Enumeration (an unnecessary allocation). Then for each attribute
> name: I have to make two non-final method calls (nextElement and
> hasMoreElements); I then have a cast (which must be checked) from Object
> to String; I then have to look the attribute up using getValue. Compare
> this to the simple interface I suggested:
>
> void startElement(String elementName, String[] attributeNames, String[]
> attributeValues, int nAttributes)
>

I think it would be better to have an array of Attribute objects.

  public class Attribute {
    public String name;
    public String value;
  }

You could make these private and provide accessor methods. Also, you could have a set of query methods similar to what is currently in AttributeMap.

> Is AttributeMap required to implement clone? I think it probably ought
> to be.
>

I am not sure why it is necessary to be clonable.

> In XmlException, getLine should be getLineNumber for consistency with
> java.io.LineNumberInputStream.
>

Good idea...

Tyler

From grk at arlut.utexas.edu Tue Jan 13 07:50:46 1998
From: grk at arlut.utexas.edu (Glenn R Kronschnabl)
Date: Mon Jun 7 16:59:52 2004
Subject: nested lists, xml example dtd? anyone?
Message-ID: <34BB1CE9.FBFDDCCD@arlut.utexas.edu>

Can someone pls e-mail me an example XML dtd that has nested lists?
I have a simple dtd that I am trying to use, but I keep getting an extra newline (or 2) after a (nested) list using jade (rtf). I would like something as simple as:

  item1
  item2
    item1
    item2
  item3

This appears in rtf (but also shows up as extra paragraphs in the fot and in nsgmls) as:

* item1
* item2
   * item1
   * item2

* item3

When I put this thru jade without a DTD, I get a whole lotta extra space and newlines. With docbook3 and itemizedlists, there is no extra whitespace. So, I assume that a proper DTD is the way to slurp up the unwanted whitespace. Correct?

Glenn
grk@arlut.utexas.edu

From digitome at iol.ie Tue Jan 13 08:58:00 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:59:52 2004
Subject: Announcement: SAX 1998-01-12 Draft
Message-ID: <199801130857.IAA20575@GPO.iol.ie>

>James Clark wrote:
>
>> I don't think the doctype() method is a good idea. An external DTD
>> subset is just a convenient shorthand for declaring and referencing an
>> external parameter entity. It shouldn't be singled out for special
>> treatment. Knowing the external doctype isn't very useful without
>> knowing anything about the internal subset, since the internal subset
>> can completely change the effect of an external DTD. I think this
>> method should be dropped. If you want to provide DTD information, do it
>> properly with a separate DtdHandler interface.
>>
[Tyler Baker]
>
>I was sorta confused by this method as well.

Isn't there a big issue looming here? How will software agents determine the "type" of an XML document?
I am aware of at least one example of a company planning to use the

Message-ID:

In message <34BB1CE9.FBFDDCCD@arlut.utexas.edu>, Glenn R Kronschnabl writes
>Can someone pls e-mail me an example XML
>dtd that has nested lists? I have a simple dtd
>that I am trying to use, but I keep getting an extra
>newline (or 2) after a (nested) list using jade (rtf). I would like
>something as simple as:
>
>
> item1
> item2
>
>
> item1
> item2
>
>
> item3
>
>
>This appears in rtf (but also shows up as extra paragraphs
>in the fot and in nsgmls) as:
>
>* item1
>* item2
>   * item1
>   * item2
>
>* item3
>

If Jade is outputting extra paragraphs, it is because you have asked it to, not because of spaces or newlines in the document itself.

I would guess that the extra 'newlines' (paragraphs) are probably appearing because you have within . If you simply put:

  (element item
    (make paragraph ...
      (process-children)))

then each will produce a paragraph flow object, and you are effectively asking Jade to produce nested paragraphs. Of course, RTF, having a pretty flat model for the document, can't do that, and probably generates an extra paragraph instead.

>when I put this thru jade without a DTD, I get a whole
>lotta extra space and newlines. With docbook3 and
>itemizedlists, there is no extra whitespace. So, I assume
>that a proper DTD is the way to slurp up the unwanted
>whitespace. Correct?

If you use (process-children-trim) rather than (process-children), a DSSSL engine will chuck out all leading and trailing space from each element. This might allow you to deal with your well-formed input 'as is', rather than having to write a DTD.

Do you know there is a DSSSL list (dssslist@mulberrytech.com)?

Hope this helps,

Richard Light.

Richard Light
SGML/XML and Museum Information Consultancy
richard@light.demon.co.uk

From jjc at jclark.com Tue Jan 13 10:47:57 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:52 2004
Subject: Announcement: SAX 1998-01-12 Draft
References: <199801121452.JAA00397@unready.microstar.com> <34BAEAEC.79411D0E@jclark.com> <34BA345E.CA87F178@infinet.com>
Message-ID: <34BB44A6.74CCAFA4@jclark.com>

Tyler Baker wrote:

> > ErrorHandler needs an error method in addition to fatal and warning. It
> > would be better if all methods had a single argument of XmlException.
> >
>
> For callbacks I am not so sure that you need a new Exception to encapsulate data
> that could be passed as arguments.

We don't need a new one: we have one already. The advantage of passing an object is that it's easier for parsers to provide richer error information (by subclassing XmlException).

> > Is AttributeMap required to implement clone? I think it probably ought
> > to be.

> I am not sure why it is necessary to be clonable.

Because the spec says:

   * This map will be valid only during the invocation of the
   * startElement callback: if you need to use attribute
   * information elsewhere, you will need to make your own copies.
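
[Editorial illustration, not part of James Clark's message.] The "make your own copies" requirement quoted above can be sketched roughly as follows. This is only a sketch under stated assumptions: the AttributeMap interface below is a hypothetical stand-in reduced to the two operations this thread mentions (an Enumeration of names plus getValue), the method name getAttributeNames is assumed rather than taken from the draft, and the code is written with generics for a current JDK rather than the JDK of 1998.

```java
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the draft AttributeMap; not the real SAX interface. */
interface AttributeMap {
    Enumeration<String> getAttributeNames(); // assumed method name
    String getValue(String attributeName);
}

final class AttributeCopies {
    private AttributeCopies() {}

    /**
     * Copies every name/value pair into a map owned by the caller, so the
     * data stays usable after the startElement callback has returned.
     */
    static Map<String, String> copyOf(AttributeMap attributes) {
        Map<String, String> copy = new LinkedHashMap<String, String>();
        Enumeration<String> names = attributes.getAttributeNames();
        while (names.hasMoreElements()) {
            String name = names.nextElement();
            copy.put(name, attributes.getValue(name));
        }
        return copy;
    }
}
```

An application would call AttributeCopies.copyOf(attributes) inside its startElement callback and keep the returned map. A manual copy like this only preserves names and values; anything else the map exposes (entity or notation details, for example) would need to be copied separately, which is one reason a clone method is being asked for.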

If it doesn't implement cloneable, how do you make your own copy?

James

From tms at ansa.co.uk Tue Jan 13 11:25:31 1998
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun 7 16:59:52 2004
Subject: SAX: Java Package org.xml.sax
In-Reply-To: David Megginson's message of "Wed, 7 Jan 1998 06:56:24 -0500"
References: <003f01bd1a45$cd2a02b0$2ee044c6@donpark> <199801070348.TAA07093@boethius.eng.sun.com> <199801071156.GAA00346@unready.microstar.com>
Message-ID:

David> David Megginson

> In article <199801071156.GAA00346@unready.microstar.com>, David
> wrote:

David> Although in fact neither xml.org nor xml.com currently corresponds
David> to any actual company or organisation, and both would have been
David> appropriate choices, I think that on balance an *.org domain gives
David> a greater _appearance_ of neutrality than a *.com domain, ...

Agreed.

David> ... I propose that the Java implementation of SAX use the
David> package "org.xml.sax":

Minor (very minor) nitpick: IINM, the Java language specification recommends that the first (most significant) component of the domain name be written with uppercase letters, and the remainder with lowercase ones[1]. So the package name would be "ORG.xml.sax".

[1] DNS names are written in (a subset of) US-ASCII, and are case-insensitive.

--

From ak117 at freenet.carleton.ca Tue Jan 13 14:06:14 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:52 2004
Subject: SAX 1998-01-12 Comments and Feedback
Message-ID: <199801131405.JAA00379@unready.microstar.com>

Dear colleagues:

Thank you to everyone who has posted comments and general feedback on SAX so far. I can occasionally respond to one directly, but I am saving _all_ of the responses so that I can summarize in a few weeks to allow discussion before trying a new draft.

Until then, I will be very busy with Microstar work and with an extremely tight production deadline at my publisher, not to mention cleaning up from the ice storms -- please do not take my relative silence over the next few weeks as a waning of interest in the SAX project.

In the meantime, I hope that you all have fun playing with what's there now, and that you do not hesitate to recommend changes and improvements based on your experience.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From trevor at lab.com Tue Jan 13 14:29:36 1998
From: trevor at lab.com (Trevor Morris)
Date: Mon Jun 7 16:59:52 2004
Subject: SAX: Java Package org.xml.sax
In-Reply-To:
References: <003f01bd1a45$cd2a02b0$2ee044c6@donpark> <199801070348.TAA07093@boethius.eng.sun.com> <199801071156.GAA00346@unready.microstar.com>
Message-ID: <3.0.3.32.19980113092702.00966624@iron.butterfly.net>

>David> ... I propose that the Java implementation of SAX use the
>David> package "org.xml.sax":
>
>Minor (very minor) nitpick: IINM, the Java language specification
>recommends that the first (most significant) component of the domain
>name be written with uppercase letters, and the remainder with
>lowercase ones[1]. So the package name would be "ORG.xml.sax".
>
>[1] DNS names are written in (a subset of) US-ASCII, and are
> case-insensitive.

Please don't do this. Almost every package I've seen uses lowercase for com, org, etc. Below is support, taken from the 0.5 release of the Swing package from Sun, for using lowercase for com, org, etc.

-NOTE: There have been questions about the use of the letters "com" in package
-names instead of "COM", as defined in the Java Language Specification. To
-resolve these questions, the JavaSoft division of Sun Microsystems Inc. has
-determined that the JLS will be updated to recommend the use of "com", and that
-is why this proposal uses it. Starting with JDK 1.2, non-core packages from Sun
-Microsystems will be prefaced with "com.sun" . (The use of the word "java" in
-"com.sun.java.swing" does not follow Sun's new conventions, but this proposal
-recommends that it be used to show the linkage between the two packages.)
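
[Editorial illustration, not part of Trevor Morris's message.] The all-lowercase, reversed-domain convention described in the note above can be sketched as a small helper. The class and method names here (PackageNames, fromDomain) are hypothetical and exist only for this example; the sketch assumes a plain dotted domain and does not handle labels that are not valid Java identifiers.

```java
import java.util.Locale;

final class PackageNames {
    private PackageNames() {}

    /**
     * Reverses the labels of a domain name, lower-cases them, and appends
     * the given suffix, following the lowercase convention quoted above.
     */
    static String fromDomain(String domain, String suffix) {
        String[] labels = domain.toLowerCase(Locale.ROOT).split("\\.");
        StringBuilder name = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            name.append(labels[i]).append('.');
        }
        return name.append(suffix.toLowerCase(Locale.ROOT)).toString();
    }
}
```

For the domain discussed in this thread, PackageNames.fromDomain("xml.org", "sax") yields "org.xml.sax", the name David Megginson proposed.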
Later,
Trevor

From peter at ursus.demon.co.uk Tue Jan 13 15:57:33 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:52 2004
Subject: SAX-C
Message-ID: <3.0.1.16.19980113160441.2a7f46b2@pop3.demon.co.uk>

I am forwarding this on behalf of Stefan Wagner because he has e-mail problems. From a very quick glance this looks very exciting, Stefan - many thanks.

[... mail stuff deleted ...]

>Date: Tue, 13 Jan 1998 00:25:58 +0100
>To: xml-dev@ic.ac.uk
>From: Stefan Wagner
>Subject: SAX-C
>
>----------------------------------------------------------------
>
>Today, David Megginson (dmeggins@microstar.com) with the help of many
>others announced the first draft for SAX for Java.
>Unfortunately Java is currently no option for me, only C.
>To speed up the implementation of a common basic event based API for
>parsers written in C (like rxp or xmltok) I have converted the Java
>interfaces to C. I have tried to change the names as little as necessary.
>Still there will be things which should get changed.
>Some candidates for a change:
>- replacing sax_ with xml_
>- result type of getAttributeName()
>
>Anyone adapting a C parser or writing a driver (instead of writing good
>documentation for the parser API)?
>
>Stefan Wagner
>
>(st.wagner@ieee.org)
>
>
>(Read the Java draft for an explanation of the functions)
>
>/* --------------------------------------------------------------------- */
>
>/*
>   SAX-C Simple API for XML - C Language Interface
>
>   Proposal 1998-01-12
>*/
>
>/* --------------------------------------------------------------------- */
>/* basic types                                                           */
>/* --------------------------------------------------------------------- */
>
>#define false 0
>#define true 1
>typedef int bool;
>
>typedef char * String;
>
>/* --------------------------------------------------------------------- */
>/* Attribute Map                                                         */
>/* --------------------------------------------------------------------- */
>
>/*
>   sax_AttributeMap_struct gets defined in the parser source code.
>   The definition is not needed by the application
>*/
>
>typedef struct sax_AttributeMap_struct *sax_AttributeMap;
>
>/* --------------------------------------------------------------------- */
>/* typedefs for all the handlers                                         */
>/* --------------------------------------------------------------------- */
>
>typedef void ( sax_EntityHandler ) ( String systemID );
>
>/* --------------------------------------------------------------------- */
>
>typedef struct
>{
>  void ( *startDocumentHandler ) ( void );
>  void ( *endDocumentHandler ) ( void );
>  void ( *doctypeHandler ) ( String name, String publicID, String systemID );
>  void ( *startElementHandler ) ( String name, sax_AttributeMap attributes );
>  void ( *endElementHandler ) ( String name );
>  void ( *charactersHandler ) ( char ch[], int start, int length );
>  void ( *ignorableHandler ) ( char ch[], int start, int length );
>  void ( *processingInstruction ) ( String target, String remainder );
>} sax_DocumentHandler;
>
>/* --------------------------------------------------------------------- */
>
>typedef struct
>{
>  void ( *warningHandler ) ( String message, String systemID, int line, int column );
>  void ( *fatalHandler ) ( String message, String systemID, int line, int column );
>} sax_ErrorHandler;
>
>/* --------------------------------------------------------------------- */
>/* Parser Interface                                                      */
>/* --------------------------------------------------------------------- */
>
>void sax_new_parser ( void );
>void sax_delete_parser ( void );
>
>/* Default implementation will be used if never called or if called with
>   handler = NULL */
>/* copies function pointers to internal table */
>void sax_setEntityHandler ( sax_EntityHandler handler );
>void sax_setDocumentHandler ( sax_DocumentHandler handler );
>void sax_setErrorHandler ( sax_ErrorHandler handler );
>
>void sax_parse ( String publicID, String systemID );
>
>/* --------------------------------------------------------------------- */
>/* EntityHandler Callbacks                                               */
>/* --------------------------------------------------------------------- */
>
>void sax_changeEntity ( String systemID );
>
>/* --------------------------------------------------------------------- */
>/* DocumentHandler Callbacks                                             */
>/* --------------------------------------------------------------------- */
>
>void sax_startDocument ( void );
>void sax_endDocument ( void );
>
>void sax_doctype ( String name, String publicID, String systemID );
>
>void sax_startElement ( String name, sax_AttributeMap attributes );
>void sax_endElement ( String name );
>
>void sax_characters ( char ch[], int start, int length );
>void sax_ignorable ( char ch[], int start, int length );
>void sax_processingInstruction ( String target, String remainder );
>
>/* --------------------------------------------------------------------- */
>/* ErrorHandler Callbacks                                                */
>/* --------------------------------------------------------------------- */
>
>void sax_warning ( String message, String systemID, int line, int column );
>void sax_fatal ( String message, String systemID, int line, int column );
>
>/* --------------------------------------------------------------------- */
>/* AttributeMap Interface                                                */
>/* --------------------------------------------------------------------- */
>
>/* returns an array of String pointers and the number of attribute names */
>void sax_getAttributeNames ( sax_AttributeMap map, int *count, String *atts );
>
>/* a replacement for Enumeration */
>String sax_firstAttributeName ( sax_AttributeMap map );
>String sax_nextAttributeName ( sax_AttributeMap map );
>bool sax_hasMoreAttributeNames ( sax_AttributeMap map );
>
>String sax_getValue ( sax_AttributeMap map, String attributeName );
>
>bool sax_isEntity ( sax_AttributeMap map, String attributeName );
>bool sax_isNotation ( sax_AttributeMap map, String attributeName );
>bool sax_isId ( sax_AttributeMap map, String attributeName );
>bool sax_isIdref ( sax_AttributeMap map, String attributeName );
>
>String sax_getEntityPublicID ( sax_AttributeMap map, String attributeName );
>String sax_getEntitySystemID ( sax_AttributeMap map, String attributeName );
>
>String sax_getNotationNameID ( sax_AttributeMap map, String attributeName );
>String sax_getNotationPublicID ( sax_AttributeMap map, String attributeName );
>String sax_getNotationSystemID ( sax_AttributeMap map, String attributeName );
>
>/* --------------------------------------------------------------------- */
>
>------------------------------------------------------------------------------
> Stefan Wagner
> Internet: h8625330@obelix.wu-wien.ac.at st.wagner@ieee.org
> Fax: +43-1-607 71 57
>------------------------------------------------------------------------------
                                  ^^^^^^^^^^^^^^^^^^ The real poster :-)
>
> P.
>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Tue Jan 13 16:11:02 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:52 2004
Subject: DOCTYPE (was Re: Announcement: SAX 1998-01-12 Draft)
In-Reply-To: <199801130857.IAA20575@GPO.iol.ie>
Message-ID: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk>

At 09:26 13/01/98 +0000, Sean Mc Grath wrote:

[... contribs from James Clark and Tyler Baker deleted ...]

>Isn't there a big issue looming here? How will software agents determine the
>"type"
>of an XML document. I am aware of at least one example of a company planning
>to use the
>I was one of those who voted for this but it is clear from what James is saying
>that it is plain wrong and misleading to single it out.
>
>The problem of typing though, remains doesn't it?

Yes. I have replied at some length, because this is a difficult issue and has been debated many times before. My message is meant to describe *what the present position is*, NOT *what it would be nice if it was*. I do NOT think debate on the latter is appropriate on XML-DEV.

We are back to Lewis Carroll: what is the type of the document and what is the name of the type of the document, and what is the reference to the type of the document... etc.

My current reading is:

FOO: The FOO simply means that the root of the document is a single FOO element. The only things it can be used for are:
- telling you what is in the document (i.e. you might want to keep on reading if the document root was POEM).
- telling the parser that if the document does NOT have a root element of type FOO it can throw a Draconian error and not do any more work.
IMO I can live without this :-) PUBLIC: The pubID says that the FOO organisation has produced a DTD identified by this string (presumably this is V1.23 of a DTD, but it doesn't have to be.) This is useful to me if: - I have heard of the FOO organisation - know where to find them - they provide a document whose identifier (NOT reference or address) is the pubId string - I know how to find this document. This is used in certain domains (e.g. publishing, where FPIs are known and used.) However AFAIK there is no mechanism for locating them on the WWW, no simple means of registering, no one paid to maintain a registry. Without this their use in XML may be minimal. SYSTEM: The "foo.dtd" says that the external subset of the DTD can be found in a named file (more generally a URL). This URL may be absolute or relative to the current document. The NAME of the URL (i.e. the address) is a very poor way of *identifying* the TYPE of the document, since it is the contents of the URL that matter. It is probably no secret that this debate has exercised the WG at length. In conclusion (IMO) the DOCTYPE statement really only serves to identify the address of the external subset. It is equivalent to: %foo; ]> How do we determine the TYPE of a document? There is no good mechanism. The following could be developed: - convince the world to use FPIs and fund a registry (a la domain name registration) - create and register MIME types and attach them to documents - develop an XML-specific mechanism to be located *inside* XML schema files I suspect that offering the DOCTYPE in SAX is of limited value and more trouble than it is worth. [The entity of the external subset can be obtained by other means if required.] P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From agreene at bitstream.com Tue Jan 13 16:36:43 1998 From: agreene at bitstream.com (Andrew Greene) Date: Mon Jun 7 16:59:52 2004 Subject: DOCTYPE (was Re: Announcement: SAX 1998-01-12 Draft) In-Reply-To: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> (message from Peter Murray-Rust on Tue, 13 Jan 1998 16:39:28) Message-ID: <19980113162822.AAA4591@AGREENE-PC.bitstream.com> Date: Tue, 13 Jan 1998 16:39:28 From: Peter Murray-Rust [...] My current reading is: FOO: The FOO simply means that the root of the document is a single FOO element. The only reason things it can be used for are: - telling you what is in the document (i.e. you might want to keep on reading if the document root was POEM). - telling that parser that if the document does NOT have a root element of type FOO it can throw a Draconian error and not do any more work. IMO I can live without this :-) You left out another use of the FOO in that declaration, which is that it identfies which element type in the DTD is "topmost". I have a collection of pages which are a catalog of sheet music. The various pages have different high-level structure but the low-level "paragraphs" are the same. 
I can have a single DTD that describes each of the following: * ComposerPage (contains a biography and a list of compositions) * PublisherPage (contains contact information, shipping costs, and a list (sorted by title) of compositions) * NationalityPage (contains a description of national musical history and a list (sorted by composer) of compositions) And so each of these XML files can start off with the appropriate document type identifier: I *can* live without this, but I'd rather not have to! :-) - Andrew Greene xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From howardk at paradigmdev.com Tue Jan 13 17:24:39 1998 From: howardk at paradigmdev.com (Howard Katz) Date: Mon Jun 7 16:59:52 2004 Subject: SAX: Java Package org.xml.sax Message-ID: <57B675B21506D1118BAB0060081C295D23A0B4@vserver.paradigm.com> I believe they've withdrawn that recommendation. I don't have any pointers for you (as it were), but Graham Hamilton, Larry Cable, or someone of that stature made that announcement on the beans list several months ago. Howard > -----Original Message----- > From: Toby Speight [SMTP:tms@ansa.co.uk] > Sent: Tuesday, January 13, 1998 2:30 AM > To: XML developers' list > Subject: Re: SAX: Java Package org.xml.sax > > David> David Megginson > > > In article <199801071156.GAA00346@unready.microstar.com>, David > > wrote: > > David> Although in fact neither xml.org nor xml.com currently > corresponds > David> to any actual company or organisation, and both would have been > David> appropriate choices, I think that on balance an *.org domain > gives > David> a greater _appearance_ of neutrality than a *.com domain, ... > > Agreed. > > David> ... 
I propose that the Java implementation of SAX use the > David> package "org.xml.sax": > > Minor (very minor) nitpick: IINM, the Java language specification > recommends that the first (most significant) component of the domain > name be written with uppercase letters, and the remainder with > lowercase ones[1]. So the package name would be "ORG.xml.sax". > > [1] DNS names are written in (a subset of) US-ASCII, and are > case-insensitive. > > -- > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Tue Jan 13 18:01:14 1998 From: eliot at isogen.com (W. Eliot Kimber) Date: Mon Jun 7 16:59:52 2004 Subject: DOCTYPE (was Re: Announcement: SAX 1998-01-12 Draft) References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> Message-ID: <34BBAB34.1E2387E0@isogen.com> Peter Murray-Rust wrote: > In conclusion (IMO) the DOCTYPE statement really only serves to > identify > the address of the external subset. It is equivalent to: > > > %foo; > ]> Exactly correct. The DOCTYPE declaration tells you *nothing* about the abstract type of the document (that is, the general class of documents of which the author intended it to be an instance. > How do we determine the TYPE of a document? There is no good > mechanism. Not true. 
All that is necessary is to provide some way to point to a separate definition of the type. The SGML architecture mechanism, defined in ISO/IEC 10744:1997 and implemented in the SP parsers (as well as in purpose-built code) provides just such a mechanism. In December, James and I submitted for WG4 approval an enhancement to the formal mechanism that lets it be used with XML documents. See "http://www.ornl.gov/sgml/wg8/document/1957.htm".

The idea is a simple one: you use a PI to associate a local name for the "type" and then use a URL or public identifier to point to the documentation and the DTD that defines the type.

For example, ISOGEN has defined for its own use a base architecture from which a variety of specific document types can be derived. I can invoke the use of this architecture like so:

Foo is now clearly a kind of ISOBase paragraph

By default, the architecture ("type") name is used as the name of the attribute you use to map local elements to element types in the architecture (which types you can determine by looking at the architectural DTD).

Note that the presence or absence of a DOCTYPE declaration is irrelevant--all the information you need to interpret the Foo element as an ISOBase paragraph is in the instance. The only thing a DOCTYPE declaration would add would be the convenience of setting a default value for the ISOBase attribute.

Note also that it requires no parser-level code to interpret and support the mapping because it's using normal XML syntax: PIs and attributes. It also doesn't require anything like the colonized names because the name mapping is done through an attribute, which has the advantage that the same element can be mapped to different architectures at the same time.
For example, I might want to also indicate that the Foo element corresponds to something in the RDF spec: Foo is now clearly a kind of ISOBase paragraph When you're doing ISOBase-related processing, you ignore the RDF mapping and when you're doing RDF-related processing you ignore the ISOBase mapping. Or, you can consider both at once, it's up to your processor. The document can be validated against either of the architectural DTDs by using a tool like SP, which has that facility built in, or by explicitly generating the document that reflects the mapping and then validating it against the architectural DTD. For example, the ISOBase "architectural instance" of the above is: Foo is now clearly a kind of ISOBase paragraph That's all there is to it. The idea that DOCTYPE declarations tell you something useful is one of the top five Big Lies of SGML. For more on the subject of architectures, see "http://www.isogen.com/papers/archintro.html", which goes into more detail about using architectures within an XML context. If anyone would like to see real code that does architecture-based processing, I would be happy to provide it in any of the languages in which I've done it (Perl, Rexx, DSSSL, ACL, VisualBasic--sorry, no Java, only because I haven't had a need to do Java programming yet--note the preponderance of *interpreted* languages in this list :-). Cheers, Eliot xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msuzio at ford.com Tue Jan 13 18:18:31 1998 From: msuzio at ford.com (Michael J. 
Suzio) Date: Mon Jun 7 16:59:53 2004 Subject: DOCTYPE References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <34BBAB34.1E2387E0@isogen.com> Message-ID: <199801131817.AA21723@mailfw2.ford.com> W. Eliot Kimber wrote: > That's all there is to it. The idea that DOCTYPE declarations > tell you something useful is one of the top five Big Lies of SGML. Can you enlighten us on the other four? ;-) > If anyone would like to see real code that does architecture-based > processing, I would be happy to provide it in any of the languages in > which I've done it (Perl, Rexx, DSSSL, ACL, VisualBasic-- > sorry, no Java, only because I haven't had a need to do > Java programming yet--note the preponderance of > *interpreted* languages in this list :-). I think posting pointers to Perl and DSSSL examples would be helpful to many of us. I suppose VB and REX are interesting, too (just not to me ). So, is this the proposed namespace alternative (architectural forms?). I've been trying to get up to speed on namespace issues, just haven't gotten that far yet. -- Michael J. Suzio Web Technical Standards, WWW & Internet Applications (313) 24-88120 msuzio@eccms1.dearborn.ford.com / msuzio@ford.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Jan 14 01:05:13 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:53 2004 Subject: DOCTYPE (was Re: Announcement: SAX 1998-01-12 Draft) In-Reply-To: <34BBAB34.1E2387E0@isogen.com> References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> Message-ID: <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> At 09:58 13/01/98 -0800, W. 
Eliot Kimber wrote:
>
> [PeterMR] > How do we determine the TYPE of a document? There is no good
>> mechanism.
>
>Not true. All that is necessary is to provide some way to point to a
>separate definition of the type. The SGML architecture mechanism,
>defined in ISO/IEC 10744:1997 and implemented in the SP parsers (as well
>as in purpose-built code) provides just such a mechanism. In December,
>James and I submitted for WG4 approval an enhancement to the formal
>mechanism that lets it be used with XML documents. See
>"http://www.ornl.gov/sgml/wg8/document/1957.htm".
>
>The idea is a simple one: you use a PI to associate a local name for the
>"type" and then use a URL or public identifier to point to the
>documentation and the DTD that defines the type.

Thanks. I wasn't aware of this. We need something like it. It does, of course, rely on building a significant registry for FPIs. As far as I remember from previous discussions, very few FPIs are registered at present, and the mechanism is not widely known. If this mechanism is to become popular for XML - before the WWW gets swamped with untyped documents without meaningful FPIs - there needs to be a lot of effort to publicise and implement it.

>
>For example, ISOGEN has defined for its own use a base architecture from
>which a variety of specific document types can be derived. I can invoke
>the use of this architecture like so:
>
>
> name="ISOBase"
> public-id="+//IDN isogen.com//NOTATION ISOGEN Base Architecture//EN"
> dtd-system-id="http://www.isogen.com/ISOBase/isobase.mdt"
>?>

As I understand it, these PIs are *permitted* in XML (any PI is permitted) but they are given no special importance and implementers are not required to support them. So XML - as it stands today - has no mechanism for requiring this to be implemented or interpreted.

P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From antony at n-space.com.au Wed Jan 14 01:41:06 1998 From: antony at n-space.com.au (Antony Blakey) Date: Mon Jun 7 16:59:53 2004 Subject: SAX: Java Package org.xml.sax References: <003f01bd1a45$cd2a02b0$2ee044c6@donpark> <199801070348.TAA07093@boethius.eng.sun.com> <199801071156.GAA00346@unready.microstar.com> Message-ID: <34BC1774.8A7A485C@n-space.com.au> Toby Speight wrote: > Minor (very minor) nitpick: IINM, the Java language specification > recommends that the first (most significant) component of the domain > name be written with uppercase letters, and the remainder with > lowercase ones[1]. So the package name would be "ORG.xml.sax". Sun have deprecated this behaviour. They now recommend lower case. One problem is that when creating archives on a case-insensitive system such as Windows, it's very easy to end up with lower case in the archive, which is NOT case insensitive when accessed by the classloader. +----------------------------------+ | Antony Blakey | | N-Space Pty Ltd | | Java - CORBA - SGML - XML | | mailto:antony@n-space.com.au | | http://www.n-space.com.au | +----------------------------------+ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Wed Jan 14 01:59:01 1998 From: eliot at isogen.com (W. Eliot Kimber) Date: Mon Jun 7 16:59:53 2004 Subject: DOCTYPE (was Re: Announcement: SAX 1998-01-12 Draft) References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> Message-ID: <34BC1B2B.E26AC3B1@isogen.com> Peter Murray-Rust wrote: > Thanks. I wasn't aware of this. We need something like it. It does, > of > course, rely on building a significant registry for FPIs. As far as I > remember from previous discussions very FPIs are registered at > present, and > the mechanism is not widely known. If this mechanism is to become > popular > for XML - before the WWW gets swamped with untyped documents without > meaningful FPIs - there needs to be a lot of effort to publicise and > implement it. I think we need to add an architecture system-id attribute, which would let you provide a URL for the architecture definition document. I don't think that would be a controversial change (but you never know). Of course, you'd really want to use a URN for that, which is what a public ID is (and can be syntactically if you don't require formal public IDs). It's not really a question of registering FPIs, it's a question of making it clear what the abstract type is *to a human observer*. From a code perspective, either all you care about is the architectural declarations (so you can validate with respect to them), or you have the FPI hard-coded into a table of architectures that you understand. 
The most you have to do is implement the normalization rules for minimum literals (i.e., squeeze out non-significant white space). > > As I understand it, these PIs are *permitted* in XML (any PI is > permitted) > but they are given no special importance and implementers are not > required > to support them. So XML - as it stands today - has no mechanism for > requiring this to be implemented or interpreted. It could by using this mechanism as the basis for solving the name-space proposal. Note that XML doesn't have much in the way of syntax choices--it can't add a new declaration unless SGML does as well (which seems likely as part of the revision, but that's a ways off still). You can't use element attributes because that imposes on the document's private name space. So that only leaves notations and PIs. Notations are out because XML doesn't provide data attributes (which you need to do the configuration of the architecture use), so that leaves PIs. Thus, whatever you come up with will look very much like the PI defined in N1957 (the proposed HyTime amendment). Besides, talking about "required to support" is meaningless because it's not a syntactic issue--it's a semantic processing issue and you can never require semantic processing in a syntactic spec. You can require it in a semantic spec, such as HyTime or DSSSL or XML Link, but not in XML Lang. Or said another way, even if you make the true document type painfully clear to me, I am still free to ignore that information during processing. If the facility has value, systems will support it. Note also that in the simple case, where the document could use the architectural DTD as its own if it cared to, the mapping can be completely automatic (by the rules of default architectural mapping). 
In other words, if my architecture defines an element called "foo" and my document, derived from that architecture, has an element called "foo", then my foo is taken to be the architectural foo unless you tell me otherwise, without the need to explicitly map it. Thus, any document with an explicit DTD can use that same DTD as an architecture without changing the instance.

In other words, I can go from this: To this: With exactly the same processing effect, except that no validating XML processor is *required* to process the declarations (but it can if it wants to, after XML Lang-required validation is done).

This solves the problem of wanting to limit declarations to external subsets: you make the declarations architectural DTDs. The authors of individual documents can't modify the architectural DTD and any local declarations don't affect it (only the local mapping to the architecture), so architecture-based processors can be confident in worrying only about the element types and attributes defined in the architectural DTD--they simply ignore anything in the base document that isn't architectural (that is, that isn't mapped to something in the architecture).

This solves the problem that RDF is seeing, where they want to be able to disallow declarations in RDF documents but still have some formal specifications somewhere, without imposing the burden of declaration awareness on all RDF-aware processors. With this approach they can do that.

Here's one more trick. Say in your architecture you define your element type and attribute names using colons. For example, consider this simple architectural DTD: And this one: And this document derived from it: This is a kimber paragraph This is not a kimber paragraph This is a woods paragraph

This looks just like all the colonized name proposals, but it's simply taking advantage of the automatic name mapping of architectures: the name "kimber:para" matches the name "kimber:para" in the kimber.dtd.
Of course, the down side that if you want to map an element to two forms, you have some redundancy: This is both a kimber and woods para But you can't have everything--at least you can *do* multiple mappings. You could also provide two different versions of the architectural DTD: one with colons and one without. You'd use the colonized one for the architecture you use the most and the non-colonized version for the others. The architecture definition document would explain the correlation between colonized and non-colonized versions of architectural elements for the benefit of implementors. Cheers, Eliot xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Jan 14 08:39:00 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:53 2004 Subject: DOCTYPE (was Re: Announcement: SAX 1998-01-12 Draft) In-Reply-To: <34BC1B2B.E26AC3B1@isogen.com> References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> Message-ID: <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> At 17:55 13/01/98 -0800, W. Eliot Kimber wrote: [... lots of valuable stuff about architectural forms ...] I won't reply in detail, since we might be in danger of getting off-topic. I do not know what the WG's thinking is on AFs, but I should not be surprised to see them formally introduced into XML at some stage. At present JUMBO uses namespaces to tackle the colonised element names - the result is probably fairly similar in practice. If the WG suggests that AFs should be part of the XML effort I'll have to learn how to hack a processor :-). 
Anyway I am glad I can get rid of the DOCTYPE.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From M.H.Kay at eng.icl.co.uk Wed Jan 14 12:37:39 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 16:59:53 2004
Subject: Announcement: SAX 1998-01-12 Draft
Message-ID: <01bd20e9$3fee69e0$1e09e391@mhklaptop.bra01.icl.co.uk>

James Clark wrote:
>AttributeMap seems way too complicated.
>
>... To get at all the attributes, I have first to create
>an Enumeration (an unnecessary allocation). Then for each attribute
>name: I have to make two non-final method calls (nextElement and
>hasMoreElements); I then have a cast (which must be checked) from Object
>to String; I then have to look the attribute up using getValue. Compare
>this to the simple interface I suggested:
>
>void startElement(String elementName, String[] attributeNames, String[]
>attributeValues, int nAttributes)
>

Counter-argument: in my first SAX application, I knew what attributes to expect, so I was able to write:

    public void startElement (String name, AttributeMap atts) {
        String id = atts.getValue("ID");
        String ref = atts.getValue("REF");
        ....
    }

This would be MUCH more clumsy with James' proposed interface!

Mike Kay

xml-dev: A list for W3C XML Developers.
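To make the trade-off concrete, here is a minimal sketch of what the same ID/REF lookup might look like if the handler received parallel arrays instead of an AttributeMap. It is not taken from the thread or from any SAX draft: `ParallelArrayLookup`, `valueOf` and `IdRefHandler` are hypothetical names, and only the `startElement` parameter list follows the signature quoted above.

```java
// Hypothetical helper, not part of any SAX draft: by-name lookup over the
// parallel name/value arrays of the array-based startElement signature.
final class ParallelArrayLookup {
    private ParallelArrayLookup() {
    }

    /** Returns the value for the wanted attribute name, or null if absent. */
    static String valueOf(String wanted, String[] names, String[] values, int nAttributes) {
        // Linear scan: the arrays carry no index, so every lookup walks them.
        for (int i = 0; i < nAttributes; i++) {
            if (names[i].equals(wanted)) {
                return values[i];
            }
        }
        return null;
    }
}

// Hypothetical handler showing how the ID/REF example would read.
final class IdRefHandler {
    String id;
    String ref;

    void startElement(String elementName, String[] attributeNames,
                      String[] attributeValues, int nAttributes) {
        id = ParallelArrayLookup.valueOf("ID", attributeNames, attributeValues, nAttributes);
        ref = ParallelArrayLookup.valueOf("REF", attributeNames, attributeValues, nAttributes);
    }
}
```

With a shared helper of this kind the by-name case stays short, but every application would have to write or import it, which is arguably the clumsiness being objected to.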
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthewg at poet.de Wed Jan 14 13:53:08 1998 From: matthewg at poet.de (Matthew Gertner) Date: Mon Jun 7 16:59:53 2004 Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft) Message-ID: <01bd20f3$4d749670$a00b0ac0@pharcyde.poetsoftware.xo.com> James Clark wrote: >AttributeMap seems way too complicated. > >I don't think using Enumeration to get all the attributes is a good >idea. JDK 1.2 replaces Enumeration by Iterator. The method names in >Enumeration are a real disaster in the context of XML: nextElement >returns the name of the next attribute! This is not going to be an >efficient way to get at all the attributes (which is a common >application need). To get at all the attributes, I have first to create >an Enumeration (an unnecessary allocation). Then for each attribute >name: I have to make two non-final method calls (nextElement and >hasMoreElements); I then have a cast (which must be checked) from Object >to String; I then have to look the attribute up using getValue. Compare >this to the simple interface I suggested: > >void startElement(String elementName, String[] attributeNames, String[] >attributeValues, int nAttributes) I agree that the AttributeMap is too complicated. On the other hand, your alternate proposal seems questionable. Passing three parameters to the event handler may be simple, but this eliminates any abstraction, which makes it hard to extend the interface cleanly. Also, this makes iteration easy but finding attributes by name very hard. An AttributeMap interface should be used, but: 1) It should provide a standard iterator interface (this is the only reasonable way to iterate over a map). 
2) It should deliver attribute values as strings (the current getValue) only.

All of the "is" methods seem *way* out of scope for SAX. We decided we didn't want DOM building capability (or a similar level of functionality), so why is this information necessary? This spec very accurately represents a consensus of the various points discussed on the list, but I don't remember any discussion about getting additional information about the attribute beyond the value. Did I miss something?

IMHO, the following is perfectly sufficient for the time being:

    public Iterator getIterator ();
    public String getValue (String attributeName);

We might want to consider making the map from a string to an Attribute object:

    public Attribute getAttribute (String attributeName);

The Attribute interface would contain only:

    public String getName ();
    public String getValue ();

This would make extensibility easier in moving towards an advanced version of SAX with DOM-building power.

Matthew

From tbray at textuality.com Wed Jan 14 14:42:03 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:59:53 2004
Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft)
Message-ID: <3.0.32.19980114063855.00abd6e0@pop.intergate.bc.ca>

At 02:49 PM 14/01/98 +0100, Matthew Gertner wrote:
>IMHO, the following perfectly sufficient for the time being:
>public Iterator getIterator ();
>public String getValue (String attributeName);

I agree that this is exactly what is needed.

-Tim

xml-dev: A list for W3C XML Developers.
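The reduced map proposed above can be sketched as follows. This is only an illustration under stated assumptions: the method names `getIterator`, `getValue`, `getAttribute`, `getName` come from the proposal, while `SimpleAttribute`, `SimpleAttributeMap` and `put` are hypothetical, and the generics and `LinkedHashMap` are modern Java conveniences that did not exist when the thread was written.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// The two-method Attribute interface from the proposal.
interface Attribute {
    String getName();
    String getValue();
}

// Hypothetical immutable implementation.
final class SimpleAttribute implements Attribute {
    private final String name;
    private final String value;

    SimpleAttribute(String name, String value) {
        this.name = name;
        this.value = value;
    }

    public String getName() {
        return name;
    }

    public String getValue() {
        return value;
    }
}

// Hypothetical map offering only iteration over names and by-name lookup.
final class SimpleAttributeMap {
    // LinkedHashMap keeps the attributes in the order the parser added them.
    private final Map<String, Attribute> byName = new LinkedHashMap<>();

    // Assumed parser-side method for filling the map; not part of the proposal.
    void put(String name, String value) {
        byName.put(name, new SimpleAttribute(name, value));
    }

    // Iterates over attribute names; the read-only view blocks remove().
    Iterator<String> getIterator() {
        return Collections.unmodifiableSet(byName.keySet()).iterator();
    }

    String getValue(String attributeName) {
        Attribute a = byName.get(attributeName);
        return a == null ? null : a.getValue();
    }

    Attribute getAttribute(String attributeName) {
        return byName.get(attributeName);
    }
}
```

Note that `getIterator` still allocates an iterator object per call, which is the cost the array-based alternative was meant to avoid.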
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Wed Jan 14 23:48:26 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 16:59:53 2004 Subject: DOCTYPE (was Re: Announcement: SAX 1998-01-12 Draft) References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> Message-ID: <34BD4E7A.1230@hiwaay.net> For those who might have missed it, the VRML Consortium has approved a working group to look at issues of a binding of VRML, XML, DHTML and DOM. Rob Glidden is heading up the effort which kicked off last week with a meeting in San Jose. Rob is looking for folks to work on an XML DTD for VRML. Some names have been suggested, but if anyone is interested in the effort, Rob Glidden quadramx@quadramix.com There is a Web page describing the effort. Sorry, but I don't have the URL here at home. cheers, len bullard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Thu Jan 15 00:31:11 1998 From: eliot at isogen.com (W. 
Eliot Kimber)
Date: Mon Jun 7 16:59:53 2004
Subject: DOCTYPE (was Re: Announcement: SAX 1998-01-12 Draft)
References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> <34BD4E7A.1230@hiwaay.net>
Message-ID: <34BD582C.E5873BC4@isogen.com>

len bullard wrote:
>
> For those who might have missed it,
> the VRML Consortium has approved a
> working group to look at issues
> of a binding of VRML, XML, DHTML
> and DOM. Rob Glidden is heading
> up the effort which kicked off
> last week with a meeting in San Jose.

I've already responded personally to a note from Chris Marrin, who had been informed about my VRML as SGML work (see "http://www.drmacro.com/vrml"). Transliterating VRML nodes into SGML syntax is straightforward (at least for 1.0, don't know about 2.0 'cause I don't know what it looks like).

Using XML for VRML is an excellent example of how SGML architectures can be put to use. For example, assume we have a VRML architecture that defines element types corresponding to the node types in VRML, e.g., Cube, Cone, Light, etc. A plain vanilla VRML document might look something like this: 100 100 100

Now say you want to define your own specialized VRML node types. Within an XML context, you can do this with architectures: 1000 1000 1000

I've now "subclassed" cube into the type "big-block" without obscuring the connection back to the VRML-defined types. I still have to do all the definition of my big-block type--I don't get any syntax shortcuts, but you can't have everything [I realized recently that this is very much like the "implements" feature in VB5--it lets you declare conformance to an "interface" defined elsewhere, but you still have to provide all the parts locally because VB5, like SGML architectures, is not truly object oriented in the purest sense.]
With today's tools, all I have to do to make a working world out of this second document is write a Perl script or DSSSL spec or whatever to generate the appropriate VRML from the SGML version of it. It's not very hard because the transformation is so simple.

I think that could be pretty useful, especially once VRML browsers let you associate presentation styles with element types. After all, rendering 3-D objects is not fundamentally different in the abstract from rendering 2-D objects, it's still a matter of applying presentation style. So why shouldn't XSL be just as useful for VRML worlds as for 2-D documents?

Cheers,

Eliot

From jjc at jclark.com Thu Jan 15 02:33:43 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:59:53 2004
Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft)
References: <01bd20f3$4d749670$a00b0ac0@pharcyde.poetsoftware.xo.com>
Message-ID: <34BD7378.AD58960F@jclark.com>

Matthew Gertner wrote:
>
> James Clark wrote:
>
> >AttributeMap seems way too complicated.
> >
> >I don't think using Enumeration to get all the attributes is a good
> >idea. JDK 1.2 replaces Enumeration by Iterator. The method names in
> >Enumeration are a real disaster in the context of XML: nextElement
> >returns the name of the next attribute! This is not going to be an
> >efficient way to get at all the attributes (which is a common
> >application need). To get at all the attributes, I have first to create
> >an Enumeration (an unnecessary allocation).
Then for each attribute
> >name: I have to make two non-final method calls (nextElement and
> >hasMoreElements); I then have a cast (which must be checked) from Object
> >to String; I then have to look the attribute up using getValue. Compare
> >this to the simple interface I suggested:
> >
> >void startElement(String elementName, String[] attributeNames, String[]
> >attributeValues, int nAttributes)
>
> I agree that the AttributeMap is too complicated. On the other hand, your
> alternate proposal seems questionable. Passing three parameters to the event
> handler may be simple, but this eliminates any abstraction, which makes it
> hard to extend the interface cleanly.

If SAX is supposed to be abstract and extensible, then it needs a substantial rework. Something like this would be much more extensible:

interface DocumentHandler {
  void startElement(StartElementEvent event);
  void endElement(EndElementEvent event);
  void characters(CharactersEvent event);
  //...
}

Simplicity was the main design goal of SAX.

Why do we get abstraction and extensibility for attributes but for nothing else?

> Also, this makes iteration easy but
> finding attributes by name very hard.
>
> An AttributeMap interface should be used, but:
>
> 1) It should provide a standard iterator interface (this is the only
> reasonable way to iterate over a map).

This has all the inefficiencies that I listed for Enumeration. Requiring an object to be allocated on each start-tag is really not a good idea (it makes a measurable difference to performance in Java).

Something like this:

interface AttributeList {
  int length(); // or maybe size
  String getValue(int i); // or maybe valueAt
  String getName(int i); // or maybe nameAt
  String get(String name);
}

would be significantly more efficient.

At the very least provide an isEmpty() so that I don't have to do the allocation in the common case there are no attributes.

James
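
One way the AttributeList idea above could be realized, as a hedged Java sketch rather than anything from the SAX draft. The interface uses the method names suggested in the message; the backing class `ArrayAttributeList`, its `clear` and `add` methods, and the `describe` helper are hypothetical. The point it illustrates is that a parser can refill one array-backed object for every start-tag, so iterating needs no Enumeration, no cast and no per-element allocation.

```java
import java.util.Arrays;

class AttributeListSketch {

    // Method names follow the interface suggested in the message above.
    interface AttributeList {
        int length();
        String getName(int i);
        String getValue(int i);
        String get(String name);   // null when there is no such attribute
    }

    // Hypothetical implementation: parallel arrays that a parser refills for
    // each start-tag instead of allocating a new object.
    static final class ArrayAttributeList implements AttributeList {
        private String[] names = new String[8];
        private String[] values = new String[8];
        private int count;

        void clear() { count = 0; }

        void add(String name, String value) {
            if (count == names.length) {           // grows rarely, then is reused
                names = Arrays.copyOf(names, count * 2);
                values = Arrays.copyOf(values, count * 2);
            }
            names[count] = name;
            values[count] = value;
            count++;
        }

        public int length() { return count; }
        public String getName(int i) { return names[check(i)]; }
        public String getValue(int i) { return values[check(i)]; }

        public String get(String name) {
            for (int i = 0; i < count; i++) {      // linear scan over a short list
                if (names[i].equals(name)) return values[i];
            }
            return null;
        }

        private int check(int i) {
            if (i < 0 || i >= count) throw new IndexOutOfBoundsException("index " + i);
            return i;
        }
    }

    // A handler walks the attributes by index.
    static String describe(String element, AttributeList atts) {
        StringBuilder sb = new StringBuilder(element);
        for (int i = 0; i < atts.length(); i++) {
            sb.append(' ').append(atts.getName(i)).append('=').append(atts.getValue(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ArrayAttributeList atts = new ArrayAttributeList();
        atts.add("id", "a1");
        atts.add("lang", "en");
        System.out.println(describe("para", atts));
        atts.clear();                              // the next start-tag reuses the object
        System.out.println(describe("br", atts));
    }
}
```

The trade-off in this sketch is that the list is only valid for the duration of the callback; a handler that wants to keep the attributes has to copy them.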
From cbullard at hiwaay.net Thu Jan 15 03:19:59 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:59:54 2004
Subject: VRML and XML
References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> <34BD4E7A.1230@hiwaay.net> <34BD582C.E5873BC4@isogen.com>
Message-ID: <34BD7FFE.27CE@hiwaay.net>

W. Eliot Kimber wrote:
> I think that could be pretty useful, especially once VRML browsers let
> you associate presentation styles with element types. After all,
> rendering 3-D objects is not fundamentally different in the abstract
> from rendering 2-D objects, it's still a matter of apply presentation
> style. So why shouldn't XSL be just as useful for VRML worlds as for
> 2-D documents?

While it is possible to write a declarative architecture for the static nodes (eg VRML 1.0), it is a very different beast from a real time simulation language. As to associating XSL, how useful is it to associate a style language with a presentation language that already includes all of the presentation information required?

VRML 2.0 is a real time simulation language. The event model is very interesting. It includes events, exposed fields, addChildren, etc. as part of the language. The test is to see if XML, DOM, DHTML adds anything useful.

If others from the XML community are interested in this project, it will be conducted on an open list.

Here is a simple VRML 2.0 file so you can discover what it looks like.
Be aware that a VRML instance has a lot of assumed parts for information that the browser must explicitly support but which are not in the instance itself (eg, addChildren). VRML 2.0 was designed for a very efficient syntax. len #VRML V2.0 utf8 WorldInfo { title "Opening Animation" info [ ] } DEF FirstNav NavigationInfo { type ["NONE" ] headlight FALSE } Background{ skyColor 0 0 0 frontUrl["../galaxy/stars.jpg"] backUrl ["../galaxy/stars.jpg"] rightUrl ["../galaxy/stars.jpg"] leftUrl ["../galaxy/stars.jpg"] topUrl ["../galaxy/stars.jpg"] bottomUrl ["../galaxy/stars.jpg"] } DEF ZoomView Transform { translation 0 0 500 rotation 0 0 1 0 children [ DEF StartView Viewpoint{ position 0 0 0 orientation 0 0 1 0 fieldOfView 0.7 description "Travelling" jump TRUE } ] } DEF Title Transform { translation 0 0 130 #0 0 140 scale 1 1 1 children [ # DEF TitleTouch TouchSensor {} Shape{ appearance Appearance{ material Material{ } texture ImageTexture{ url [ "textures/title_1.gif"] repeatS FALSE repeatT FALSE } textureTransform TextureTransform { translation 0.01 0.01 scale 0.97 0.97 } } geometry IndexedFaceSet{ coord Coordinate{ point[ -250 -118 -40, 0 -118 -40, 0 125 -40, -250 125 -40] } #end coord coordIndex[ 0, 1, 2, 3] texCoord TextureCoordinate { point [ -0.01 -0.01, 1.01 -0.01, 1.01 1.01, -0.01 1.01 ] } texCoordIndex [ 0, 1, 2, 3] } #end geometry }, #end shape Shape{ appearance Appearance{ material Material { } texture ImageTexture{ url [ "textures/title_2.gif"] repeatS FALSE repeatT FALSE } textureTransform TextureTransform { translation 0.01 0.01 scale 0.98 0.98 } } geometry IndexedFaceSet{ coord Coordinate{ point [ 0 -118 -40, 250 -118 -40, 250 125 -40, 0 125 -40] } #end coord coordIndex[ 0, 1, 2, 3] texCoord TextureCoordinate { point [ -0.01 -0.01, 1.01 -0.01, 1.01 1.01, -0.01 1.01 ] point [ 0 0, 1 0, 1 1, 0 1 ] } texCoordIndex [ 0, 1, 2, 3] } #end geometry } #end shape ] #end Children } #end Transform DEF Title_PI PositionInterpolator { key [ 0, 0.25, 0.5, 0.75, 1 ] 
keyValue [ 0 0 130, 0 0 180, 0 0 300, 0 0 100000, 0 0 200000 ] } DEF Corona Transform{ children Billboard { axisOfRotation 0 0 0 children [ DEF FirstLight Transform{ children [ PointLight{ ambientIntensity 1 intensity 0.9 radius 220 } ] rotation 0 0 -1 0 translation 0 0 200 }, DEF AnotherLight Transform{ children [ PointLight{ ambientIntensity 1 intensity 0.4 radius 710 } ] rotation 0 0 -1 0 translation 0 0 700 }, DEF DirectSun Group { children [ DirectionalLight { direction 0 0 1 intensity 0.9 } ] }, Shape{ appearance Appearance{ material Material{ } texture ImageTexture{ url [ "textures/corona_l.gif"] repeatS FALSE repeatT FALSE } } geometry IndexedFaceSet{ creaseAngle 1 coord Coordinate{ point[ -249 -118 -40, 3 -118 -40, 3 128 -40, -249 128 -40] } #end coord coordIndex[ 0, 1, 2, 3] texCoord TextureCoordinate { point [ -0.01 -0.01, 1.01 -0.01, 1.01 1.01, -0.01 1.01 ] } texCoordIndex [ 0, 1, 2, 3] } #end geometry }, #end shape Shape{ appearance Appearance{ material Material{ } texture ImageTexture{ url [ "textures/corona_r.gif"] repeatS FALSE repeatT FALSE } } geometry IndexedFaceSet{ creaseAngle 1 coord Coordinate{ point [ -3 -118 -40, 250 -118 -40, 250 127 -40, -3 127 -40] } #end coord coordIndex[ 0, 1, 2, 3] texCoord TextureCoordinate { point [ -0.01 -0.01, 1.01 -0.01, 1.01 1.01, -0.01 1.01 ] point [ 0 0, 1 0, 1 1, 0 1 ] } texCoordIndex [ 0, 1, 2, 3] } #end geometry } #end shape ] # end billboard children } # end billboard } # end Corona transform DEF TheSpheres Transform { children [ DEF Core Transform { children [ Shape { appearance Appearance{ material Material { diffuseColor 1 1 1 emissiveColor 1 1 1 } } geometry Sphere{ radius 65 } } ] }, DEF Inner Transform{ children [ Shape{ appearance Appearance{ material Material{ diffuseColor 1 1 1 emissiveColor 0.5 0.4 0.2 #0.7 0.6 0.4 ambientIntensity 1 transparency 0.3 shininess 1 } texture ImageTexture{ url [ "textures/sol_1.gif"] } } geometry Sphere{ radius 69.5 } } ] rotation 0 1 0 0 }, DEF Outer Transform{ 
children [ Shape{ appearance Appearance{ material DEF Shell Material{ diffuseColor 1 1 1 emissiveColor 0.7 0.6 0.4 #0.9 0.8 0.6 ambientIntensity 1 shininess 0 transparency 0.75 } texture ImageTexture{ url [ "textures/sol_2.gif"] } } geometry Sphere{ radius 80 } } ] rotation 0 1 0 0 }, DEF Flares Transform{ children [ Shape{ appearance Appearance{ material DEF FlaresT Material{ diffuseColor 1 1 1 emissiveColor 0.5 0.4 0.25 #0.7 0.6 0.5 ambientIntensity 1 shininess 0.2 transparency 0 } texture ImageTexture{ url [ "textures/sol_3.gif"] } } geometry DEF FlareSphere Sphere{ radius 75 } } ] rotation 0 1 0 0 } ] # end TheSpheres children rotation 0 1 0 0 } DEF Inner_TmS TimeSensor{ cycleInterval 20 enabled TRUE loop TRUE } DEF Inner_OI OrientationInterpolator{ key [ 0, 0.333333, 0.666667, 1] keyValue [ 0 1 0 0, 0 1 0 2.0944, 0 1 0 4.18879, 0 1 0 6.28319] } ROUTE Inner_TmS.fraction_changed TO Inner_OI.set_fraction ROUTE Inner_OI.value_changed TO Inner.set_rotation DEF Flares_TmS TimeSensor { cycleInterval 42 enabled TRUE loop TRUE } DEF FlareExp TimeSensor { cycleInterval 3 enabled TRUE loop TRUE } DEF Flares_OI OrientationInterpolator{ key [ 0, 0.333333, 0.666667, 1] keyValue [ 0 1 0 0, 0 1 0 2.0944, 0 1 0 4.18879, 0 1 0 6.28319] } #DEF Flares_SI ScalarInterpolator { # key [ 0 .5 1 ] # keyValue [ 65 75 65 ] #0 0.7 0 ] # } DEF FlareScale PositionInterpolator { key [ 0 .5 1 ] keyValue [ 0.9 0.8 0.9 0.85 0.8 0.85 0.9 0.8 0.9 ] } ROUTE Flares_TmS.fraction_changed TO Flares_OI.set_fraction ROUTE Flares_OI.value_changed TO Flares.set_rotation #ROUTE FlareExp.fraction_changed TO Flares_SI.set_fraction #ROUTE Flares_SI.value_changed TO FlaresT.transparency ROUTE FlareExp.fraction_changed TO FlareScale.set_fraction ROUTE FlareScale.value_changed TO Flares.set_scale DEF Crash Transform{ rotation 0 1 0 -2.3562 translation 0 0 100000 children [ DEF CrashView Viewpoint{ position 0 400 1000 orientation -1 0 0 0.38051 fieldOfView 0.6 description "The Earth" jump TRUE } #DEF CrashNav 
NavigationInfo{ # type ["NONE" ] # avatarSize [0.025, 0.025, 0.025] # headlight FALSE #} DEF EandC Transform{ children [ DEF Earth Transform{ children [ Shape{ appearance Appearance{ material Material{ emissiveColor 0.2 0.2 0.2 diffuseColor 1 1 1 shininess 0.390625 transparency 0 } texture ImageTexture{ url [ "textures/earth.jpg"] } } geometry Sphere{ radius 63.78 } } ] scale 1.003 1 1.003 # 1/127 oblateness }, DEF Clouds Transform{ children [ Shape{ appearance Appearance{ material Material{ emissiveColor 0.2 0.2 0.2 diffuseColor 1 1 1 shininess 0.390625 } texture ImageTexture{ url [ "textures/clouds.gif"] } } geometry Sphere{ radius 63.98 # 12 km above sea level } } ] scale 1.04 1.04 1.04 # sorry, added separation needed for VRML browser z-buffer noise. }, DEF Haze Transform{ children [ Shape{ appearance Appearance{ material DEF HazeT Material{ # emissiveColor 0.35 0.2 0.2 diffuseColor 0.35 0.2 0.2 transparency 1 } } geometry Sphere{ radius 64.08 # 12 km above sea level } } ] scale 1.1 1.1 1.1 # sorry, added separation needed for VRML browser z-buffer noise. 
} ] # end EandC children rotation 0 0 1 -.40927971 # 23.45 degree inclination (23 degrees 27 minutes) scale 2.3 2.3 2.3 }, DEF PlanetLight Transform{ children [ PointLight{ ambientIntensity 0.6 radius 1500 } ] rotation 0 0 -1 0 translation -600 100 1000 }, DEF Asteroid Transform { translation 0 0 0 children [ Shape { appearance Appearance{ material Material{ ambientIntensity 0.05 emissiveColor 0.25 0.15 0.1 diffuseColor 0.8 0.6 0.4 specularColor 0.8 0.6 0.4 shininess 0.5 } texture ImageTexture { url "textures/asteroid.jpg" } } geometry Sphere { radius 10 } } #end Shape ] #end children of transform rotation 0 0 -1 0 scale 1 0.7 0.6 } DEF ExplodeTrans Transform{ # center 0 -20 -180 children Billboard { axisOfRotation 0 0 0 children [ DEF GoTouch TouchSensor{} DEF WhichImage Switch { whichChoice 0 choice [ Shape { geometry DEF largeFace IndexedFaceSet { coord Coordinate { point[ -100 -100 160, 100 -100 160, 100 100 160, -100 100 160] } #end coord coordIndex[ 0, 1, 2, 3] } #end geometry appearance Appearance { material Material{ diffuseColor 0 0 0 transparency 1 } texture ImageTexture { url "expl01.gif" repeatS FALSE repeatT FALSE } } #end appearance }, #end shape Shape { geometry USE largeFace appearance Appearance { material Material{ diffuseColor 0 0 0 transparency 1 } texture ImageTexture { url "expl03.gif" repeatS FALSE repeatT FALSE } } #end appearance }, #end shape Shape { geometry USE largeFace appearance Appearance { material Material{ diffuseColor 0 0 0 transparency 1 } texture ImageTexture { url "expl05.gif" repeatS FALSE repeatT FALSE } } #end appearance }, #end shape Shape { geometry USE largeFace appearance Appearance { material Material{ diffuseColor 0 0 0 transparency 1 } texture ImageTexture { url "expl09.gif" repeatS FALSE repeatT FALSE } } #end appearance }, #end shape Shape { geometry USE largeFace appearance Appearance { material Material{ diffuseColor 0 0 0 transparency 1 } texture ImageTexture { url "expl11.gif" repeatS FALSE repeatT FALSE } } 
#end appearance }, #end shape Shape { geometry USE largeFace appearance Appearance { material Material{ diffuseColor 0 0 0 transparency 1 } texture ImageTexture { url "expl13.gif" repeatS FALSE repeatT FALSE } } #end appearance }, #end shape Shape { geometry USE largeFace appearance Appearance { material Material{ diffuseColor 0 0 0 transparency 1 } texture ImageTexture { url "expl15.gif" repeatS FALSE repeatT FALSE } } #end appearance }, #end shape Shape { geometry USE largeFace appearance Appearance { material Material{ diffuseColor 0 0 0 transparency 1 } texture ImageTexture { url "expl17.gif" repeatS FALSE repeatT FALSE } } #end appearance }, #end shape Shape { geometry USE largeFace appearance Appearance { material Material{ diffuseColor 0 0 0 transparency 1 } texture ImageTexture { url "expl01.gif" repeatS FALSE repeatT FALSE } } #end appearance } #end shape ] #end choice }, #end Switch ] #end children of Billboard } #end Billboard # translation 0 20 180 } #end ExplodeTrans transform ] #end Crash children } #end Crash Transform DEF BillTime TimeSensor{ loop FALSE enabled FALSE cycleInterval 2 #1.8 startTime 0 stopTime 1 } DEF BillAnimate ScalarInterpolator { key [ 0, 0.5, 1 ] keyValue [ 0, 4, 8 ] } DEF ChangeImage Script { eventIn SFFloat newValue eventOut SFInt32 newImageCount url "javascript: function newValue(valueIn){ newImageCount = Math.round(valueIn); }" } DEF CheckTrigger Script { eventIn SFFloat fromTimer eventIn SFRotation earthRotation field SFBool triggerSet FALSE field SFBool doneOnce FALSE eventOut SFBool trigger eventOut SFBool trigger2 eventOut SFTime startIt # eventOut SFInt32 imageNumber url "javascript: function fromTimer(fractionIn){ if ( doneOnce ) return; else if (fractionIn >= 0.85) { triggerSet = TRUE; doneOnce = TRUE; } } function earthRotation(newRotation, timestamp){ if ( triggerSet == FALSE ) return; else if ( newRotation[3] > 0.4354 ) { if ( newRotation[3] < 1.1355 ) { print ('Rotation: ' + newRotation[3]); trigger = TRUE; 
trigger2 = TRUE; startIt = timestamp; triggerSet = FALSE; } } }" } ROUTE BillTime.fraction_changed TO BillAnimate.set_fraction ROUTE BillAnimate.value_changed TO ChangeImage.newValue ROUTE ChangeImage.newImageCount TO WhichImage.set_whichChoice #/S/SpinEngine{ object Earth enabled TRUE time 23.9 axis 0.000000 1.000000 0.000000 degrees -360.000000 oscillate FALSE repeat TRUE} #/SI/Start of SpinEngine DEF Earth_TmS TimeSensor{ cycleInterval 23.9 enabled TRUE loop TRUE } DEF Earth_OI OrientationInterpolator{ key [ 0, 0.25 0.5, 0.75, 1] keyValue [ 0 1 0 1.570796327, 0 1 0 3.141592654, 0 1 0 4.71238898, 0 1 0 0, 0 1 0 1.570796327] } ROUTE Earth_TmS.fraction_changed TO Earth_OI.set_fraction ROUTE Earth_OI.value_changed TO Earth.set_rotation #/SX/End of SpinEngine #/S/SpinEngine{ object Clouds enabled TRUE time 14.000000 axis 0.000000 1.000000 0.000000 degrees -360.000000 oscillate FALSE repeat TRUE} #/SI/Start of SpinEngine DEF Clouds_TmS TimeSensor{ cycleInterval 14 enabled TRUE loop TRUE } DEF Clouds_OI OrientationInterpolator{ key [ 0, 0.333333, 0.666667, 1] keyValue [ 0 1 0 0, 0 1 0 2.0944, 0 1 0 4.18879, 0 1 0 6.28319] } ROUTE Clouds_TmS.fraction_changed TO Clouds_OI.set_fraction ROUTE Clouds_OI.value_changed TO Clouds.set_rotation #/SX/End of SpinEngine DEF Falling_TmS TimeSensor { cycleInterval 22 loop FALSE enabled TRUE startTime 0 stopTime 0 } DEF Falling_PI PositionInterpolator { key [ 0, 0.5, 1 ] keyValue [ -30 420 1000, -15 190 500, 0 -40 0 ] } #/S/SpinEngine{ object Asteroid enabled TRUE time 12.000000 axis 1.000000 0.000000 0.000000 degrees -360.000000 oscillate FALSE repeat TRUE} #/SI/Start of SpinEngine DEF Asteroid_TmS TimeSensor{ cycleInterval 7 enabled TRUE loop TRUE } DEF Asteroid_OI OrientationInterpolator{ key [ 0, 0.333333, 0.666667, 1] keyValue [ 1 0 1 0, 1 0 1 2.0944, 1 0 1 4.18879, 1 0 1 6.28319] } ROUTE Asteroid_TmS.fraction_changed TO Asteroid_OI.set_fraction ROUTE Asteroid_OI.value_changed TO Asteroid.set_rotation #/SX/End of SpinEngine ROUTE 
Falling_TmS.fraction_changed TO Falling_PI.set_fraction ROUTE Falling_PI.value_changed TO Asteroid.set_translation DEF TitleTimer TimeSensor { loop FALSE enabled TRUE cycleInterval 8 startTime 0 stopTime 0 } DEF HazeTime TimeSensor{ cycleInterval 12 enabled FALSE loop FALSE startTime 0 stopTime 0 } DEF HazeDensity Script { eventIn SFFloat fromTimer eventOut SFFloat valueChanged url [ "javascript: function fromTimer(fractionIn){ valueChanged = (1 - fractionIn); }" ] } DEF SetStop Script { eventIn SFTime current eventOut SFTime StopIt url [ "javascript: function current(value){ StopIt = (value + 8); }" ] } DEF SetStart Script { eventIn SFTime current eventOut SFBool startIt eventOut SFTime theTime eventOut SFTime addedTime eventOut SFTime startAudio eventOut SFTime endAudio eventOut SFTime finalOrbit url [ "javascript: function current(value){ startIt = TRUE; theTime = (value + 6); addedTime = (value + 22); startAudio = (value + 2); endAudio = (value + 114); finalOrbit = (value + 62); //was + 58 }" ] } DEF StartFalling Script { eventIn SFBool set_theTime eventOut SFTime theTime_changed url [ "javascript: function set_theTime( value, timestamp ){ if ( value == TRUE ) { theTime_changed = (timestamp + 1); } }" ] } DEF Zoom_TmS TimeSensor { enabled FALSE loop FALSE startTime 0 stopTime 0 cycleInterval 16 } DEF Zoom_PI PositionInterpolator { key [ 0, 0.25, 0.5, 0.75, 1 ] keyValue [ 0 0 1000, 0 0 6000, 0 0 10000, -35 200 20000, -707.10678 400 99292.89322 ] } DEF Zoom_OI OrientationInterpolator { key [ 0, 0.5, 0.65, 0.75, 1 ] keyValue [ 0 1 0 0 -0.078102 -0.97895 -0.18856 0.4 -0.078102 -0.97895 -0.18856 1.1855 -0.078102 -0.97895 -0.18856 2.3711 -0.078102 -0.97895 -0.18856 2.3711 ] } DEF TriggerCrash Script { eventIn SFFloat fromTimer field SFBool doneOnce FALSE eventOut SFBool animationEnable eventOut SFBool viewpointDisable url [ "javascript: function fromTimer(fractionIn){ if (fractionIn >= 0.95) { if (doneOnce == TRUE) return; else { doneOnce = TRUE; animationEnable = 
TRUE; viewpointDisable = FALSE; } } else {trigger = FALSE;} }" ] } DEF ErasePrompt Script { eventIn SFTime timeIn field MFString name "../blanktxt.html" field MFString param "target=HTMLTEXT" eventOut SFTime startNow url [ "vrmlscript: function timeIn( value, timestamp ){ if ( value > 0 ) { startNow = timestamp; Browser.loadURL ( name, param ); } }" ] } DEF TitleOff Script { eventIn SFFloat fromTimer eventOut SFTime timeIt url [ "javascript: function fromTimer(fractionIn, timestamp){ if (fractionIn >= 0.3) { timeIt = (timestamp); } }" ] } DEF SwitchItOn Script { eventIn SFBool trigger eventOut SFBool animationEnable eventOut SFBool viewpointDisable url [ "javascript: function trigger(value){ if (value == TRUE) { animationEnable = TRUE; viewpointDisable = FALSE; } }" ] } DEF aSound Sound { source DEF TestSound AudioClip { url "scen01a.wav" startTime -1 stopTime 0 loop FALSE } minBack 100000 minFront 100000 maxBack 100000 maxFront 100000 spatialize FALSE intensity 1 location 0 0 70000 } DEF doneHere Script { eventIn SFBool trigger field SFBool doneOnce FALSE field MFString name2 "../l_muse1.html" field MFString param2 "target=HTMLTEXT" # field MFString name3 "../museum.html" # field MFString param3 "target=VRML" url [ "javascript: function trigger( value, timestamp ) { if (value == FALSE) { Browser.loadURL ( name2, param2 ); // Browser.loadURL ( name3, param3 ); } }" ] } # if ( doneOnce == FALSE ) { # doneOnce = TRUE; # return; # } # else { DEF FinalOrbit_TS TimeSensor { cycleInterval 44 loop TRUE enabled FALSE startTime -1 stopTime 0 } DEF FinalOrbit_PI PositionInterpolator { key [ 0 0.25 0.5 0.75 1 ] keyValue [ 0 400 1000 -1000 0 0 0 -400 -1000 1000 0 0 0 400 1000 ] } DEF FinalOrbit_OI OrientationInterpolator { key [ 0 0.25 0.5 0.75 1 ] keyValue [ 1 0 0 -0.38051 0 1 0 -1.5708 # 0 1 0 -3.14159 0 0.98196 -0.18911 3.14159 0 1 0 -4.71239 1 0 0 -0.38051 ] } ROUTE SetStart.startAudio TO TestSound.startTime ROUTE SetStart.endAudio TO TestSound.stopTime ROUTE 
SetStart.finalOrbit TO FinalOrbit_TS.startTime ROUTE TestSound.isActive TO doneHere.trigger ROUTE SetStart.addedTime TO Zoom_TmS.stopTime ROUTE SetStart.theTime TO Zoom_TmS.startTime ROUTE SetStart.startIt TO Zoom_TmS.enabled ROUTE Zoom_TmS.fraction_changed TO Zoom_PI.set_fraction ROUTE Zoom_PI.value_changed TO ZoomView.set_translation ROUTE Zoom_TmS.fraction_changed TO Zoom_OI.set_fraction ROUTE Zoom_OI.value_changed TO ZoomView.set_rotation ROUTE Zoom_TmS.fraction_changed TO TriggerCrash.fromTimer #ROUTE TriggerCrash.trigger TO SwitchItOn.trigger #ROUTE SwitchItOn.animationEnable TO CrashView.set_bind #ROUTE SwitchItOn.animationEnable TO StartFalling.set_theTime #ROUTE SwitchItOn.viewpointDisable TO Zoom_TmS.enabled ROUTE TriggerCrash.animationEnable TO CrashView.set_bind ROUTE TriggerCrash.animationEnable TO StartFalling.set_theTime ROUTE TriggerCrash.viewpointDisable TO Zoom_TmS.enabled ROUTE Falling_TmS.fraction_changed TO CheckTrigger.fromTimer ROUTE Earth_OI.value_changed TO CheckTrigger.earthRotation ROUTE CheckTrigger.trigger TO BillTime.enabled ROUTE CheckTrigger.trigger2 TO HazeTime.enabled ROUTE CheckTrigger.trigger2 TO FinalOrbit_TS.enabled #ROUTE CheckTrigger.imageNumber TO WhichImage.set_whichChoice #ROUTE Falling_TmS.time TO BillTime.startTime ROUTE CheckTrigger.startIt TO BillTime.startTime ROUTE StartFalling.theTime_changed TO Falling_TmS.startTime ROUTE HazeTime.fraction_changed TO HazeDensity.fromTimer ROUTE HazeDensity.valueChanged TO HazeT.transparency #ROUTE Falling_TmS.time TO HazeTime.startTime #ROUTE Falling_TmS.time TO SetStop.current ROUTE CheckTrigger.startIt TO HazeTime.startTime ROUTE CheckTrigger.startIt TO SetStop.current ROUTE SetStop.StopIt TO HazeTime.stopTime ROUTE FinalOrbit_TS.fraction_changed TO FinalOrbit_PI.set_fraction ROUTE FinalOrbit_TS.fraction_changed TO FinalOrbit_OI.set_fraction ROUTE FinalOrbit_PI.value_changed TO CrashView.position ROUTE FinalOrbit_OI.value_changed TO CrashView.orientation ROUTE 
TitleTimer.fraction_changed TO Title_PI.set_fraction
ROUTE Title_PI.value_changed TO Title.set_translation
ROUTE TitleTimer.fraction_changed TO TitleOff.fromTimer
ROUTE TitleOff.timeIt TO SetStart.current
#ROUTE TitleTouch.touchTime TO TitleTimer.startTime
#ROUTE TitleTouch.touchTime TO ErasePrompt.timeIn
#DEF removeWait Script {
# eventIn SFTime triggerTime
# field MFString textURL "../blanktxt.html"
# field MFString textFrame "target=HTMLTEXT"
# url ["javascript:
# function triggerTime( value, timestamp ) {
# if ( value > 0 ) {
# Browser.loadURL( textURL, textFrame );
# }
# }"
# ]
#}
ROUTE ErasePrompt.startNow TO TitleTimer.startTime
ROUTE TestSound.duration_changed TO ErasePrompt.timeIn

From eliot at isogen.com Thu Jan 15 04:39:58 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 16:59:54 2004
Subject: VRML and XML
References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> <34BD4E7A.1230@hiwaay.net> <34BD582C.E5873BC4@isogen.com> <34BD7FFE.27CE@hiwaay.net>
Message-ID: <34BD927E.AF02B3E0@isogen.com>

len bullard wrote:
>
> While it is possible to write a declarative architecture
> for the static nodes (eg VRML 1.0), it is a very
> different beast from a real time simulation
> language. As to associating XSL, how useful is
> it to associate a style language with a presentation
> language that already includes all of the presentation
> information required.
>
> VRML 2.0 is a real time simulation language. The
> event model is very interesting.
It includes
> events, exposed fields, addChildren, etc. as
> part of the language. The test is to see if
> XML, DOM, DHTML adds anything useful.

Len,

You are confusing the specification of the data with the execution of the specification. It's like saying there'd be some difficulty in representing a programming language using XML syntax--there's not.

Using a style language is useful for the same reason it is in SGML: I want to apply different styles to the same basic data objects. Asking why it's useful here is like asking why you need styles if you have a font tag, and I know we both know the answer to that one.

The fact that the presented result happens, in some presentation styles, to be interactive is completely irrelevant to the issue of representing the data using XML.

A better question might be: does XLL (or HyTime event schedules or some combination thereof) provide anything useful in representing the relationships among the nodes, which is all an event model does (define relationships or behavior associated with relationships)?

XML only operates at the document representation syntax level, so it can have nothing to say about the semantics of the data represented. On the other hand, XLL and HyTime (and DSSSL and XSL) operate at the semantic level and therefore may have lots to say about the semantics of the data represented.

Cheers,

E.
From peter at ursus.demon.co.uk Thu Jan 15 08:39:44 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:54 2004
Subject: LISTRIVIA: Unnecessary traffic
In-Reply-To: <34BD7FFE.27CE@hiwaay.net>
References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> <34BD4E7A.1230@hiwaay.net> <34BD582C.E5873BC4@isogen.com>
Message-ID: <3.0.1.16.19980115082657.089f786c@pop3.demon.co.uk>

At 21:18 14/01/98 -0600, [a respected member of the *ML community] wrote:

A huge VRML file - mainly whitespace - which was nothing to do with XML and for which I and others have to pay out of our own pockets. Also the subject was introduced without change of subject line. :-)

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From matthewg at poet.de Thu Jan 15 09:40:23 1998
From: matthewg at poet.de (Matthew Gertner)
Date: Mon Jun 7 16:59:54 2004
Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft)
Message-ID: <01bd2199$42d685b0$a00b0ac0@pharcyde.poetsoftware.xo.com>

James Clark wrote:

>If SAX is supposed to be abstract and extensible, then it needs a
>substantial rework. Something like this would be much more extensible:
>
>interface DocumentHandler {
>  void startElement(StartElementEvent event);
>  void endElement(EndElementEvent event);
>  void characters(CharactersEvent event);
>  //...
>}
>
>Simplicity was the main design goal of SAX.
>
>Why do we get abstraction and extensibility for attributes but for
>nothing else?

Abstraction and extensibility are not absolutes. Although simplicity was the main design goal, the need for good abstractions was clearly an ever-present consideration. You yourself argued (quite rightly) for a separate EntityManager interface, and the continuing discussion led to the definition of several other separate interfaces. This is certainly a sacrifice of simplicity for extensibility and very much correct, IMHO. In the case of the AttributeMap, the lack of an elegant way to find an attribute by name is pretty killer, even without considering the implications for extensibility.

>> Also, this makes iteration easy but
>> finding attributes by name very hard.
>>
>> An AttributeMap interface should be used, but:
>>
>> 1) It should provide a standard iterator interface (this is the only
>> reasonable way to iterate over a map).
>
>This has all the inefficiencies that I listed for Enumeration.
>Requiring an object to be allocated on each start-tag is really not a
>good idea (it makes a measurable difference to performance in Java).
>
>Something like this:
>
>interface AttributeList {
>  int length(); // or maybe size
>  String getValue(int i); // or maybe valueAt
>  String getName(int i); // or maybe nameAt
>  String get(String name);
>}
>
>would be significantly more efficient.
>
>At the very least provide an isEmpty() so that I don't have to do the
>allocation in the common case there are no attributes.

I didn't understand this. Why is an AttributeList interface inherently more efficient than AttributeMap? The use of an AttributeMap interface doesn't imply the creation of an object per start tag, any more than AttributeList does. Are you assuming an underlying hashtable implementation (or whatever)? This doesn't have to be the case; you could implement a map interface on top of a list, which would be just as efficient as your "String get(String name)". It seems to me that the metrics of the document and details of the usage case (average number of attributes per tag, need to iterate attributes, need to access attributes by name, etc.) would determine which underlying implementation would be more efficient in which case.

I also don't see the need for "isEmpty()". Why not just instantiate a single "empty map" object in the parser and send it whenever the attribute list is empty?

Matthew
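
The two suggestions above (a map interface layered over a list, and one shared "empty map" object) can be sketched in Java as follows. This is an illustration under assumptions, not the SAX draft's AttributeMap: the cut-down `AttributeMap` interface, the `ListBackedMap` class, its `EMPTY` field and the `of` factory are all hypothetical names.

```java
class AttributeMapSketch {

    // Hypothetical cut-down map interface; the draft's AttributeMap is larger.
    interface AttributeMap {
        int size();
        String getValue(String name);   // null when the attribute is absent
    }

    // A map interface layered over a plain list of name/value pairs.
    static final class ListBackedMap implements AttributeMap {

        // One shared instance, handed out for every element without attributes.
        static final ListBackedMap EMPTY =
                new ListBackedMap(new String[0], new String[0]);

        private final String[] names;
        private final String[] values;

        private ListBackedMap(String[] names, String[] values) {
            this.names = names;
            this.values = values;
        }

        // Hypothetical factory a parser might call on each start-tag.
        static AttributeMap of(String[] names, String[] values) {
            if (names.length != values.length) {
                throw new IllegalArgumentException("names and values differ in length");
            }
            return names.length == 0 ? EMPTY : new ListBackedMap(names, values);
        }

        public int size() { return names.length; }

        public String getValue(String name) {
            for (int i = 0; i < names.length; i++) {   // linear scan, no hashtable
                if (names[i].equals(name)) return values[i];
            }
            return null;
        }
    }

    public static void main(String[] args) {
        AttributeMap none = ListBackedMap.of(new String[0], new String[0]);
        AttributeMap some = ListBackedMap.of(new String[] {"href"}, new String[] {"a.xml"});
        System.out.println(none == ListBackedMap.EMPTY);
        System.out.println(some.getValue("href"));
    }
}
```

This sketch still creates one small object for each element that has attributes; a parser that wanted to avoid even that could reuse a single mutable instance, at the cost of the map being valid only during the callback.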
From jft at Psychology.Nottingham.AC.UK Thu Jan 15 11:35:59 1998 From: jft at Psychology.Nottingham.AC.UK (Jeni Tennison) Date: Mon Jun 7 16:59:54 2004 Subject: Documentation of DTDs In-Reply-To: <34B041AA.8D805120@n-space.com.au> References: <199801031821.NAA00477@unready.microstar.com> <34B02D5D.10E91A98@n-space.com.au> <199801050122.UAA00342@unready.microstar.com> Message-ID: During the discussion on including comments in SAX, Antony Blakey wrote: >David Megginson wrote: >> In the second case, I think that it would be a very bad idea to >> implement a JavaDoc-type facility using XML comments. JavaDoc has to >> use comments because it is not possible to extend Java syntax; XML >> allows you to define your own grammar, so the documentation can be >> part of the fundamental element structure. For example, instead of >> >> >> >> http://home.sprynet.com/sprynet/dmeggins/ >> dmeggins@microstar.com >> >> >> you should use >> >> >> Record for David Megginson >> http://home.sprynet.com/sprynet/dmeggins/ >> dmeggins@microstar.com >> > >I agree, but your example implies that my comments were about the data, >rather than about the structure itself - I guess I should have pointed >out that I'm interested in comments in the DTD, so that the DTD can be >documented automatically. This is more like javadoc/idldoc. I'd love an >xmldoc tool. I'm guessing now that SAX doesn't give me DTD events. I was thinking about this last night.
If there is going to be a means to have documentation within DTDs (and I think there should be), it would be a very good idea to decide on a standard format for that documentation, so that both authors of DTDs and XML application programmers can use it. I can see two good reasons for having documentation within a DTD. The first is for automatic generation of documentation (as XML documents, obviously) in a similar way to javadoc, as mentioned by Antony Blakey. The second is for automatic dialog or pop-up help generation in XML editors. The first need could be satisfied by authors of DTDs writing separate documentation for them: the second need could not. Note also that the second need means that the documentation should be well structured and available online in such a way that an application receiving a DTD can get its documentation too - this means that tools which do a one-off generation of documentation wouldn't cut it. javadoc [1] and dtd2html [2] utilise different methods of supplying documentation for their respective 'code'. javadoc has the programmer write documentation within the code, whereas dtd2html has the DTD author write a separate file containing the documentation. The problem with using the javadoc method is that it would add a lot of gumph to a DTD that the majority of applications (validators, viewers etc.) couldn't care less about. The problem with the dtd2html method is that the documentation isn't immediately *there* for someone editing the DTD. Of the two, I think the dtd2html method probably suits XML better (designed, as it is, for SGML, that isn't too surprising). (BTW, before anyone asks, the reason dtd2html isn't what I have in mind is because of the application-accessibility of the documentation as described above.) So, the solution I'm (tentatively) suggesting is that XML DTDs point to XML documents which contain documentation on the DTD. There are two parts to this, then: firstly, how does the DTD point to its documentation? 
Secondly, how is the documentation structured? Well, I *think* (and please forgive me if I'm wrong) the answer to the first part is to have a processing instruction within a DTD which points to the documentation. Something like: [Should it just contain a (relative) URL? Is there anything else it needs to contain? Should its format be: ?] The DTD Documentation Markup Language (hence .dtddml ;) document referenced would probably borrow heavily from the format of the documentation for dtd2html and also from DTDs of DTDs or groves or whatever they're called - Peter MR, you've done one, haven't you? I'm very willing and probably able to do such a DTD, but I thought I'd try to get people's opinions on this whole documentation business before doing so. So: - Is there a need for a standard on documentation for XML DTDs? - If so, is the separate-documentation method better than the documentation-in-DTD method? - If so, how should the documentation document be referenced from the DTD? Are there any ideas/suggestions/requirements for what the documentation should contain or what the documentation DTD should look like? Thanks for your comments in advance, Jeni [1] http://java.sun.com/products/jdk/1.1/docs/tooldocs/win32/javadoc.html [2] http://www.oac.uci.edu/indiv/ehood/perlSGML/doc/html/dtd2html.html Jenifer Tennison Department of Psychology, University of Nottingham University Park, Nottingham NG7 2RD, UK tel: +44 (0) 115 951 5151 x8352 fax: +44 (0) 115 951 5324 url: http://www.psychology.nottingham.ac.uk/staff/Jenifer.Tennison/
From ak117 at freenet.carleton.ca Thu Jan 15 11:49:32 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:54 2004 Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft) In-Reply-To: <34BD7378.AD58960F@jclark.com> References: <01bd20f3$4d749670$a00b0ac0@pharcyde.poetsoftware.xo.com> <34BD7378.AD58960F@jclark.com> Message-ID: <199801151148.GAA00314@unready.microstar.com> James Clark writes: > Why do we get abstraction and extensibility for attributes but for > nothing else? I'm not happy with the current SAX solution to attributes, and am paying close attention to all of the excellent suggestions on this list. AttributeMap will certainly change somehow in the next release in a few weeks. I think that it was Paul Prescod who pointed out that data entities and notations are basic features of XML in the current version, and that even a simple API should provide access to them. In XML, the only point of contact with notations or data entities is through attribute values, so for the first draft, I put that information into AttributeMap to keep it out of the way of the rest of the interface.
Another alternative is to keep AttributeMap simple and to add some query routines to org.xml.sax.Parser: public int getAttributeType (String elname, String aname) public String getEntityNotation (String ename) public String getEntityPublicID (String ename) public String getEntitySystemID (String ename) public String getNotationPublicID (String nname) public String getNotationSystemID (String nname) This is the way that I do it in AElfred, but there I provide a pointer to the parser as the first argument to every callback, so these queries are simpler to perform, and I also provide many other DTD-related queries. I've wanted to keep the Parser interface dead simple, since users need to learn it right away, and to put the complexity in AttributeMap, which users can learn later when they are more advanced: in XML (unlike full SGML), notations and data entities really are purely attribute-related information. The third alternative is, of course, to provide no access to this information at all. I would do this only if I had clear indication from the WG that data entities and notations are deprecated relics from full SGML that will likely be removed in XML 1.1; otherwise, I have to assume that, as in full SGML, they will be the standard method for including non-XML objects such as graphics, audio, and video, and that even a simple API needs to provide access to them. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/
From ak117 at freenet.carleton.ca Thu Jan 15 11:50:58 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:54 2004 Subject: VRML and XML In-Reply-To: <34BD7FFE.27CE@hiwaay.net> References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> <34BD4E7A.1230@hiwaay.net> <34BD582C.E5873BC4@isogen.com> <34BD7FFE.27CE@hiwaay.net> Message-ID: <199801151149.GAA00325@unready.microstar.com> len bullard writes: > W. Eliot Kimber wrote: > > > I think that could be pretty useful, especially once VRML browsers let > > you associate presentation styles with element types. After all, > > rendering 3-D objects is not fundamentally different in the abstract > > from rendering 2-D objects, it's still a matter of apply presentation > > style. So why shouldn't XSL be just as useful for VRML worlds as for > > 2-D documents? > > While it is possible to write a declarative architecture > for the static nodes (eg VRML 1.0), it is a very > different beast from a real time simulation > language. As to associating XSL, how useful is > it to associate a style language with a presentation > language that already includes all of the presentation > information required. On a different note, you could use XSL to auto-generate documentation for a VRML world marked up in XML. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/
From jjc at jclark.com Thu Jan 15 12:08:09 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:59:54 2004 Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft) References: <01bd2199$42d685b0$a00b0ac0@pharcyde.poetsoftware.xo.com> Message-ID: <34BDFA75.B423694F@jclark.com> Matthew Gertner wrote: > >At the very least provide an isEmpty() so that I don't have to do the > >allocation in the common case there are no attributes. > > I didn't understand this. Why is an AttributeList interface inherently more > efficient than AttributeMap? The use of an AttributeMap interface doesn't > imply the creation of an object per start tag, any more than AttributeList > does. Because to iterate over the AttributeMap an Enumeration object has to get allocated. AttributeList allows iteration over the attributes without having to do any allocation. James

From matthewg at poet.de Thu Jan 15 13:02:49 1998 From: matthewg at poet.de (Matthew Gertner) Date: Mon Jun 7 16:59:54 2004 Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft) Message-ID: <01bd21b5$07945240$a00b0ac0@pharcyde.poetsoftware.xo.com> >> >At the very least provide an isEmpty() so that I don't have to do the >> >allocation in the common case there are no attributes.
>> >> I didn't understand this. Why is an AttributeList interface inherently more >> efficient than AttributeMap? The use of an AttributeMap interface doesn't >> imply the creation of an object per start tag, any more than AttributeList >> does. > >Because to iterate over the AttributeMap an Enumeration object has to >get allocated. AttributeList allows iteration over the attributes >without having to do any allocation. > >James Okay, I thought you were talking about the instantiation of the map implementation itself. The fact of the matter is that, although there is some overhead to instantiating an iterator object, looping over a list of attributes and doing n string compares is not all that efficient either. It just depends on what your priority is: iterating or name lookup. The instantiation of the object is only necessary if you want to iterate (now I understand what you meant by "isEmpty"). It might not be too off-the-wall to claim that the lookup by name will be the more frequent usage case for SAX. I like the AttributeMap interface with a "getIterator()" method because it is simple, elegant and standard. If there is a general consensus that the performance issue is significant (to be honest I am not entirely convinced that this will be a bottleneck), your AttributeList interface is just fine, since it provides both types of access. Cheers, Matthew
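As a rough sketch of the trade-off discussed in this exchange, an array-backed list can offer both indexed iteration (no iterator object) and lookup by name (a linear scan with string compares). The four interface methods follow the names James Clark proposed; `ProposedAttributeList`, `ArrayAttributeList`, `add` and `clear` are illustrative assumptions, not part of any SAX draft.

```java
import java.util.Arrays;

// Method names as proposed in the thread; the interface name is hypothetical.
interface ProposedAttributeList {
    int length();
    String getName(int i);
    String getValue(int i);
    String get(String name);
}

// A parser could keep one instance and refill it for each start tag.
final class ArrayAttributeList implements ProposedAttributeList {
    private String[] names = new String[8];
    private String[] values = new String[8];
    private int count;

    void clear() {
        count = 0;
    }

    void add(String name, String value) {
        if (count == names.length) {           // grow both arrays together
            names = Arrays.copyOf(names, count * 2);
            values = Arrays.copyOf(values, count * 2);
        }
        names[count] = name;
        values[count] = value;
        count++;
    }

    public int length() {
        return count;
    }

    public String getName(int i) {
        checkIndex(i);
        return names[i];
    }

    public String getValue(int i) {
        checkIndex(i);
        return values[i];
    }

    // Lookup by name: n string compares in the worst case.
    public String get(String name) {
        for (int i = 0; i < count; i++) {
            if (names[i].equals(name)) {
                return values[i];
            }
        }
        return null;
    }

    private void checkIndex(int i) {
        if (i < 0 || i >= count) {
            throw new IndexOutOfBoundsException("attribute index " + i);
        }
    }
}

class ArrayAttributeListDemo {
    public static void main(String[] args) {
        ArrayAttributeList atts = new ArrayAttributeList();
        atts.add("id", "a1");
        atts.add("class", "note");
        for (int i = 0; i < atts.length(); i++) {   // iteration without an Enumeration
            System.out.println(atts.getName(i) + "=" + atts.getValue(i));
        }
    }
}
```

Which access path dominates in practice depends, as noted above, on how many attributes a typical tag has and how handlers use them.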
From tbray at textuality.com Thu Jan 15 13:51:39 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:54 2004 Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft) Message-ID: <3.0.32.19980115054853.00ae25d4@pop.intergate.bc.ca> At 06:48 AM 15/01/98 -0500, David Megginson wrote: >The third alternative is, of course, to provide no access to this >information at all. I would do this only if I had clear indication >from the WG that data entities and notations are deprecated relics >from full SGML that will likely be removed in XML 1.1; otherwise, I >have to assume that, as in full SGML, they will be the standard method >for including non-XML objects such as graphics, audio, and video, and >that even a simple API needs to provide access to them. It is very unlikely that they'll be deprecated. But it is also highly unclear that they'll actually be used very much in the type of lightweight app that I think we're building SAX for. My inclination would be to get elements, attributes, and text right first, then expand as need becomes clear. -Tim

From elm at arbortext.com Thu Jan 15 16:34:37 1998 From: elm at arbortext.com (Eve L.
Maler) Date: Mon Jun 7 16:59:54 2004 Subject: Documentation of DTDs In-Reply-To: <98Jan15.063728est.18817@thicket.arbortext.com> References: <34B041AA.8D805120@n-space.com.au> <199801031821.NAA00477@unready.microstar.com> <34B02D5D.10E91A98@n-space.com.au> <199801050122.UAA00342@unready.microstar.com> Message-ID: <3.0.5.32.19980115113447.009d3c60@village.doctools.com> At 06:33 AM 1/15/98 -0500, Jeni Tennison wrote: ... >I was thinking about this last night. If there is going to be a means to >have documentation within DTDs (and I think there should be), it would be a >very good idea to decide on a standard format for that documentation, so >that both authors of DTDs and XML application programmers can use it. > >I can see two good reasons for having documentation within a DTD. The >first is for automatic generation of documentation (as XML documents, >obviously) in a similar way to javadoc, as mentioned by Antony Blakey. The >second is for automatic dialog or pop-up help generation in XML editors. >The first need could be satisfied by authors of DTDs writing separate >documentation for them: the second need could not. Note also that the >second need means that the documentation should be well structured and >available online in such a way that an application receiving a DTD can get >its documentation too - this means that tools which do a one-off generation >of documentation wouldn't cut it. ... Documentation for DTDs isn't just a good idea, it's the law! :-) That is, at least in full SGML, a "DTD" is actually supposed to consist of both the formal part (the markup declarations) and the documentation that explains everything. I believe that the best way to do integrated DTD documentation is to -- surprise! -- write an XML document that contains (and whose structure reveals) information about both the formal and the informal parts of the language being defined. 
In other words, I would want an XML-based schema language that provides hooks for places to put descriptions. Several such DTD-for-DTDs-and-their-documentation have been written; the XML-Data proposal is the latest public one. It has some features for embedded documentation/description, but it could be taken even further. Eve

From mike at datachannel.com Thu Jan 15 17:27:29 1998 From: mike at datachannel.com (Mike Dierken) Date: Mon Jun 7 16:59:54 2004 Subject: VRML and XML Message-ID: <01BD2197.366048E0@NEMO> It may be more interesting to generate VRML from XML via XSL and pass it off to a 3D browser component. Some sources of information might be experienced in a 3D space just as well as, or better than, a text & graphic screen. For example: rows and columns of numbers, a network graph (nodes with many to many relationships), etc. You could also describe an XML 'home world' like an HTML home page, and provide 'alternate text' for 3D objects in the way that you can provide 'alternate text' for HTML's tag. This would allow different ways to experience the same space. Mike -----Original Message----- From: David Megginson [SMTP:ak117@freenet.carleton.ca] Sent: Thursday, January 15, 1998 3:50 AM To: xml-dev@ic.ac.uk Subject: VRML and XML len bullard writes: > W. Eliot Kimber wrote: > > > I think that could be pretty useful, especially once VRML browsers let > > you associate presentation styles with element types. After all, > > rendering 3-D objects is not fundamentally different in the abstract > > from rendering 2-D objects, it's still a matter of apply presentation > > style.
So why shouldn't XSL be just as useful for VRML worlds as for > > 2-D documents? > > While it is possible to write a declarative architecture > for the static nodes (eg VRML 1.0), it is a very > different beast from a real time simulation > language. As to associating XSL, how useful is > it to associate a style language with a presentation > language that already includes all of the presentation > information required. On a different note, you could use XSL to auto-generate documentation for a VRML world marked up in XML. All the best, David

From digitome at iol.ie Thu Jan 15 17:38:50 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:59:54 2004 Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft) Message-ID: <199801151738.RAA10793@mail.iol.ie> [Matthew Gertner] >Okay, I thought you were talking about the instantiation of the map >implementation itself. The fact of the matter is that, although there is >some overhead to instantiating an iterator object, looping over a list of >attributes and doing n string compares is not all that efficient either. Given that attribute ordering is never significant could the attributes be provided sorted by name so that by-name look up can be achieved with a binary chop? Sean Mc Grath sean at digitome dot com
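The "binary chop" suggestion could look something like the sketch below: attribute pairs are sorted by name once per start tag, after which a by-name lookup is a binary search over the sorted names. `SortedAttributes` and its methods are hypothetical names chosen for illustration; nothing here is taken from a SAX draft.

```java
import java.util.Arrays;
import java.util.Comparator;

// Attributes held in two parallel arrays, ordered by attribute name.
final class SortedAttributes {
    private final String[] names;
    private final String[] values;

    // Each element of pairs is {name, value}; input order does not matter.
    SortedAttributes(String[][] pairs) {
        String[][] sorted = pairs.clone();
        Arrays.sort(sorted, Comparator.comparing((String[] p) -> p[0]));
        names = new String[sorted.length];
        values = new String[sorted.length];
        for (int i = 0; i < sorted.length; i++) {
            names[i] = sorted[i][0];
            values[i] = sorted[i][1];
        }
    }

    int length() {
        return names.length;
    }

    String getName(int i) {
        return names[i];
    }

    String getValue(int i) {
        return values[i];
    }

    // O(log n) comparisons instead of a linear scan.
    String get(String name) {
        int i = Arrays.binarySearch(names, name);
        return i >= 0 ? values[i] : null;
    }
}

class SortedAttributesDemo {
    public static void main(String[] args) {
        SortedAttributes atts = new SortedAttributes(new String[][] {
            {"width", "10"}, {"alt", "logo"}, {"src", "a.png"}
        });
        System.out.println(atts.get("src"));
    }
}
```

The sort itself costs O(n log n) per start tag, so for the handful of attributes most elements carry, a linear scan may well be just as fast; the sketch only shows that the interface need not change.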
From lex at www.copsol.com Thu Jan 15 19:07:51 1998 From: lex at www.copsol.com (Alex Milowski) Date: Mon Jun 7 16:59:54 2004 Subject: Interactive Grove Guide Available Message-ID: <199801151904.NAA09185@copsol.com> I have made the "Interactive Grove Guide" available on the Copernican Solutions Incorporated web site at: http://www.copsol.com/sgmlimpl/standards/ - and - http://www.copsol.com/products/daeserver/demoserver This is a new version of the Grove Guide that is far more complete and has been updated with the SGML property set as defined in the latest HyTime standard. It is also an example of a dynamic down-translation server program written completely in Java (no Scheme!) and it uses the DAE Server and DAE SDK. The source code and DAE Server can be found at the DAE SDK and DAE Server products page at: http://www.copsol.com/products/ Although this is the *SGML* property set, XML documents exhibit a subset of this property set. Hence, the SGML property set is a good place to start if you want to understand "all there is to receive" from an XML processor. ...someday there will be an "official" subset of the SGML property set for XML! ;-) ============================================================================== R. Alexander Milowski http://www.copsol.com/ alex@copsol.com Copernican Solutions Incorporated (612) 379 - 3608
From mecom-gmbh at mixx.de Thu Jan 15 19:28:19 1998 From: mecom-gmbh at mixx.de (james anderson) Date: Mon Jun 7 16:59:54 2004 Subject: XML References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> <34BD4E7A.1230@hiwaay.net> <34BD582C.E5873BC4@isogen.com> <34BD7FFE.27CE@hiwaay.net> <34BD927E.AF02B3E0@isogen.com> Message-ID: <34BE634B.2CDF6478@mixx.de> greetings, am i the only one who is struck by the frequency with which such clarifying remarks appear on this list? i know there was "SGML" and there was "HTML", but why does XML have to be "XML"? 1. the PR is so absolutely clear that it is intended to be a notation and not a language. 2. there are other related - but autonomous - standards (XSL, XLL, DOM) which address various of the sorts of semantic concerns which are necessary before a means to encode becomes itself a code/language. that is to say it neither tries to be, nor is, nor needs to be a language. that given, why isn't the constellation something on the order of XMLn, XMLss, XMLls, XMLdom? just wondering, james, W. Eliot Kimber wrote: > XML [operates] at the document representation syntax level [only], so it can > have nothing to say about the semantics of the data represented. On the > other hand, XLL and HyTime (and DSSSL and XSL) operate at the semantic > level and therefore may have lots to say about the semantics of the data > represented.
From eliot at isogen.com Thu Jan 15 21:13:50 1998 From: eliot at isogen.com (W. Eliot Kimber) Date: Mon Jun 7 16:59:54 2004 Subject: VRML and XML References: <01BD2197.366048E0@NEMO> Message-ID: <34BE7B69.4645799F@isogen.com> Mike Dierken wrote: > > It may be more interesting to generate VRML from XML via XSL and pass > it off to a 3D browser component. > Some sources of information might be experienced in a 3D space just as > well as, or better than, a text & graphic screen. For example: rows > and columns of numbers, a network graph (nodes with many to many > relationships), etc. I've been thinking about this (not very deeply) for a couple of years, ever since I built my VRML DTD and realized how easy it is to generate VRML syntax using SGML transformation tools. I was trying to decide if there was an interesting 3-D view of document structure generally. For specialized information types, I think the answer depends on the information and will be clear to those familiar with it. It would be pretty easy to generate a VRML representation of any SGML document using a DSSSL specification and using the SGML transform back end of JADE.
It would look something like this:

(default ; Default construction rule
  (case (node-class (current-node))
    (("element")
     (make formatting-specification
       data: (generate-vrml-representation-of-element (current-node))))
    (("attribute")
     (make formatting-specification
       data: (generate-vrml-representation-of-attribute (current-node))))
    (else
     (make formatting-specification
       data: (generate-default-vrml-node (current-node))))))

Where the "generate-vrml-representation-of-x" functions are DSSSL functions that encapsulate the generation of the VRML source using properties of the specified node. I could just never decide what those representations might look like. The HyperG/HyperWave folks have done some interesting work to provide VRML representations of sets of documents. The demo I saw produced a VRML view of documents about Graz Austria, with VRML representations reflecting the kind of information in the documents (buildings, sites, restaurants, etc.). It was pretty cool. Cheers, Eliot

From tyler at infinet.com Thu Jan 15 22:06:56 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 16:59:54 2004 Subject: XML as an alternative to Object Serialization? Message-ID: <34BBF1CF.FE7006C2@infinet.com> I have had an idea as a solution to a problem I have, but my problem is much more limited than this idea... One thing I need to know is if anyone is doing any work on higher-level API's for creating XML documents, DTD's etc.
I can do this myself as writing an XML Formatter is a hell of a lot easier than writing an XML Parser, but nevertheless I have not seen any tools like this that I am aware of (I think MSXML might have something like that but I do not know). I just want a simple API to make method calls which spits stuff out to a java.io.OutputStream. Now my idea rests upon the power of Reflection in JDK 1.1+. I have had experience in using Reflection to auto-map Java class files and interfaces to a relational database by auto-generating SQL create scripts, and I had the idea that XML could be used as an alternative to Object Serialization for small to medium sized content. XML of course is much fatter as a persistence framework than something like Object Serialization but for web browsers and the web in general, embedding this info via XML in a page may make more sense in some circumstances, especially since you would get the added boon of your Java Object being cached on the users computer. Now doing a good implementation of this would require a bit of work, and probably the best way of doing this might involve screwing around with the Externalizable interfaces, but I just wanted to get some idea from other people as to whether or not they think that this would be a good idea, but more importantly a useful idea. I am an independent programmer, so I can work on this from time to time when my other projects are not so time consuming, but this would not be a trivial task to implement even though I have a pretty good idea of how to do it (or at least an idea of a working implementation). Considering the success that SAX has had to date in its efficiency of being moved along, I thought that this sorta thing might warrant some attention. Any comments? Tyler
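A very small sketch of the reflection idea, under heavy assumptions: it walks the declared fields of one object with java.lang.reflect and writes them as XML to a java.io.OutputStream. The element names (`object`, `field`), the `ReflectiveXmlWriter` class and the `SamplePoint` example are all invented for illustration; nested objects, arrays, inheritance and reading the XML back are not handled.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.charset.StandardCharsets;

final class ReflectiveXmlWriter {
    // Writes the non-static, non-transient declared fields of obj as flat XML.
    static void write(Object obj, OutputStream out) throws IOException {
        Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        Class<?> type = obj.getClass();
        w.write("<object class=\"" + escape(type.getName()) + "\">");
        for (Field f : type.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || f.isSynthetic()) {
                continue;
            }
            f.setAccessible(true);
            Object value;
            try {
                value = f.get(obj);
            } catch (IllegalAccessException e) {
                throw new IOException("cannot read field " + f.getName(), e);
            }
            w.write("<field name=\"" + escape(f.getName()) + "\">"
                    + escape(String.valueOf(value)) + "</field>");
        }
        w.write("</object>");
        w.flush();
    }

    // Convenience wrapper that returns the XML as a String.
    static String toXml(Object obj) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try {
            write(obj, buf);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    // Escapes the characters that are unsafe in text and attribute values.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}

// Hypothetical object to serialize.
class SamplePoint {
    int x = 3;
    String label = "a<b";
    transient int cache = 99;   // skipped, as in Object Serialization
}

class ReflectiveXmlWriterDemo {
    public static void main(String[] args) {
        System.out.println(ReflectiveXmlWriter.toXml(new SamplePoint()));
    }
}
```

A real design would also need a matching reader and some convention for object references, which is where most of the non-trivial work mentioned above would lie.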
From tyler at infinet.com Thu Jan 15 22:47:08 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 16:59:54 2004 Subject: XML as an alternative to Object Serialization? References: <199801152243.JAA02134@tcltk.anu.edu.au> Message-ID: <34BBFB34.A7D62E46@infinet.com> Steven Ball wrote: > Tyler Baker writes: > > One thing I need to know is if anyone is doing any work on higher-level > > API's for creating XML documents, DTD's etc. I can do this myself as > > writing an XML Formatter is a hell of a lot easier than writing an XML > > Parser, but nevertheless I have not seen any tools like this that I am > > aware of (I think MSXML might have something like that but I do not > > know). I just want a simple API to make method calls which spits stuff > > out to a java.io.OutputStream. > > I have done this as part of my TclXML package - an XML generator. > The generator takes a XML DTD and creates a Tcl command for each element > and entity. A Tcl script then calls these commands, nested to reflect the > resultant document's structure, with appropriate arguments for attributes, > to produce an XML document. > Looked at it briefly and seemed like a great tool for webmasters and web content developers, but I am doing work for a standalone application so that is why I need to do this within Java only. Tyler
From donpark at quake.net Thu Jan 15 23:07:03 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 16:59:54 2004 Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft) Message-ID: <007701bd2209$ae9e6c70$2ee044c6@donpark> >Because to iterate over the AttributeMap an Enumeration object has to >get allocated. AttributeList allows iteration over the attributes >without having to do any allocation. If the AttributeMap itself implemented Enumeration interface and returned itself as the enumerator, there is no need to allocate an extra enumeration. Multiple enumeration can be supported with a slight twist. Net result is that extra object instantiation is not needed in most case. Don Park

From jeremie at netins.net Fri Jan 16 00:14:18 1998 From: jeremie at netins.net (Jeremie Miller) Date: Mon Jun 7 16:59:54 2004 Subject: JavaScript parser update and Questions Message-ID: <000f01bd2213$8744d4c0$2801a8c0@jeremie.dbqglass.com> I've just updated my JavaScript parser < http://www.jeremie.com/xparse/ >, and have a few questions... First, the update. Unlike normal software aging, I cut the code size by 50% (below 5k w/o my comments) and increased the speed and compatibility. It should work with almost _any_ incarnation of JavaScript.
It now properly and according to spec for a well-formed parser understands elements, attributes, the prolog, comments, processing instructions, and CDATA sections. What I am working on yet is entities and DOM compatibility(just have to print out the spec and read it). My question is this, being a fairly simple parser, how should I handle entities? I'm confused by the spec as to how a well-formed parser should handle them. Should I parse At 17:59 13.01.98 -0500, Tyler Baker wrote: [...] > >Now my idea rests upon the power of Reflection in JDK 1.1+. I have had >experience in using Reflection to auto-map Java class files and >interfaces to a relational database by auto-generating SQL create >scripts, and I had the idea that XML could be used as an alternative to >Object Serialization for small to medium sized content. XML of course >is much fatter as a persistence framework than something like Object >Serialization but for web browsers and the web in general, embedding >this info via XML in a page may make more sense in some circumstances, >especially since you would get the added boon of your Java Object being >cached on the users computer. > [...] It may be interesting for you to check the JSXML work that Bill laForge at OpenGroup is doing (being able to serialize a Bean to XML) at : http://www.camb.opengroup.org/cgi-bin/mailbox.pl As for myself, after a lot of thinking about Java/XML serialization I found that I don't see *real* application for it. It seems evident that (de-)serialization of state in the "native-Java" way is much more efficient then in Java-->XML-->Java way. So effectiveness is certainly not an issue here. On the other hand I don't see a single case where Java/XML serialization may be required in *homogeneous* Java systems. I want to stress that when saying that Java/XML serialization seems useless, I mean exactly the case of *homogeneous* Java systems. To my mind, the whole problem of Java/XML serialization seems to be wrongly formulated. 
The idea of component state specification through meta information has a *much wider scope* and *implications* than just Java/XML serialization. Component serialization in general (not JavaBean only) can be specified in XML. This makes a lot of sense to me. Having a *general* XML specification for component state will allow different OO languages/tools to interoperate on different platforms and network nodes. Java->XML->Java serialization will make sense only when a *general meta component state specification* is defined universally. In other words, a general serialization scheme will look something like:

X-Object\         /Z-Object
         \       /
Y-Object ------X-Object
         /       \
Z-Object/         \Y-Object

where X, Y, and Z are different OO systems with a common notion of class/object as defined by the OOP model.

Another interesting issue here is that having a component (file, object, etc.) specification on the meta level (XML) will provide for more uniform run-time service discovery. Here comes component introspection. Though component introspection can certainly be done in the "native-Java" way (java.beans.BeanInfo), the obvious advantage of *XML BeanInfo*, for me, is the ability to introspect components described in BeanInfo _markup_ *without the need to pre-load these components (classes) first*. This means that one can decide on a component's suitability for some task from its markup alone. To make this possible I am now working on BeanInfo markup and a framework for bean introspection from its XML spec.

So my bottom line is:

- The Java/XML combination will help a lot with run-time service discovery in dynamic, context-based component frameworks (introspection with XML BeanInfo).
- General markup (based on XML) for the *component state specification* may be a real breakthrough for D-O frameworks running across different platforms (including different OO tools) on a network.
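A minimal sketch of the Java -> XML half of the scheme discussed above, using the standard java.beans.Introspector. The element and attribute names (bean, property, class, name), the BeanToXml class and the Point bean are all hypothetical, chosen only for illustration; this is not the JSXML format nor the BeanInfo markup mentioned in this message.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Comparator;

public class BeanToXml {

    /** Hypothetical example bean, used only for illustration. */
    public static class Point {
        private int x;
        private String label;
        public int getX() { return x; }
        public void setX(int x) { this.x = x; }
        public String getLabel() { return label; }
        public void setLabel(String label) { this.label = label; }
    }

    /** Escapes the characters that are unsafe in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Writes every readable bean property as a <property> element (assumed markup). */
    public static String toXml(Object bean) {
        try {
            // Stop at Object.class so the synthetic "class" property is skipped.
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            PropertyDescriptor[] props = info.getPropertyDescriptors();
            // Sort explicitly so the output order does not depend on the JDK.
            Arrays.sort(props, Comparator.comparing(PropertyDescriptor::getName));

            StringBuilder out = new StringBuilder();
            out.append("<bean class=\"")
               .append(escape(bean.getClass().getName())).append("\">");
            for (PropertyDescriptor p : props) {
                Method getter = p.getReadMethod();
                if (getter == null) {
                    continue; // write-only property: nothing to serialize
                }
                Object value = getter.invoke(bean);
                out.append("<property name=\"").append(escape(p.getName())).append("\">")
                   .append(value == null ? "" : escape(String.valueOf(value)))
                   .append("</property>");
            }
            return out.append("</bean>").toString();
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException("cannot serialize bean", e);
        }
    }
}
```

The sketch covers state only; reading such markup back, or describing a component without loading its class, would need an agreed vocabulary, which is exactly the open question raised here.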
All the best, Dima ----------------- Dmitri Kondratiev dima@paragraph.com 102401.2457@compuserve.com http://www.geocities.com/SiliconValley/Lakes/3767/ tel: 07-095-464-9241 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Fri Jan 16 04:13:43 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 16:59:55 2004 Subject: VRML and XML References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> <34BD4E7A.1230@hiwaay.net> <34BD582C.E5873BC4@isogen.com> <34BD7FFE.27CE@hiwaay.net> <34BD927E.AF02B3E0@isogen.com> Message-ID: <34BEDE2A.166A@hiwaay.net> W. Eliot Kimber wrote: > Len, > > You are confusing the specification of the data with the execution of > the specification. It's like saying there'd be some difficulty in > representing a programming language using XML syntax--there's not. No. SGML-bred, I understand that completely. MID was a good laboratory for those concepts. I am looking at the practical applications and wondering if there are any right now. *Internet time* has taught me to be self-restrained about what is possible vs what is worth the effort given the need to get the software to market. Whatever else the VRML community does, they are very pragmatic about that. The VRML community is almost completely unique in the close symbiotic relationship in the development lists of the the content and implementation engineers. It has been a raucous but very effective relationship. A syntax specification language can represent whatever needs to be represented. 
It has been done several times with SGML as we both well know. Building a production-worthy useful language is not a theoretical exercise however. For an XML version of VRML to be useful, there must be some requirements for it which ISO VRML (VRML 97) does not meet. > Using a style language is useful for the same reason it is in SGML: I > want to apply different styles to the same basic data objects. Asking > why it's useful here is like asking why you need styles if you have a > font tag and I know we both know the answer to that one. Yes. If one is considering building reusable objects in scenes, there may be something useful there. However, VRML97 already contains protos and external protos which satisfy that requirement and are expressed in an established syntax for which there are already good and rapidly improving implementations. VRML is designed around an object model and in some respects, is ahead of DOM. > The fact that the presented result happens, in some presentation styles, > to be interactive is completely irrelevant to the issue of representing > the data using XML. In the theoretical sense, yes that is true. In the practical sense of fielding applications, it is also irrelevant. The *presentation styles* of 3D graphics are not that complex and it may be the case that within the current and soon to be fielded platforms, the size of the instance, the speed of the parse, the building of the internal structures, the efficiency of the external interfaces are the critical aspects of the design. > A better question might be: does XLL (or HyTime event schedules or some > combination thereof) provide anything useful in representing the > relationships among the nodes, which is all an event model does (define > relationships or behavior associated with relationships). Yes. I think that is the more interesting question. 
The linking of VRML is still a list of URI/URLs where the intended semantic is to try the first one and if that doesn't work, try the next one, etc. VRML anchors are not that sophisticated. It was said that because of script nodes, they did not need to be. It is an interesting decision in what it expresses as the preference of the VRML designers with regards to abstractions of semantic linking. However, see below for a real problem which the URLs, scripts, and event models don't solve given the current VRML expressive limits. > XML only operates at the document representation syntax level, so it can > have nothing to say about the semantics of the data represented. On the > other hand, XLL and HyTime (and DSSSL and XSL) operate at the semantic > level and therefore may have lots to say about the semantics of the data > represented. Yes. I think that the project should look at the DOM for its potential to unify the document handling framework. IOW, would the benefits of minimizing the number of interfaces for handling documents of multiple notations outweigh the performance benefits of multiple, notation-specific APIs? Given that the building of a Web page is now approaching the complexity of a Visual Basic application, the size of the browser framework (cum portable operating system) is getting above 15MB, and the willingness of the vendors to increase the number of 5MB+ plugins (notation browsers) within the framework deliverable is waning, there may be benefits. However, in the case of VRML97, at this time, the performance (frame rate, load time, time to load and start included files such as MPEG, WAV, streaming media), is critical. BTW, in the timing problems, VRML browsers still have sticky problems particularly in execution where indeterminate conditions can occur. This makes some problems of sequencing almost intractable for the content providers as the machine characteristics are critical. IrishSpace pushed the envelope. 
The script I mailed to you is the opening from that project. Just for grins, try to figure out sometime how to get the asteroid to always hit the Amazon Basin and start the transparency that creates the ashcloud to happen exactly when the script says it should. The problem is one in which the audio has begin times and event outs for when the audio is done, but no way to state exactly where an event within the audio occurs. len xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Fri Jan 16 04:15:59 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:55 2004 Subject: XML References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> <34BD4E7A.1230@hiwaay.net> <34BD582C.E5873BC4@isogen.com> <34BD7FFE.27CE@hiwaay.net> <34BD927E.AF02B3E0@isogen.com> <34BE634B.2CDF6478@mixx.de> Message-ID: <34BEA11C.132572E6@technologist.com> james anderson wrote: > > greetings, > > am i the only one who is struck by the frequency with which such clarifying > remarks appear on this list? i know there was "SGML" and there was "HTML", but > why is does XML have to be "XML"? > 1. the PR is so absolutely clear that it is intended to be a notation and not a > language. My definition of language (and it is hardly one I invented!!) is a set (perhaps infinite) of strings. The definition of the language states what strings are in it and what strings are not. The set of strings conforming to PR-xml are in the language and the set not conforming to it are outside. Paul Prescod -- "You have the wrong number." "Eh? 
Isn't that the Odeon?" "No, this is the Great Theater of Life. Admission is free, but the taxation is mortal. You come when you can, and leave when you must. The show is continuous. Good-night." -- Robertson Davies, "The Cunning Man"

From cbullard at hiwaay.net Fri Jan 16 04:26:04 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:59:55 2004
Subject: VRML and XML
References: <01BD2197.366048E0@NEMO> <34BE7B69.4645799F@isogen.com>
Message-ID: <34BEE113.2F44@hiwaay.net>

W. Eliot Kimber wrote:
>
> Mike Dierken wrote:
> >
> > It may be more interesting to generate VRML from XML via XSL and pass
> > it off to a 3D browser component.
> > Some sources of information might be experienced in a 3D space just as
> > well as, or better than, a text & graphic screen. For example: rows
> > and columns of numbers, a network graph (nodes with many to many
> > relationships), etc.
>
> I've been thinking about this (not very deeply) for a couple of years,
> ever since I built my VRML DTD and realized how easy it is to generate
> VRML syntax using SGML transformation tools. I was trying to decide if
> there was an interesting 3-D view of document structure generally.

There is an enormous amount of material on the web under the topic "visualiz(s)ation", concerned with aspects of documents that benefit from 3D and animation. VRML 1.0 was static and didn't allow experimentation with the moving characteristics of virtual reality. It turns out that these can be vital to a useful 3D representation of document components.
Representing the text in the 3D medium is not difficult, but it turns out not to be very useful (hard to read). But the visualization for data mining and browsing by concurrent users is very useful.

I am at home. If the snow doesn't catch up tomorrow, I'll bring home the papers and post some URLs. El Nino is striking as I speak. $@%@()&&^@^

Tim Bray did some preliminary experiments in the area of using VRML for web site representation. Tim?

len

From matthewg at poet.de Fri Jan 16 09:20:42 1998
From: matthewg at poet.de (Matthew Gertner)
Date: Mon Jun 7 16:59:55 2004
Subject: AttributeMap (was Re: Announcement: SAX 1998-01-12 Draft)
Message-ID: <01bd225f$ad020470$a00b0ac0@pharcyde.poetsoftware.xo.com>

>[Matthew Gertner]
>>Okay, I thought you were talking about the instantiation of the map
>>implementation itself. The fact of the matter is that, although there is
>>some overhead to instantiating an iterator object, looping over a list of
>>attributes and doing n string compares is not all that efficient either.
>
>Given that attribute ordering is never significant could the attributes
>be provided sorted by name so that by-name look up can be achieved with
>a binary chop?

This kind of implementation detail should be hidden from the consumer application. If James's AttributeList interface were used, the String get(String) method could be made more efficient through alphabetical sorting, as you suggest.
This would, however, require that the attributes be sorted in the first place, so with any significant number of attributes it might be more efficient to instantiate an iterator on a map than to sort the attributes. Of course, the map insert probably has log complexity instead of constant for the list...

When you get down to it, we need an iterator-type interface and a map-type interface. The area of controversy seems to be whether these are both provided on top of one implementation, or whether a new implementation is instantiated for the iteration interface. I personally prefer the second variant but I guess it doesn't matter much.

Matthew

From rose at balthazar.com Fri Jan 16 09:37:09 1998
From: rose at balthazar.com (Michael Rose)
Date: Mon Jun 7 16:59:55 2004
Subject: VRML and XML
Message-ID:

Len Bullard writes:

>Building a production-worthy useful language
>is not a theoretical exercise however. For an XML version of
>VRML to be useful, there must be some requirements for it which
>ISO VRML (VRML 97) does not meet.

'Different views of the same information' is applicable here. We're writing some 'avatar generation' software where users can make their own avatars. In the first instance the avatars will be 2D, with the later possibility of 3D and VRML avatars. The obvious choice is to have an XML-based avatar description which can then be viewed via either 2D or VRML rendering depending on which XSL is used.

Is anyone doing anything similar?
Cheers, Michael ********************************************************************** Michael Rose email: rose@balthazar.com Balthazar A/S phone: (+45) 33 26 05 15 Rahbeks Alle 9D fax: (+45) 33 26 05 01 DK-1749 Copenhagen DENMARK web: http://www.balthazar.com 'and what is the use of a computer' thought Alice 'without pictures or conversation' with apologies to Lewis Carroll ********************************************************************** xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Jan 16 11:25:57 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:55 2004 Subject: Interactive Grove Guide Available In-Reply-To: <199801151904.NAA09185@copsol.com> Message-ID: <3.0.1.16.19980116101021.45e74d10@pop3.demon.co.uk> At 13:03 15/01/98 -0600, Alex Milowski wrote: [... exciting software announcement ...] > >...someday there will be an "official" subset of the SGML property set for >XML! ;-) I think this is eagerly awaited. Not everyone wants to start with full SGML (and I think some people find it difficult to determine the SGML/XML borderlines :-) In the meantime an 'unofficial' subset could be very useful. If anyone is (a) knowledgeable (b) enthusiastic I am sure we would welcome this. Perhaps a WG member could speculate on whether such a property set is on the drawing board and whether constructive XML-DEV discussion on this would help. It seems to me that it would have some of the flavour of the latest SAX construction. P. 
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Jan 16 11:32:59 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:55 2004 Subject: Partial XML Processors (was Re: JavaScript parser update and Questions) In-Reply-To: <000f01bd2213$8744d4c0$2801a8c0@jeremie.dbqglass.com> Message-ID: <3.0.1.16.19980116104144.45e74a20@pop3.demon.co.uk> At 18:13 15/01/98 -0600, Jeremie Miller wrote: >I've just updated my JavaScript parser < http://www.jeremie.com/xparse/ >, >and have a few questions... > >First, the update. Unlike normal software aging, I cut the code size by 50% >(below 5k w/o my comments) and increased the speed and compatibility. It >should work with almost _any_ incarnation of JavaScript. It now properly >and according to spec for a well-formed parser understands elements, >attributes, the prolog, comments, processing instructions, and CDATA >sections. What I am working on yet is entities and DOM compatibility(just >have to print out the spec and read it). Excellent. > >My question is this, being a fairly simple parser, how should I handle >entities? I'm confused by the spec as to how a well-formed parser should >handle them. Should I parse simply handle & < > " ' ? If those are all I should >handle, which ones where? The spec does talk about these things, but I >don't feel right about my interpretation of it. You are not alone :-). 
There is a difficult decision here for parser writers - do they implement everything in the spec or do they go for a subset? If the latter they are not full XML implementations (and therefore cannot use the label "XML parser"). If the former, they have a *lot* of work to do in understanding the spec and getting it right. I have heralded my own incompetence in understanding NOTATION on this list :-) Every software writer therefore has to decide whether they are going to write a fully conformant XML processor. I am not sure whether *anyone* has yet done this other than James Clark (and those who adapt SGML systems to process XML). [XML *is* SGML, of course, but you have to use a customised SGML declaration for standard SGML tools to read XML.] Most of my work is done with Lark and AElfred and I think they both may have some small bits to fill in (please forgive if I'm wrong :-). For my own parser (Jumbo) I gave up about 6 months ago and do not process entities (other than the hardcoded ones). That means that if I get a document which uses them, my parser fails and I switch to Larkfred. (In fact I'll make one of them the default as soon as the dust settles...) So you have the following choice: - encode the *whole* spec (and nothing but the spec - i.e. no tricky non-compliant extensions) and give yourself the label "conforming XML tool". - encode the bits you feel are cost effective and label it "processes most XML documents, but gives 'Sorry' messages for some". >Other question: Either I can't find it or I am reading right by it, but how >do I handle whitespace in attribute values as a well-formed parser, just >allow anything, including \n? It depends on the type of the attribute value. see 3.3.3 (Attribute value Normalization). If the attribute value is of type CDATA it stays asis, else it gets normalised. How do you tell if it's not CDATA? - there has to be an ATTLIST for the element. This is in the external or internal subsets. 
So you have to be able to process those. - these subsets can use Parameter Entities. So you have to be able to process those. The alternative is not to process any ATTLISTs. This has the slight disadvantage that it can totally change the meaning of the document. e.g. an attribute value can be an ENTITY which effectively means it is a pointer to a chunk of information, whereas if it is assumed to be CDATA it's just a string. So the bottom line is that *if* the document author uses ENTITYs, and your software doesn't then you will end up with something radically different from what the author intended. This may or may not matter. If you are the author of the document as well as the parser, then you can make a bargain with yourself that you will never use ENTITYs so your software doesn't need to. If you then want other people to use your software you either have to add in entity processing OR give them a statement that you cannot process the document. What you must not do (IMO) is to ignore ENTITYs and assume the result is more or less OK :-) P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Jan 16 11:38:11 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:55 2004 Subject: XML as an alternative to Object Serialization? 
In-Reply-To: <34BBFB34.A7D62E46@infinet.com> References: <199801152243.JAA02134@tcltk.anu.edu.au> Message-ID: <3.0.1.16.19980116110238.45e76c00@pop3.demon.co.uk> At 18:39 13/01/98 -0500, Tyler Baker wrote: > > >Steven Ball wrote: > >> >> I have done this as part of my TclXML package - an XML generator. >> The generator takes a XML DTD and creates a Tcl command for each element >> and entity. A Tcl script then calls these commands, nested to reflect the >> resultant document's structure, with appropriate arguments for attributes, >> to produce an XML document. >> [Haven't seen the full posting... has TclXML been announced?] > > Looked at it briefly and seemed like a great tool for webmasters and web >content developers, but I am doing work for a standalone application so that is >why I need to do this within Java only. Tcl is a great language for this sort of thing. Joe English wrote CoST in tcl and I added a tk GUI (costwish) which took exactly this approach. (I even used colonized names :-). I miss the ability in Java to add scripts/commands to elements "in clear". The JUMBO approach is to create a Java class for each element (or the subset that I don't want default processing for) and to point to this class (bytecode) using a schema file (using the prototypic namespace ideas in the public domain). Example: points to the PLAY namespace. Within the schema file is something like: ... ... lots of lovely XML-structured help ... jumbo.play.STAGEDIRNode.class jumbo/play/stagedir.gif Of course I make this up as I go along, and I hope that there will be some public confluence of approaches here :-). The software is then able to: - recognise PLAY:STAGEDIR as belonging to a schema. - find the icon and draw a pretty picture - locate the help tree under a help icon - load the STAGEDIRNode class. This class overrides some or all methods such as process(), display(), toString(), toSGML(), drawUnscaledObject(Graphics), editContent(), editAttributes(). etc. 
(Of course editContent() is a no-op for PLAY :-) If the document contains a single namespace, but does not have colonized names, or has colonized names and non-colonized names I have a fudge which works quite nicely but probably shouldn't be publicly posted :-) or it might make some people ill. P. BTW I shall be releasing a new snapshot of JUMBO with this feature before Feb 4 (that's a deadline). Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Jan 16 14:28:32 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:55 2004 Subject: PR.xml Message-ID: <3.0.1.16.19980116142105.3787196e@pop3.demon.co.uk> I thought it would be a useful exercise to parse the XML version of the XML PR, but I've run into minor problems. Using Netscape I downloaded the version with TimBray's name and 3 December and have tried running it with Lark and AElfred. The first problem is that both throw errors because the DOCTYPE contains a reference to "spec.dtd". No spec.dtd was available, so the only remedies are: - edit out the reference to spec.dtd (i.e. use Message-ID: <34BC0B6F.A5A00EED@infinet.com> Dmitri Kondratiev wrote: > At 17:59 13.01.98 -0500, Tyler Baker wrote: > [...] > > > >Now my idea rests upon the power of Reflection in JDK 1.1+. 
I have had > >experience in using Reflection to auto-map Java class files and > >interfaces to a relational database by auto-generating SQL create > >scripts, and I had the idea that XML could be used as an alternative to > >Object Serialization for small to medium sized content. XML of course > >is much fatter as a persistence framework than something like Object > >Serialization but for web browsers and the web in general, embedding > >this info via XML in a page may make more sense in some circumstances, > >especially since you would get the added boon of your Java Object being > >cached on the users computer. > > > [...] > > It may be interesting for you to check the JSXML work that Bill laForge at > OpenGroup is doing (being able to serialize a Bean to XML) at : > > http://www.camb.opengroup.org/cgi-bin/mailbox.pl > Thanx for the pointer. I am not surprised someone is already doing this. Another thing I think would possibly be a good idea that I have heard some discussion about is a universal component architecture that is interpreted at run time. The work that has gone on with VRML over the last few years seems like a start for this, even though VRML's application is more 3D in scope. Hey, I love JavaBeans and yes there is the JavaBeans -> ActiveX bridge, but I think that some sort of true universal component architecture based somewhat on MVC with the model being something like XML, the view being something like VRML, and the controller being a platform specific event handler like in Microsoft's DHTML would be a true first start. Tyler xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Fri Jan 16 15:46:12 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:55 2004 Subject: PR.xml In-Reply-To: <3.0.1.16.19980116142105.3787196e@pop3.demon.co.uk> References: <3.0.1.16.19980116142105.3787196e@pop3.demon.co.uk> Message-ID: <199801161543.KAA01186@unready.microstar.com> Peter Murray-Rust writes: > I thought it would be a useful exercise to parse the XML version of the XML > PR, but I've run into minor problems. Using Netscape I downloaded the > version with TimBray's name and 3 December and have tried running it with > Lark and AElfred. > > The first problem is that both throw errors because the DOCTYPE contains a > reference to "spec.dtd". No spec.dtd was available, so the only remedies are: > - edit out the reference to spec.dtd (i.e. use - make a dummy spec.dtd > > When I tried the first with AElfred it found an illegal character (-96), > which I take to be 160 (== nbsp). My edit was done with WordPad. Where did > this come from? [Note that AElfred seems quite happy with Lark.xml - Tim's > documentation for lark in XML]. The XML source for the PR is encoded in ISO-8859-1 but has no encoding declaration (so AElfred assumes UTF-8, and reports an encoding error, though not very helpfully, when it finds an invalid UTF-8 sequence). The WG is aware of the problem. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Jan 16 16:12:04 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:55 2004 Subject: PR.xml In-Reply-To: <199801161543.KAA01186@unready.microstar.com> References: <3.0.1.16.19980116142105.3787196e@pop3.demon.co.uk> <3.0.1.16.19980116142105.3787196e@pop3.demon.co.uk> Message-ID: <3.0.1.16.19980116160843.45e71988@pop3.demon.co.uk> At 10:43 16/01/98 -0500, David Megginson wrote: >Peter Murray-Rust writes: > >The XML source for the PR is encoded in ISO-8859-1 but has no encoding >declaration (so AElfred assumes UTF-8, and reports an encoding error, >though not very helpfully, when it finds an invalid UTF-8 sequence). >The WG is aware of the problem. Thanks. I am also aware of it now :-). Can I make the assumption that: - ISO-8859-1 and UTF-8 look identical to not-very-experienced humans. - in principle I should be able to sort this by adding something like to the top of the document - in practice this fails because by the time it gets to the encoding declaration it has already assumed the encoding is UTF-8 and has crashed :-) I am not quite clear why we need this problem. Do different tools emit different encodings? If so, what should I work with?. Can I convert this document? I know there has been lots of important discussions about encodings (which I have not always read very carefully), so an authoritative statement from a WG member would help at least one human :-) P. 
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From ak117 at freenet.carleton.ca Fri Jan 16 16:41:07 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:55 2004
Subject: Character Encoding and the XML PR (was Re: PR.xml)
In-Reply-To: <3.0.1.16.19980116160843.45e71988@pop3.demon.co.uk>
References: <3.0.1.16.19980116142105.3787196e@pop3.demon.co.uk> <199801161543.KAA01186@unready.microstar.com> <3.0.1.16.19980116160843.45e71988@pop3.demon.co.uk>
Message-ID: <199801161638.LAA01795@unready.microstar.com>

Peter Murray-Rust writes:
> Thanks. I am also aware of it now :-). Can I make the assumption that:
>
> - ISO-8859-1 and UTF-8 look identical to not-very-experienced humans.

They look identical to most English speakers, but differ in their treatment of accented characters (> 0x7f), so French and German speakers probably notice.

> - in principle I should be able to sort this by adding something like
>
>
> to the top of the document

Correct. The other alternative is to configure your web server to send the encoding ISO-8859-1 in the HTTP header for this document if the text/xml MIME type is approved, but the problem will reappear if you download the file and then parse it on your own system.
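
The "illegal character (-96)" report from earlier in this thread fits that explanation. As a minimal sketch in Python (this is not AElfred's code, and the sample text is made up for illustration): the byte 0xA0 is a no-break space in ISO-8859-1, it is not a valid UTF-8 sequence on its own, and it reads as -96 when treated as a signed 8-bit value, which is how a Java byte would show it.

```python
import struct

# Hypothetical sample: "XML" + no-break space + "PR", encoded as ISO-8859-1.
data = "XML\u00a0PR".encode("iso-8859-1")

raw = data[3]                                 # the no-break space byte
signed = struct.unpack("b", bytes([raw]))[0]  # same byte read as a signed 8-bit value


def decodes_as_utf8(b: bytes) -> bool:
    """Return True if b is a well-formed UTF-8 byte sequence."""
    try:
        b.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(raw, signed)                                  # 160 -96
print(decodes_as_utf8(data))                        # False
print(data.decode("iso-8859-1") == "XML\u00a0PR")   # True
```

Re-encoding the same text as UTF-8 turns the single byte 0xA0 into the two bytes 0xC2 0xA0, which is why a file saved as ISO-8859-1 cannot simply be relabelled as UTF-8.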
> - in practice this fails because by the time it gets to the encoding
> declaration it has already assumed the encoding is UTF-8 and has crashed :-)

It should not fail with AElfred -- I just downloaded the PR and added your XML declaration to the top, and AElfred reported no errors. In fact, the XML declaration is guaranteed to use only ASCII characters, which are the same in UTF-8 and ISO-8859-*. AElfred is very careful not to try to read too far into the document until it has discovered whether there is an explicit encoding declaration.

> I am not quite clear why we need this problem. Do different tools emit
> different encodings? If so, what should I work with?. Can I convert this
> document?

ISO-8859-1, which is used for most web pages, contains characters only for Western European languages. UTF-8 can encode any Unicode characters up to 0xffff (and a little higher with surrogates), so it can handle Kanji, Han Chinese, Arabic, etc. The PR rightly specifies that any entity that begins with neither an encoding declaration nor a byte-order mark (for UCS-2) should be assumed to be encoded in UTF-8.

Conversion should be fairly simple -- take a look at the AElfred source to see how the different encodings are constructed.

Just for the record, AElfred accepts the following encodings, and to my knowledge, supports them completely and correctly to the extent allowed by Java's 16-bit characters and by surrogates:

- UTF-8
- ISO-10646-UCS-2 (both byte orders)
- ISO-10646-UCS-4 (four byte orders)
- UTF-16
- ISO-8859-1

All the best,
David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Fri Jan 16 16:44:33 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:59:55 2004 Subject: PR.xml Message-ID: <199801161644.QAA27589@mail.iol.ie> [Peter] >I know there has been lots of important discussions about encodings (which >I have not always read very carefully), so an authoritative statement from >a WG member would help at least one human :-) I have found it quite difficult to get information about Unicode and its various transformation encodings. I know that "buy the Unicode book" is a way around this but time is pressing. Anyone got any good pointers on the Web? Sean Mc Grath sean at digitome dot com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jimg at digitalthink.com Fri Jan 16 16:51:45 1998 From: jimg at digitalthink.com (Jim Gindling) Date: Mon Jun 7 16:59:55 2004 Subject: Documentation of DTDs Message-ID: <01BD225B.D24641F0.jimg@digitalthink.com> XMLers, > - Is there a need for a standard on documentation for XML DTDs? Definitely yes! Having programmed in Java for over a year now, and having previously programmed in C++ for quite awhile, I can attest to the value of having a JavaDoc type mechanism, which is only possible by standardizing the documentation style. 
It is easy to use, and reading through the JavaDoc output during design-reviews and the like makes life much easier than reading through the code itself.

> - If so, is the separate-documentation method better than the
> documentation-in-DTD method?

Although I am still an XML newbie, I strongly recommend using the documentation-in-DTD method, again based on my programming experience. The closer the documentation is to the thing being documented, the more likely it is to be up-to-date.

One of the nice features of Java (IMHO) is that there is no separate interface file. In C++, classes are declared in header files, and defined in source files. Since the declaration is not very 'close' to the definition, its comments are frequently out-of-date. Even when the declaration and definition are in the same file, the further away the comments are from the code, the more likely they are to be out-of-date. For example, class comments are more likely to be obsolete than method comments, which are more likely to be obsolete than code-fragment comments.

Thanks for your time!
Jim

On Thursday, January 15, 1998 3:33 AM, Jeni Tennison [SMTP:jft@Psychology.Nottingham.AC.UK] wrote:
> During the discussion on including comments in SAX, Antony Blakey wrote:
> >David Megginson wrote:
> >> In the second case, I think that it would be a very bad idea to
> >> implement a JavaDoc-type facility using XML comments. JavaDoc has to
> >> use comments because it is not possible to extend Java syntax; XML
> >> allows you to define your own grammar, so the documentation can be
> >> part of the fundamental element structure.
For example, instead of > >> > >> > >> > >> http://home.sprynet.com/sprynet/dmeggins/ > >> dmeggins@microstar.com > >> > >> > >> you should use > >> > >> > >> Record for David Megginson > >> http://home.sprynet.com/sprynet/dmeggins/ > >> dmeggins@microstar.com > >> > > > >I agree, but your example implies that my comments were about the data, > >rather than about the structure itself - I guess I should have pointed > >out that I'm interested in comments in the DTD, so that the DTD can be > >documented automatically. This is more like javadoc/idldoc. I'd love an > >xmldoc tool. I'm guessing now that SAX doesn't give me DTD events. > > I was thinking about this last night. If there is going to be a means to > have documentation within DTDs (and I think there should be), it would be a > very good idea to decide on a standard format for that documentation, so > that both authors of DTDs and XML application programmers can use it. > > I can see two good reasons for having documentation within a DTD. The > first is for automatic generation of documentation (as XML documents, > obviously) in a similar way to javadoc, as mentioned by Antony Blakey. The > second is for automatic dialog or pop-up help generation in XML editors. > The first need could be satisfied by authors of DTDs writing separate > documentation for them: the second need could not. Note also that the > second need means that the documentation should be well structured and > available online in such a way that an application receiving a DTD can get > its documentation too - this means that tools which do a one-off generation > of documentation wouldn't cut it. > > javadoc [1] and dtd2html [2] utilise different methods of supplying > documentation for their respective 'code'. javadoc has the programmer > write documentation within the code, whereas dtd2html has the DTD author > write a separate file containing the documentation. 
The problem with using > the javadoc method is that it would add a lot of gumph to a DTD that the > majority of applications (validators, viewers etc.) couldn't care less > about. The problem with the dtd2html method is that the documentation > isn't immediately *there* for someone editing the DTD. Of the two, I think > the dtd2html method probably suits XML better (designed, as it is, for > SGML, that isn't too surprising). (BTW, before anyone asks, the reason > dtd2html isn't what I have in mind is because of the > application-accessibility of the documentation as described above.) > > So, the solution I'm (tentatively) suggesting is that XML DTDs point to XML > documents which contain documentation on the DTD. There are two parts to > this, then: firstly, how does the DTD point to its documentation? > Secondly, how is the documentation structured? > > Well, I *think* (and please forgive me if I'm wrong) the answer to the > first part is to have a processing instruction within a DTD which points to > the documentation. Something like: > > > > [Should it just contain a (relative) URL? Is there anything else it needs > to contain? Should its format be: ?] > > The DTD Documentation Markup Language (hence .dtddml ;) document referenced > would probably borrow heavily from the format of the documentation for > dtd2html and also from DTDs of DTDs or groves or whatever they're called - > Peter MR, you've done one, haven't you? I'm very willing and probably able > to do such a DTD, but I thought I'd try to get people's opinions on this > whole documentation business before doing so. > > So: > - Is there a need for a standard on documentation for XML DTDs? > - If so, is the separate-documentation method better than the > documentation-in-DTD method? > - If so, how should the documentation document be referenced from the DTD? > Are there any ideas/suggestions/requirements for what the documentation > should contain or what the documentation DTD should look like? 
> > Thanks for your comments in advance, > > Jeni > > [1] http://java.sun.com/products/jdk/1.1/docs/tooldocs/win32/javadoc.html > [2] http://www.oac.uci.edu/indiv/ehood/perlSGML/doc/html/dtd2html.html > > Jenifer Tennison > Department of Psychology, University of Nottingham > University Park, Nottingham NG7 2RD, UK > tel: +44 (0) 115 951 5151 x8352 > fax: +44 (0) 115 951 5324 > url: http://www.psychology.nottingham.ac.uk/staff/Jenifer.Tennison/ > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jkl at osc.edu Fri Jan 16 16:55:14 1998 From: jkl at osc.edu (Jan Labanowski) Date: Mon Jun 7 16:59:55 2004 Subject: UNICODE In-Reply-To: Mail from 'Sean Mc Grath ' dated: Fri, 16 Jan 1998 16:44:23 GMT Message-ID: <199801161654.LAA05458@krakow.osc.edu> It is terribly outdated, but you can browse my secret {:-)} ftp://192.148.249.121/pub/cee/unicode Jan Labanowski jkl@osc.edu xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Jan 16 17:34:02 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:55 2004 Subject: PR.xml Message-ID: <3.0.32.19980116092525.00aebbb4@pop.intergate.bc.ca> At 04:44 PM 16/01/98 GMT, Sean Mc Grath wrote: >I have found it quite difficult to get information about Unicode and >its various transformation encodings. I know that "buy the Unicode book" is a >way around this but time is pressing. Buy the Unicode book. -Tim (sorry) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From grk at arlut.utexas.edu Fri Jan 16 17:48:39 1998 From: grk at arlut.utexas.edu (Glenn R. Kronschnabl) Date: Mon Jun 7 16:59:55 2004 Subject: PR.xml In-Reply-To: Your message of "Fri, 16 Jan 1998 16:44:23 GMT." <199801161644.QAA27589@mail.iol.ie> Message-ID: <199801161748.LAA26678@mail-firewall.arlut.utexas.edu> In message <199801161644.QAA27589@mail.iol.ie> you write: > >Anyone got any good pointers on the Web? Did you look at: http://www.unicode.org/ Cheers, Glenn -------------------- Glenn R. Kronschnabl Applied Research Laboratories | grk@arlut.utexas.edu (PGP/MIME ok) The University of Texas at Austin | http://www.arlut.utexas.edu/~grk PO Box 8029, Austin, TX 78713-8029 | (Ph) 512.835.3642 (FAX) 512.835.3808 10,000 Burnet Road, Austin, TX 78758 | ... but an Aggie at heart! 
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From hb at ix.heise.de Fri Jan 16 18:43:16 1998 From: hb at ix.heise.de (Henning Behme) Date: Mon Jun 7 16:59:55 2004 Subject: Character Encoding and the XML PR (was Re: PR.xml) References: <3.0.1.16.19980116142105.3787196e@pop3.demon.co.uk> <199801161543.KAA01186@unready.microstar.com> <3.0.1.16.19980116160843.45e71988@pop3.demon.co.uk> <199801161638.LAA01795@unready.microstar.com> Message-ID: <34BFA9E6.90A0FB24@ix.heise.de> David Megginson writes: > They look identical to most English speakers, but differ in their > treatment of accented characters (> 0x7f), so French and German > speakers probably notice. > Absolutely correct. I just spent quarters of hours with a small example which contained German Umlaute (ä ...) and I had to switch to ISO-8859-1 to avoid Aelfred's fatal remark :-) After the change Aelfred parsed the document(s) flawlessly ... Best regards, Henning Behme iX - Magazin fuer professionelle Informationstechnik Helstorfer Str. 7 * 30625 Hannover * Germany http://www.heise.de/ix/ * +49 511 5352-374 * -361 (Fax) ------ White, adj. and n. Black (Ambrose Bierce) ------ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ricko at allette.com.au Fri Jan 16 19:01:46 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:59:55 2004
Subject: Character Encoding and the XML PR (was Re: PR.xml)
Message-ID: <199801161905.GAA07438@jawa.chilli.net.au>

> From: David Megginson
> > - in principle I should be able to sort this by adding something like
> >
> >
> > to the top of the document
>
> Correct. The other alternative is to configure your web server to
> send the encoding ISO-8859-1 in the HTTP header for this document if
> the text/xml MIME type is approved, but the problem will reappear if
> you download the file and the parse it on your own system.

Because the XML encoding declaration is not required (in the sense that it is voluntary, effectively), there is scope for people to generate bad documents. Similarly, correctly configuring a webserver to provide appropriate MIME charset parameter values is voluntary.

Obviously, the character encoding is not something that lay users will know about. This means that it is something that XML software developers and website administrators must be on top of.

For example, even making a convention like "All XML documents generated at this site must use ISO 8859-1 encoding, and this encoding must be correctly labelled in the header and correctly set in any webserver configuration files" does not require any kind of extensive understanding of character sets and encodings. And it will get most Western sites out of any problems.
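
The labelling rule discussed in this thread (an explicit encoding declaration or a byte-order mark, otherwise UTF-8) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `sniff_encoding` is a hypothetical helper, not part of any parser mentioned here, and it handles only the UTF-16 byte-order marks and ASCII-compatible encodings, not UCS-4 or unmarked 16-bit input.

```python
import codecs
import re

# The XML declaration uses only ASCII characters, so for ASCII-compatible
# encodings it can be matched on raw bytes before the real encoding is known.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_encoding(data: bytes) -> str:
    """Guess an XML entity's encoding from its first bytes (hypothetical helper)."""
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "UTF-16"
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    return "UTF-8"  # neither a declaration nor a byte-order mark was found


print(sniff_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><doc/>'))  # ISO-8859-1
print(sniff_encoding(b"<doc/>"))                                             # UTF-8
```

A site convention like the one above then reduces to checking that every generated document yields the agreed encoding name from a function of this kind.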
A software developer (except for a LISP one) probably does not consider it strange that they must know the different types of numbers available to them: integers, floating point; signed and unsigned; long, int; etc. These are taught in University courses; now we have a WWW, programmers will have to become more aware of encoding and character set issues (but still not a great deal) just as a matter of course. Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elm at arbortext.com Fri Jan 16 19:09:41 1998 From: elm at arbortext.com (Eve L. Maler) Date: Mon Jun 7 16:59:55 2004 Subject: XML In-Reply-To: <98Jan15.231704est.18817@thicket.arbortext.com> References: <3.0.1.16.19980113163928.2a7f45ee@pop3.demon.co.uk> <3.0.1.16.19980114005349.2d1f45ba@pop3.demon.co.uk> <3.0.1.16.19980114081621.3727d1b0@pop3.demon.co.uk> <34BD4E7A.1230@hiwaay.net> <34BD582C.E5873BC4@isogen.com> <34BD7FFE.27CE@hiwaay.net> <34BD927E.AF02B3E0@isogen.com> <34BE634B.2CDF6478@mixx.de> Message-ID: <3.0.5.32.19980116140945.00a69100@village.doctools.com> At 06:51 PM 1/15/98 -0500, Paul Prescod wrote: >james anderson wrote: >> >> greetings, >> >> am i the only one who is struck by the frequency with which such clarifying >> remarks appear on this list? i know there was "SGML" and there was "HTML", but >> why is does XML have to be "XML"? >> 1. the PR is so absolutely clear that it is intended to be a notation and not a >> language. > >My definition of language (and it is hardly one I invented!!) is a set >(perhaps infinite) of strings. The definition of the language states >what strings are in it and what strings are not. 
The set of strings >conforming to PR-xml are in the language and the set not conforming to >it are outside. The "Language" part of XML (and SGML) is not so much the problem, because they're both obviously languages (according to the commonly understood definition that Paul provided). Calling them "Markup Languages" is where the names really get confusing because we tend to think of "tag set" as a markup language. To be really proper, I would say that they're markup *meta*languages, or markup language *generators*. Or languages *for* [making] markup. These aren't very pretty options, though... Eve xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Jan 16 23:48:35 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:55 2004 Subject: Partial XML Processors (was Re: JavaScript parser update and Questions) Message-ID: <3.0.1.16.19980116184218.4e4fd7a6@pop3.demon.co.uk> FORWARDED from David Durand (who has mailer problems) [mentally subtract one level of >s] >Return-Path: >From: "David G. Durand" >Date: Fri, 16 Jan 1998 11:37:28 -0500 > >On Jan 16, 10:41am, Peter Murray-Rust wrote: >> You are not alone :-). There is a difficult decision here for parser >> writers - do they implement everything in the spec or do they go for a >> subset? If the latter they are not full XML implementations (and therefore >> cannot use the label "XML parser"). If the former, they have a *lot* of >> work to do in understanding the spec and getting it right. I have heralded >> my own incompetence in understanding NOTATION on this list :-) > >But that got better, right? > >> >> .... 
[trimmed] >> >> So you have the following choice: >> - encode the *whole* spec (and nothing but the spec - i.e. no tricky >> non-compliant extensions) and give yourself the label "conforming XML tool". >> - encode the bits you feel are cost effective and label it "processes >most >> XML documents, but gives 'Sorry' messages for some". > >No, this is a choice that it's irresponsible to present, in my opinion. The >point of XML is to be a _standard_ format. That means that you should use it, >or not use it. If you're not willing or able to write a conforming parser, then >you should use one or the other of the publically available ones -- even if >they have bugs, they are under active development and wider use -- the bugs >will probably be fixed, with a compatible interface. > >Tracking any standard is hard work, because of the way standards have to be >written, because some things in them are hard to understand, because people are >not perfect and our standards never are either. > >In other words, use XML, or don't use XML, but don't muddy the waters by >propagating a host of almost-XMLs. There's enough free software out there >already that you can come very close to conformance already, and expect to >reach conformance simply by use of FTP. > >[digression: my favorite standard _is_ perfect. Rough rendering from memory: >"The musical note A above middle C shall be defined to be 440 cycles per >second." A 1-page ISO standard with no ambiguities or flaws.] > >>..... > >> So the bottom line is that *if* the document author uses ENTITYs, and your >> software doesn't then you will end up with something radically different >> from what the author intended. This may or may not matter. > >> If you are the author of the document as well as the parser, then you can >> make a bargain with yourself that you will never use ENTITYs so your >> software doesn't need to. 
> >But of course, the documents that one writes with the intention of never >sharing them are few and (for obvious reasons) rarely involve special software. > >>If you then want other people to use your >> software you either have to add in entity processing OR give them a >> statement that you cannot process the document. What you must not do (IMO) >> is to ignore ENTITYs and assume the result is more or less OK :-) > >I'd say that if you're ignoring entities, the most mention of XML that is worth >making is to say "an XML-like language, defined by the following grammar: ..." >In other words, if you're not using XML, make clear that you're defining a new >language (and make clear _what_ that language actually is!). > > >------------------------------------------+---------------------------- >David Durand dgd@cs.bu.edu| david@dynamicDiagrams.com >Boston University Computer Science | Dynamic Diagrams >http://www.cs.bu.edu/students/grads/dgd/ | http://dynamicDiagrams.com/ > | MAPA: mapping for the WWW > > >---End of forwarded mail from Mail Delivery System > Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Sat Jan 17 00:47:18 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 16:59:55 2004 Subject: VRML and XML (Editors) References: Message-ID: <34BFFF4B.1D16@hiwaay.net> Michael Rose wrote: Michael Rose writes: > Len Bullard writes: > > >Building a production-worthy useful language > >is not a theoretical exercise however. 
For an XML version of
> >VRML to be useful, there must be some requirements for it which
> >ISO VRML (VRML 97) does not meet.
>
> 'Different views of the same information' is applicable here. We're
> writing some 'avatar generation' software where users can make their own
> avatars. In the first instance the avatars will be 2D, with the later
> possibility of 3D and VRML avatars. The obvious choice is to have an XML
> based avatar description with can then be viewed via either 2D or VRML
> rendering depending on which XSL is used.
>
> Is anyone doing anything similar?

Not in XML. Producing 2D and 3D from XSL is a neat idea. Are you working with the H-ANIM spec?

The issue of domain-specific editors is a hot one in VRML for sure. VRML is a scene description language consisting of a set of geometric primitives, indexed face sets, extrusion, texture, script nodes, routes, anchors, etc. A file contains multiple hierarchies of group types (e.g., groups, transforms, etc.) in nested coordinate spaces. This is very good for describing the world generically, but at a level of abstraction not quick to learn or use. It requires a critical skill of geometric decomposition and assembly that takes considerable practice to master. Higher level editors are needed.

Yet, I really don't want to have to keep integrating multiple editor sources. An editor that can be initialized for different types of work in VRML would be a very good thing. Personally, I like the V-Realm Builder tree interface combined with the visualization and dialogs. That works very easily. XML can be a way to create different tree types so that not only avatar makers, but many kinds of higher level objects can be created in the same interface.

One big bonus of SGML tools, and I hope of XML tools, is DTD-enabled editors. But the one level of abstraction typical of DTD design of the eighties and early nineties has been a limit on what we could express. What is the status of architectural forms in XML right now?
My question: this kind of editor depends on a relatively fast rendering response as the editing tools are used. I wonder how much that slows down if XML editors are used with VRML/OpenGL rendering systems. len xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sat Jan 17 04:36:07 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:59:55 2004 Subject: Character Encoding and the XML PR (was Re: PR.xml) References: <3.0.1.16.19980116142105.3787196e@pop3.demon.co.uk> <199801161543.KAA01186@unready.microstar.com> <3.0.1.16.19980116160843.45e71988@pop3.demon.co.uk> <199801161638.LAA01795@unready.microstar.com> Message-ID: <34C026F7.CB0BDC78@jclark.com> David Megginson wrote: > AElfred accepts the following encodings, and to my > knowledge, supports them completely and correctly to the extent > allowed by Java's 16-bit characters and by surrogates: > > - UTF-8 > - ISO-10646-UCS-2 (both byte orders) > - ISO-10646-UCS-4 (four byte orders) > - UTF-16 > - ISO-8859-1 Are you saying that Java's 16-bit characters prevent complete support for some of those encodings in an XML parser? If so, I don't see why, since XML doesn't allow characters >= 0x110000, all legal XML characters are representable in UTF-16 and hence in Java. James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jeremie at netins.net Sat Jan 17 07:40:07 1998 From: jeremie at netins.net (Jeremie Miller) Date: Mon Jun 7 16:59:55 2004 Subject: Partial XML Processors (was Re: JavaScript parser update and Questions) Message-ID: <007101bd231a$f856c580$2801a8c0@jeremie.dbqglass.com> > [paragraphs removed] > >So you have the following choice: > - encode the *whole* spec (and nothing but the spec - i.e. no tricky >non-compliant extensions) and give yourself the label "conforming XML tool". > - encode the bits you feel are cost effective and label it "processes most >XML documents, but gives 'Sorry' messages for some". More questions/issues then: A well-formed XML document is not required to have a DTD, internal or external, correct? Is a well-formed parser not an XML parser that does not have access to or does not process a DTD, internal or external? I guess I haven't found a clear definition of what a well-formed parser is yet. If this is true, then a well-formed parser doesn't even have to acknowledge that entities exist except for the built in ones, and absolutely all whitespace is preserved, right? Thanks, Jeremie Miller jer@jeremie.com http://www.jeremie.com/ xml-dev: A list for W3C XML Developers. 

From richard at light.demon.co.uk Sat Jan 17 10:41:55 1998
From: richard at light.demon.co.uk (Richard Light)
Date: Mon Jun 7 16:59:56 2004
Subject: Partial XML Processors (was Re: JavaScript parser update and Questions)
In-Reply-To: <007101bd231a$f856c580$2801a8c0@jeremie.dbqglass.com>
Message-ID: <6nBs6KAgfIw0EwNw@light.demon.co.uk>

In message <007101bd231a$f856c580$2801a8c0@jeremie.dbqglass.com>, Jeremie Miller writes
>More questions/issues then:
>
>A well-formed XML document is not required to have a DTD, internal or
>external, correct? Is a well-formed parser not an XML parser that does not
>have access to or does not process a DTD, internal or external? I guess I
>haven't found a clear definition of what a well-formed parser is yet.
>
>If this is true, then a well-formed parser doesn't even have to acknowledge
>that entities exist except for the built in ones, and absolutely all
>whitespace is preserved, right?

I don't think that this line of reasoning is going to be very helpful to you.

While a well-formed XML document is not required to have a DTD, it is perfectly entitled to have one if it wants to. If this includes an internal DTD subset, this will be physically part of the file your parser is trying to parse. So it follows that _any_ XML parser has to be able to parse the internal subset correctly, simply to arrive reliably at the root element.

Your logic is tending towards a situation where your 'partial XML parser' would be unable to parse valid XML documents, which is certainly not what is intended.
The idea is that _every_ XML document is well-formed (and ergo parsable by a 'well-formed parser'), and some go on to the sunlit uplands of validity.

All whitespace is preserved by any XML parser. The only refinement is that a validating parser will flag some of this whitespace as 'ignorable'. It's up to the XML application to decide what it does with this information.

Richard Light.

Richard Light
SGML/XML and Museum Information Consultancy
richard@light.demon.co.uk

From peter at ursus.demon.co.uk Sat Jan 17 11:15:36 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:56 2004
Subject: Partial XML Processors (was Re: JavaScript parser update and Questions)
In-Reply-To: <007101bd231a$f856c580$2801a8c0@jeremie.dbqglass.com>
Message-ID: <3.0.1.16.19980117111002.302f89de@pop3.demon.co.uk>

At 01:38 17/01/98 -0600, Jeremie Miller wrote:
>> [paragraphs removed]
>
>>
>>So you have the following choice:
>> - encode the *whole* spec (and nothing but the spec - i.e. no tricky
>>non-compliant extensions) and give yourself the label "conforming XML
>tool".
>> - encode the bits you feel are cost effective and label it "processes most
>>XML documents, but gives 'Sorry' messages for some".

[... picking up some of David Durand's concerns ...]

I appreciate the strength of David's arguments and personally will wish to work with totally XML-compliant software. However it is a *lot* of work.

One design goal (4 in spec) is that it should be "easy to write programs which process XML documents".
If that is interpreted that it is "easy to write software that processes *all* XML documents, throwing errors wherever one is required", then that goal is already lost. For example, James Clark has come up with about 140 carefully incorrect XML documents for testing parsers. DavidM has said that AElfred spots 80% of them, but that the other 20% would increase AElfred's size and decrease its speed. [And probably involve the author in a lot more work.] I'm not making a moral judgment - simply reporting facts in the PD.

Personally I think that XML is overly complex for goal 4 and have been privileged to be able to say so on numerous occasions. However I accept the consensus and will do what I can to support it.

However, I think there will be domains where the full functionality (or at least the full syntax) of XML will not be used. In that case there will be "simple tools" that process XML documents. Not *all* XML documents, but a lot. It seems to me reasonable that these tools can tell the user if they can't process a document. It's common for compilers to say "sorry, this expression is just too complicated for me to deal with - you'll have to break it up a bit". I can see a tool saying "sorry, I don't deal with CDATA; please try another parser". [The reason I have several parsers running under JUMBO is that - at this stage - they all have things they can't do...]

The WG has (I think rightly) said that there should not be conformance levels in XML. [For those not familiar with SGML, there are a large number of different options, many of which are not supported by many parsers.] But I suspect there will be a number of tools which don't support the whole spec - this is a neutral statement. And there will be a number of documents that don't use the whole functionality of XML - this is also a neutral statement. We have frequently talked about the Desperate Perl Hacker writing tools which are sufficient to process a class of XML documents, but not all.
I can see convergence between these activities.

>
>More questions/issues then:
>
>A well-formed XML document is not required to have a DTD, internal or
>external, correct?

Correct. The inverse can be stated as "if a document does not have a DTD subset, then it can only be well-formed".

> Is a well-formed parser not an XML parser that does not
>have access to or does not process a DTD, internal or external? I guess I
>haven't found a clear definition of what a well-formed parser is yet.

I think we are all looking for enlightenment in this area. There are at least the following categories:

A Document + DTD + request to validate document. Requires a validating parser.
B Document + full DTD but no request to validate.
C Document + parts of a DTD (e.g. a few ELEMENTs and ATTLISTs, maybe an external subset which covers some of the ELEMENTs in the document).
D Document with no internal or external subset. Can only be well-formed.

What the difference between A and B is is not clear to me. IMO there are several people/robots who can urge that a document be validated (author/server/client/application/reader). What is clear is that *all the information in the DTDs must be processed and the document altered accordingly*. Note that Lark and AElfred both throw errors for

if bar.dtd cannot be found. This is reasonable (though frustrating) since bar.dtd can alter the information in the document.

NOTE BTW. If an entity is declared in both the internal and external subsets then the one in the internal subset is processed first. [This fooled me for some time because the IS occurs 'later' in the physical document...]

C is similar to B, but validation is not possible. It is *essential* that if ATTLISTs and ENTITYs (and NOTATION) exist, then the information in them MUST be applied to the document. I think it is here that the differences of opinion occur.
If I get a document with a NOTATION, I may just say "sorry, I can't grok NOTATION, so bomb out", but others see this as an unacceptable position.

D seems to me entirely acceptable. If there is no DTD subset, then a parser can be cleanly built which deals with exactly what is potentially carried in well-formed/no_subset documents [you can see we need a terminology here :-)]

>If this is true, then a well-formed parser doesn't even have to acknowledge
>that entities exist except for the built in ones,

NO. *IFF* an ENTITY is declared (case C), the parser MUST process it. Otherwise the content of the emitted information is incorrect. If a WF document contains a reference to an entity (e.g. &foo;) then a 'correct' document automatically falls into (C). A WF/no_subset parser can then only report that an undeclared entity was discovered (and that even if it had been declared, that parser couldn't manage it).

>and absolutely all whitespace is preserved, right?

Yes :-). The *application* can throw this away, the parser can't. So JUMBO will soon have the options "discard all PCDATA elements which contain only whitespace", or "ignore all [these elements] when emitted by a parser." A human has to press the button to make this happen :-).

NOTE that it is possible that a subset in a (C) document can contain enough information to detect what the parser could do with whitespace. Whether it should *act* on that information is unclear. For example, the single declaration:

says that FOO contains element content and therefore cannot contain PCDATA. Any whitespace PCDATA is therefore "ignorable". This information is not sufficient to *validate* the document (there are no declarations for BAR and PLUGH, for example). The declaration

allows PCDATA, so doesn't help much. Some people have argued for a content model which includes something like #ANYNONPCDATA, but that is not legal XML.

P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From digitome at iol.ie Sat Jan 17 11:39:18 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:59:56 2004
Subject: PR.xml
Message-ID: <199801171139.LAA14393@GPO.iol.ie>

I received this from David Durand, who cannot currently post to XML-DEV. I have David's permission to forward it to the list.

[David Durand]
>
>In a nutshell: 16-bit character codes. Diacritics (accents, Vowel signs in
>Devanagari, etc.) represented (preferably) as combining characters, although
>some precombined characters are available for compatibility with old documents
>and software (political concession). For compatibility with ISO's 32(31?) bit
>standard some escape sequences can include characters > 65535 (these are the
>"Surrogate" characters).
>
>You need the book for the details of handling bidirectional text rendering,
>word breaking, etc. etc. The scripts of the world cover a very wide space.
>

Sean Mc Grath
sean@digitome.com
Digitome Electronic Publishing
http://www.digitome.com

From ak117 at freenet.carleton.ca Sat Jan 17 12:08:19 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:56 2004
Subject: Character Encoding and the XML PR (was Re: PR.xml)
In-Reply-To: <34C026F7.CB0BDC78@jclark.com>
References: <3.0.1.16.19980116142105.3787196e@pop3.demon.co.uk> <199801161543.KAA01186@unready.microstar.com> <3.0.1.16.19980116160843.45e71988@pop3.demon.co.uk> <199801161638.LAA01795@unready.microstar.com> <34C026F7.CB0BDC78@jclark.com>
Message-ID: <199801171206.HAA00267@unready.microstar.com>

James Clark writes:

> Are you saying that Java's 16-bit characters prevent complete support
> for some of those encodings in an XML parser? If so, I don't see why,
> since XML doesn't allow characters >= 0x110000, all legal XML characters
> are representable in UTF-16 and hence in Java.

Quite right, I wasn't connecting the two -- Java supports UCS-4 only to the extent allowed by surrogates in UTF-16, but that's the limit in XML as well, so there should be no problem (at least, not until Unicode starts assigning codes >= 0x110000, in which case the problem will be both Java's and XML's).

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Sat Jan 17 12:21:40 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:56 2004
Subject: Partial XML Processors (was Re: JavaScript parser update and Questions)
In-Reply-To: <007101bd231a$f856c580$2801a8c0@jeremie.dbqglass.com>
References: <007101bd231a$f856c580$2801a8c0@jeremie.dbqglass.com>
Message-ID: <199801171219.HAA00320@unready.microstar.com>

Jeremie Miller writes:

> A well-formed XML document is not required to have a DTD, internal or
> external, correct? Is a well-formed parser not an XML parser that does not
> have access to or does not process a DTD, internal or external? I guess I
> haven't found a clear definition of what a well-formed parser is yet.

The PR is not very clear about processing requirements (other than error reporting and a few details like ignorable whitespace). As I understand things, however, a well-formed parser must be able to do the following:

1) Parse all of the grammar, including the document type declaration and internal DTD subset, without throwing spurious errors (even if it does nothing with the declarations).

2) Act correctly on the rmd parameter of the xml declaration.

3) Report a large range of errors, such as "]]>" in character data, "<" in an attribute value literal, illegal characters in element and attribute names, mismatched start- and end-tags, etc.

There is no provision for a conforming XML parser that does not do full error reporting, even if the parser correctly handles all XML constructions.
For example, AElfred parses a DTD, resolves all general and parameter entities, stores information on entities and notations, fills in defaulted attribute values, marks ignorable whitespace, and supports multiple character encodings, but it is a non-conforming XML parser because it does not report all required well-formedness errors.

> If this is true, then a well-formed parser doesn't even have to acknowledge
> that entities exist except for the built in ones, and absolutely all
> whitespace is preserved, right?

Yes, that is my understanding, except that the well-formed parser must check that the entity reference itself is well-formed. For example, if you found

&1front2;

you would be required to report a well-formedness error. You have to be prepared to check the whole range of Unicode characters, not just the first 256 (see the PR for what's allowed at the start and middle of a name). AElfred does not do this right now, because it would make the parser too large for use in applets (I added the support experimentally once, then removed it again).

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Sat Jan 17 12:29:14 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:56 2004
Subject: AElfred and External Entities (was Re: Partial XML Processors) and Questions)
In-Reply-To: <3.0.1.16.19980117111002.302f89de@pop3.demon.co.uk>
References: <007101bd231a$f856c580$2801a8c0@jeremie.dbqglass.com> <3.0.1.16.19980117111002.302f89de@pop3.demon.co.uk>
Message-ID: <199801171227.HAA00359@unready.microstar.com>

Peter Murray-Rust writes:

> Note that Lark and AElfred both throw errors for
>
> if bar.dtd cannot be found. This is reasonable (though frustrating) since
> bar.dtd can alter the information in the document.

Actually, AElfred gives you a hook for this: if you set the resolveEntity handler to return null for "bar.dtd", AElfred will not attempt to resolve the URL.

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From peter at ursus.demon.co.uk Sat Jan 17 14:31:19 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:56 2004
Subject: AElfred and External Entities (was Re: Partial XML Processors) and Questions)
In-Reply-To: <199801171227.HAA00359@unready.microstar.com>
References: <3.0.1.16.19980117111002.302f89de@pop3.demon.co.uk> <007101bd231a$f856c580$2801a8c0@jeremie.dbqglass.com> <3.0.1.16.19980117111002.302f89de@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980117142213.2cafd830@pop3.demon.co.uk>

At 07:27 17/01/98 -0500, David Megginson wrote:
>Peter Murray-Rust writes:
>
> > Note that Lark and AElfred both throw errors for
> >
> > if bar.dtd cannot be found. This is reasonable (though frustrating) since
> > bar.dtd can alter the information in the document.
>
>Actually, AElfred gives you a hook for this: if you set the
>resolveEntity handler to return null for "bar.dtd", AElfred will not
>attempt to resolve the URL.

Excellent. This is another example of an area where consistency amongst parser writers will be very useful.

P.

>
>All the best,
>
>David
>
>--
>David Megginson                 ak117@freenet.carleton.ca
>Microstar Software Ltd.         dmeggins@microstar.com
>      http://home.sprynet.com/sprynet/dmeggins/

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From jeremie at netins.net Sat Jan 17 15:44:17 1998
From: jeremie at netins.net (Jeremie Miller)
Date: Mon Jun 7 16:59:56 2004
Subject: Partial XML Processors (was Re: JavaScript parser update and Questions)
Message-ID: <004e01bd235e$9da3d620$2801a8c0@jeremie.dbqglass.com>

> [great discussion about well-formed parser and ignoring DTD stuff completely]

I guess I'm not the only one who is unsure about this. I would like to think that, based on the mentioned goal 4 of XML ("It shall be easy to write programs which process XML documents"), the true spirit of _only_ a well-formed parser is the requirement that it can process a well-formed document (and the proper error reporting, which I'm still working on). Now yes, *all* XML documents are well-formed, but the DTD stuff is extra information for a validating parser. If possible, it would be great for all well-formed parsers to at least process internal DTD information, but that is still an activity beyond well-formed XML documents.
Again, this is all just my opinion, so if it needs changing please let me know :)

What this stems from is the fact that my JavaScript XML parser has no _fundamental_ way of accessing URL's or files, and therefore no way of obtaining _anything_ external to the XML string it was given. So I am trying to do the best I can with that. The point I'm at is trying to figure out if I need to actually process internal DTD information; for now I just ignore it.

As it stands, my most immediate goal is simply to correctly process XML fragments, since it is beyond the functionality of JavaScript to be a full-blown XML parser for various reasons (encoding, file access, etc...).

Would it be worthwhile writing up a *simple* document clearly identifying all of the requirements for a well-formed parser and a validating parser, such as "It has to be able to handle CDATA, rmd, DTD's, recognize entities, and throw errors for X, X and X, etc..."? Or am I the only one who doesn't see _clearly_ the requirements and differences as defined in the XML spec?

Jer

From ak117 at freenet.carleton.ca Sat Jan 17 16:00:15 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:56 2004
Subject: Partial XML Processors (was Re: JavaScript parser update and Questions)
In-Reply-To: <004e01bd235e$9da3d620$2801a8c0@jeremie.dbqglass.com>
References: <004e01bd235e$9da3d620$2801a8c0@jeremie.dbqglass.com>
Message-ID: <199801171558.KAA01422@unready.microstar.com>

Jeremie Miller writes:

> If possible, it would be great for all
> well-formed parsers to at least process internal DTD information, but that
> is still an activity beyond well-formed XML documents. Again, this is all
> just my opinion, so if it needs changing please let me know :)

I don't think that you have to process any of the DTD information, but you do need to be able to parse it well enough to skip it.

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From tbray at textuality.com Sat Jan 17 16:03:46 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:59:56 2004
Subject: Partial XML Processors (was Re: JavaScript parser update and Questions)
Message-ID: <3.0.32.19980117080127.00ac6994@pop.intergate.bc.ca>

At 09:43 AM 17/01/98 -0600, Jeremie Miller wrote:
>> [great discussion about well-formed parser and ignoring DTD stuff

[uh, the discussion would be a bit better if some participants would read the XML PR; for example, there's no more RMD, and it is made very clear what a non-validating parser is required to do]

>What this stems from is the fact that my JavaScript XML parser has no
>_fundamental_ way of accessing URL's or files, and therefore no way of
>obtaining _anything_ external to the XML string it was given.

This is fine. Under no circumstances is a non-validating parser ever required to fetch external objects. *BUT*, you have to parse your way through the internal subset, and you have to do internal entities and default attributes declared therein.

-Tim

From tbray at textuality.com Sat Jan 17 16:26:53 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:59:56 2004
Subject: Conformance in XML processors
Message-ID: <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca>

I apologize in advance for being somewhat acerbic. I think that there are areas that the PR could be more clear, in particular what gets passed to the app, but what is required by way of DTD handling is pretty crystal clear, and blatantly incorrect statements being presented as facts at this point in history is dangerous and rather irritating. -Tim

At 11:10 AM 17/01/98, Peter Murray-Rust wrote:
>One design goal (4 in spec) is that it should be "easy to write programs
>which process XML documents". If that is interpreted that it is "easy to
>write software that processes *all* XML documents, throwing errors wherever
>one is required", then that goal is already lost. For example, James Clark
>has come up with about 140 carefully incorrect XML documents

... and both James' processor and Lark detect all 164 errors, modulo to-be-fixed ambiguities on weird boundary conditions. I will be astounded if, in the not-too-distant future, due to input from Microsoft and Netscape, every desktop doesn't come with a couple of fully conformant XML processors built-in. Yes, I agree that we didn't do as well on that design goal as I would have liked; but the empirical fact is that the software is already there.

>However, I think there will be domains where the full functionality (or at
>least the full syntax) of XML will not be used. In that case there will be
>"simple tools" that process XML documents.
Not *all* XML documents, but a
>lot.

If there are widely-available fully-conformant processors which are already there in the browser and OS, why would you want to use a "simple tool" which will fail to accept conformant documents? Seems like a way to lose customers, to me.

> It seems to me reasonable that these tools can tell the user if they
>can't process a document.

It seems highly unreasonable to me; if I create a legal XML document in my nice Frame or Arbortext or SoftQuad software, and send it to you, and you say "oooh icky, that's too complicated for poor little me" you can expect vehement and sincere complaints.

>But I suspect there will be a number of tools which don't support the whole
>spec

I doubt it. Ooops, clarification, there will be tons of tools which don't validate. But when it is the case that both major browsers accept all conformant documents and turf non-WF docs, then there will be de facto a culture that will be intolerant of broken tools. Thank goodness.

> We have frequently talked about the Desperate Perl Hacker
>writing tools which are sufficient to process a class of XML documents, but
>not all.

Yes, but they don't claim to be XML processors. And that's just fine.

>A Document + DTD + request to validate document. Requires a validating parser.

Right.

>B Document + full DTD but no request to validate.

Right. We assume this document is WF, right?

>C Document + parts of a DTD (e.g. a few ELEMENTs and ATTLISTs, maybe an
>external subset which covers some of the ELEMENTs in the document).

If no request to validate, the fact of missing

>D Document with no internal or external subset. Can only be well-formed.

Right.

>What the difference between A and B is is not clear to me.

Only the request to validate. Lots of WF docs will in fact be valid, but be called WF simply because some app has no need to validate.

>Note that Lark and AElfred both throw errors for
>
>if bar.dtd cannot be found.

No.
If you do lark.processExternalEntities(false) then it won't try to fetch the DTD. (Since "file:" URL's are in general a pool of blood on Microsoft operating systems, I recommend doing this most of the time).

>C is similar to B, but validation is not possible. It is *essential* that
>if ATTLISTs and ENTITYs (and NOTATION) exist, then the information in them
>MUST be applied to the document.

No. The spec is clear; a non-validating processor is required to do internal entities and default attribute values. Nobody should expect one to do anything with notations or unparsed entities or anything else. You want that, get a validating processor.

>*IFF* an ENTITY is declared (case C), the parser MUST process it.

If it's a non-validating processor, this is only true for *internal* entities.

-Tim

From jeremie at netins.net Sat Jan 17 17:39:13 1998
From: jeremie at netins.net (Jeremie Miller)
Date: Mon Jun 7 16:59:56 2004
Subject: Conformance in XML processors
Message-ID: <000e01bd236e$a9f321a0$2801a8c0@jeremie.dbqglass.com>

> [some content removed from Tim Bray]
>
>If there are widely-available fully-conformant processors which are
>already there in the browser and OS, why would you want to use a
>"simple tool" which will fail to accept conformant documents? Seems
>like a way to lose customers, to me.

Excellent point, but we're not quite there yet :)

The only reason I wrote a JavaScript XML parser, even with conformant Java-based parsers available, is simply that I wanted an extremely simple way to immediately access the contents of any XML fragment via JavaScript.
Once XML parsers are built into the browser and OS, it would be foolish to not use them. Until fully-conformant processors are widely available, I suspect we'll be seeing lots of "smaller" processors in use, mostly as a bridge to reach the point when it's a non-issue.

Speaking of XML processors in operating systems, would it be appropriate to have a libXML shared library for Unix systems? If so, is this being done?

> [more content removed]
>
>No. The spec is clear; a non-validating processor is required to
>do internal entities and default attribute values. Nobody should
>expect one to do anything with notations or unparsed entities or
>anything else. You want that, get a validating processor.
>

I think that's clear enough for me, but as you mentioned, after some time, an _only_ non-validating processor shouldn't exist, as validating processors should be in wide use and will act in a non-validating mode when a DTD is not available.

Jer

From papresco at technologist.com Sat Jan 17 17:57:27 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:56 2004
Subject: Partial XML Processors (was Re: JavaScript parser update and Questions)
References: <3.0.1.16.19980117111002.302f89de@pop3.demon.co.uk>
Message-ID: <34C0E011.D915BA3A@technologist.com>

Peter Murray-Rust wrote:
>
> It's common for compilers to say "sorry, this
> expression is just too complicated for me to deal with - you'll have to
> break it up a bit".

Compilers for which languages?
I mean I'm familiar with error messages like "This compiler doesn't support this language feature *yet*" and "this tool does not support that optional language feature" but a message like "this tool chooses not to support a required feature" is quite foreign to me. Such a tool simply has a bug in it. > The WG has (I think rightly) said that there should not be conformance > levels in XML. [For those not familiar with SGML, there are a large number > of different options, many of which are not supported by many parsers.] > But I suspect there will be a number of tools which don't support the whole > spec - this is a neutral statement. And there will be a number of documents > that don't use the whole functionality of XML - this is also a neutral > statement. We have frequently talked about the Desperate Perl Hacker > writing tools which are sufficient to process a class of XML documents, but > not all. > I can see convergence between these activities. The Desperate Perl Hacker is not trying to get at the structure of the XML document. They are working with it as a more or less "flat" text file. They are not trying to create an XML processor in the PR sense. Let's call what they do "XML massaging". In most cases, these tools will not even work with documents conforming to DTDs that they are not familiar with. The described JavaScript processor is in a very different range of tool. Paul Prescod -- "You have the wrong number." "Eh? Isn't that the Odeon?" "No, this is the Great Theater of Life. Admission is free, but the taxation is mortal. You come when you can, and leave when you must. The show is continuous. Good-night." -- Robertson Davies, "The Cunning Man" xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sat Jan 17 17:58:44 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:56 2004 Subject: Partial XML Processors (was Re: JavaScript parser update and Questions) References: <004e01bd235e$9da3d620$2801a8c0@jeremie.dbqglass.com> <199801171558.KAA01422@unready.microstar.com> Message-ID: <34C0E09C.6F54E150@technologist.com> David Megginson wrote: > > Jeremie Miller writes: > > > If possible, it would be great for all > > well-formed parsers to at least process internal DTD information, but that > > is still an activity beyond well-formed XML documents. Again, this is all > > just my opinion, so if it needs changing please let me know :) > > I don't think that you have to process any of the DTD information, but > you do need to be able to parse it well enough to skip it. You must also verify the subset's well-formedness. Paul Prescod -- "You have the wrong number." "Eh? Isn't that the Odeon?" "No, this is the Great Theater of Life. Admission is free, but the taxation is mortal. You come when you can, and leave when you must. The show is continuous. Good-night." -- Robertson Davies, "The Cunning Man" xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Sun Jan 18 01:08:30 1998 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors References: <000e01bd236e$a9f321a0$2801a8c0@jeremie.dbqglass.com> Message-ID: <34C155EC.8D1C68A@allette.com.au> Jeremie Miller wrote: > The only reason I wrote a JavaScript XML parser even with the existence of > conformant Java-based parsers available, is simply because I wanted an extremely > simple way to immediately access the contents of any XML fragment via > JavaScript. Once XML parsers are built into the browser and OS, it would be > foolish to not use them. I think there is a place for a process such as you have described and I agree that once conformant parsers are more readily available, you would probably want to utilise one. Discussion has tended to concentrate on either the 'conformant parser' scenario, or the 'Desperate Perl Hacker' - perhaps we just ignored the almost inevitable onset of tools falling between these extremes. The development of such tools should be encouraged, as long as their creators realise that in the end, they will have to either develop up to fully conformant or probably face relegation, unless the tool is only designed for a narrow and predictable band of (DPH-type) activity. Peter Murray-Rust wrote: > The WG has (I think rightly) said that there should not be conformance levels > in XML. [For those not familiar with SGML, there are a large number of > different options, many of which are not supported by many parsers.]
There are a number of areas in SGML where parsers either don't support certain features (or worse yet, don't support them consistently), but they're typically restricted to a high-level set (subdoc, rank, etc.), so can be regarded as part of the superset that doesn't impact on XML. I also agree with the sentiment that there shouldn't be conformance levels in XML, but I have always read that to mean that an XML parser either conforms or it doesn't. Given the relative simplicity of writing an XML parser (as compared to SGML), I think that's reasonable. It has been generally accepted that there is a place for non-conformant tools (à la DPH); perhaps we didn't anticipate them taking a form so close to the real thing. Despite rapid maturing, this is still a new market - there are plenty of applications still to be written and thrown away. Frenetic activity can only be good for the market, as long as everybody's aware of the boundaries. -- Regards Marcus Carr email: mrc@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774 _______________________________________________________________ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sun Jan 18 01:58:56 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors In-Reply-To: <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19980118015508.39773ba2@pop3.demon.co.uk> At 08:25 17/01/98 -0800, Tim Bray wrote: >I apologize in advance for being somewhat acerbic. I think that >there are areas that the PR could be more clear, in particular what >gets passed to the app, but what is required by way of DTD handling >is pretty crystal clear, and blatantly incorrect statements being presented >as facts at this point in history is dangerous and rather irritating. -Tim I apologise if I have made incorrect statements - I do read the spec and I tried to choose my words carefully and I also don't like upsetting people. > >At 11:10 AM 17/01/98, Peter Murray-Rust wrote: > >>One design goal (4 in spec) is that it should be "easy to write programs >>which process XML documents". If that is interpreted that it is "easy to I suspect that the word 'process' has caused some confusion. My reading is 'software that does something useful to some subset of users' whereas others have taken this to mean that any software that processes XML documents is a 'processor' in the words of the spec. My contribution was intended to address those people who were building processing software (NOT processors) and had designed them so that they had a function more limited than a full processor. 
> >If there are widely-available fully-conformant processors which are >already there in the browser and OS, why would you want to use a >"simple tool" which will fail to accept conformant documents? Seems >like a way to lose customers, to me. Not all XML applications will wish to use browsers - they may wish to call parsing functionality from C programs, UNIX shells and other places. I agree wholeheartedly that if XML libraries are universally available then there shouldn't be a problem. That is one reason why I'm keen to see SAX available in other languages than Java. However I have many colleagues who still use FORTRAN and other languages where I suspect it will be some time before a set of XML libraries becomes available. > >It seems highly unreasonable to me; if I create a legal XML document >in my nice Frame or Arbortext or SoftQuad software, and send it to you, >and you say "oooh icky, that's too complicated for poor little me" you >can expect vehement and sincere complaints. Perhaps my experience has been clouded by early exposure to C++, but it was extremely common there to find that different compilers had different functionality. If this is a non-problem for XML I rejoice. [...] > >> We have frequently talked about the Desperate Perl Hacker >>writing tools which are sufficient to process a class of XML documents, but >>not all. > >Yes, but they don't claim to be XML processors. And that's just fine. I did not claim that any of the software that I was talking about was an "XML processor". I talked about software that "processed XML documents". [I think the use of the term "processor" is confusing, as I believe that it is possible to process XML documents without using a "processor" ] If goal 4 actually means "all software that acts upon XML documents must be a bona fide XML processor" I would take issue. So I shall have to use a phrase like "act upon" if "process" has a specific meaning. P. > >>A Document + DTD + request to validate document.
Requires a validating parser. > >Right. > >>B Document + full DTD but no request to validate. > >Right. We assume this document is WF, right? > >>C Document + parts of a DTD (e.g. a few ELEMENTs and ATTLISTs, maybe an >>external subset which covers some of the ELEMENTs in the document). > >If no request to validate, the fact of missing is not required to have any effect, and applications must not depend >on any behavior contingent on the processing of an >>D Document with no internal or external subset. Can only be well-formed. > >Right. > >>What the difference between A and B is is not clear to me. > >Only the request to validate. Lots of WF docs will in fact be valid, >but be called WF simply because some app has no need to validate. > >>Note that Lark and AElfred both throw errors for >> >>if bar.dtd cannot be found. > >No. If you do lark.processEternalEntities(false) then it won't >try to fetch the DTD. (Since "file:" URL's are in general a pool of >blood on Microsoft operating systems, I recommend doing this most >of the time). > >>C is similar to B, but validation is not possible. It is *essential* that >>if ATTLISTs and ENTITYs (and NOTATION) exist, then the information in them >>MUST be applied to the document. > >No. The spec is clear; a non-validating processor is required to >do internal entities and default attribute values. Nobody should >expect one to do anything with notations or unparsed entities or >anything else. You want that, get a validating processor. > >>*IFF* an ENTITY is declared (case C), the parser MUST process it. > >If it's a non-validating processor, this is only true for *internal* >entities. > > -Tim > > > >xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From papresco at technologist.com Sun Jan 18 09:45:08 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors References: <3.0.1.16.19980118015508.39773ba2@pop3.demon.co.uk> Message-ID: <34C1CF54.46B00B1F@technologist.com> Peter Murray-Rust wrote: > > Not all XML applications will wish to use browsers - they may wish to call > parsing functionality from C programs, UNIX shells and other places. I > agree wholeheartedly that if XML libraries are universally available then > there shouldn't be a problem. That is one reason why I'm keen to see SAX > available in other languages than Java. However I have many colleagues > who still use FORTRAN and other languages where I suspect it will be some > time before a set of XML libraries becomes available. Hopefully most Fortran compilers will be able to link to C libraries. Another alternative is to pipe the data through a normalizer as we do for full SGML. Presumably even in Fortran a parser for normalized XML will not take more than two days to write.
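[Editorial illustration, not part of the original message.] The normalizer pipeline suggested above can be sketched in a few lines. This is a minimal sketch under stated assumptions: it uses Python's standard xml.etree.ElementTree (not any tool named in this thread), the function name normalize is hypothetical, and "normalized" is taken to mean only that the DOCTYPE is dropped, internal entities are expanded, and CDATA sections are flattened to escaped text.

```python
import xml.etree.ElementTree as ET


def normalize(xml_text):
    """Parse full XML and re-emit it in a plainer form.

    Assumption: the input is well-formed and uses only internal entities.
    The underlying expat parser expands internal entity references and
    turns CDATA sections into ordinary character data; serialising the
    tree again drops the DOCTYPE and escapes markup characters.
    """
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical input: one internal entity and one CDATA section.
    source = (
        '<!DOCTYPE m [<!ENTITY w "water">]>'
        "<m><n>&w;</n><f><![CDATA[H2O & more]]></f></m>"
    )
    print(normalize(source))
```

A downstream reader written in another language would then only need to cope with start tags, end tags, attributes and escaped text, which is the division of labour the message describes.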
> Perhaps my experience has been clouded by early exposure to C++, but it > was extremely common there to find that different compilers had different > functionality. If this is a non-problem for XML I rejoice. C++ became a standard only within the last few months. Everything that was labelled C++ up to then was a valiant attempt to track a moving target of epic complexity. Even now C++ compilers have wildly divergent feature sets because implementing one is so hard that it takes years to get it right (and the requisite years have not yet passed for some compilers). The important thing to note is that what we have in the meantime are "not right" (in other words, not true C++ compilers). Paul Prescod -- "You have the wrong number." "Eh? Isn't that the Odeon?" "No, this is the Great Theater of Life. Admission is free, but the taxation is mortal. You come when you can, and leave when you must. The show is continuous. Good-night." -- Robertson Davies, "The Cunning Man" xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sun Jan 18 12:28:26 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors In-Reply-To: <34C1CF54.46B00B1F@technologist.com> References: <3.0.1.16.19980118015508.39773ba2@pop3.demon.co.uk> Message-ID: <3.0.1.16.19980118122247.1fff0990@pop3.demon.co.uk> OK - let's have another go - I really am trying to be constructive, rather than argumentative. 
At 04:45 18/01/98 -0500, Paul Prescod wrote: >Peter Murray-Rust wrote: >> >> Not all XML applications will wish to use browsers - they may wish to call >> parsing functionality from C programs, UNIX shells and other places. I >> agree wholeheartedly that if XML libraries are universally available then >> there shouldn't be a problem. That is one reason why I'm keen to see SAX >> available in other languages than Java. However I have many colleagues >> who still use FORTRAN and other languages where I suspect it will be some >> time before a set of XML libraries becomes available. > >Hopefully most Fortran compilers will be able to link to C libraries. Exactly. And I hope that the community is able to develop them. [I am sure all the functionality is present already in SP, but I confess that as a novice to SGML I didn't find it easy to find my way around when I first looked at it. Treat that as a reflection on me.] >Another alternative is to pipe the data through a normalizer as we do >for full SGML. Presumably even in Fortran a parser for normalized XML >will not take more than two days to write. Exactly. IMO this is one of the attractions of SAX-C. And this is where I think we agree. It was precisely this normalised aspect of XML I was addressing. [I should make it clear that this is not hypothetical - I am confident that some sections of the molecular community will adopt XML, but only up to a certain level at the beginning.] A typical example of a WF, normalised, XML file might look like: O H H O1 H2 O1 H3 Doe Essentially such a file is a subset of the ESIS information (no attribute typing, no entities, no notation) and uses no CDATA or entity references. It is my contention that there will be many people (some will be DPHs) who will be quite happy to create XML files no more sophisticated than this and will want *tools* to *operate on* them. [I carefully avoid the use of the word "process" or "processor".] These tools may even be application-independent.
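[Editorial illustration, not part of the original message.] A tool of the limited kind described here might look like the following minimal Python sketch. Everything in it is an assumption made for illustration: the names read_simple and TooComplicated are hypothetical, as are the element names MOL and ATOM in the usage note (the archive has lost the markup of the original example). It is deliberately not an XML processor in the spec's sense; it reads only tags, attributes and plain text, and aborts with a reason when it meets anything else.

```python
import re


class TooComplicated(Exception):
    """Raised when the input uses a construct this reader does not handle."""


# Constructs the reader refuses outright (its declared limits).
UNSUPPORTED = {
    "<!DOCTYPE": "a document type declaration",
    "<![CDATA[": "a CDATA section",
    "<?": "a processing instruction or XML declaration",
    "&": "an entity or character reference",
}

# Either one tag (optional "/", name, raw attribute text, optional "/")
# or one run of character data.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)([^<>]*?)(/?)>|([^<]+)")
ATTR = re.compile(r"""\s+([\w.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def read_simple(text):
    """Return a flat list of start/text/end events, or abort with a reason."""
    for marker, what in UNSUPPORTED.items():
        if marker in text:
            raise TooComplicated(f"input contains {what}")
    events, open_tags, pos = [], [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise TooComplicated(f"unrecognised markup at offset {pos}")
        pos = m.end()
        closing, name, attrs, empty, chars = m.groups()
        if chars is not None:
            if chars.strip():
                events.append(("text", chars.strip()))
        elif closing:
            if attrs.strip() or empty or not open_tags or open_tags.pop() != name:
                raise TooComplicated(f"unexpected end tag </{name}>")
            events.append(("end", name))
        else:
            if ATTR.sub("", attrs).strip():
                raise TooComplicated(f"unrecognised attribute syntax in <{name}>")
            pairs = {k: a or b for k, a, b in ATTR.findall(attrs)}
            events.append(("start", name, pairs))
            if empty:
                events.append(("end", name))
            else:
                open_tags.append(name)
    if open_tags:
        raise TooComplicated(f"unclosed element <{open_tags[-1]}>")
    return events
```

For example, read_simple('<MOL><ATOM ID="O1">O</ATOM></MOL>') yields a start event for MOL, a start event for ATOM carrying the ID attribute, a text event, and the two end events, while an input containing a CDATA section raises TooComplicated instead of being silently misread.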
I was simply making the case that in my opinion there is a role for such tools, and that it is perfectly reasonable for such tools to say "I can operate on certain types of XML file. If I come across a more complicated one, I'll abort and tell you." > >> Perhaps my experience has been clouded by early exposure to C++, but it >> was extremely common there to find that different compilers had different >> functionality. If this is a non-problem for XML I rejoice. > >C++ became a standard only within the last few months. Everything that >was labelled C++ up to then was a valiant attempt to track a moving >target of epic complexity. Even now C++ compilers have wildly divergent >feature sets because implementing one is so hard that it takes years to >get it right (and the requisite years have not yet passed for some >compilers). The important thing to note is that what we have in the >meantime are "not right" (in other words, not true C++ compilers). Agreed. I am reassured that the creation of standard tools for XML, XLL, XSL and so on will be much less arduous. > > Paul Prescod >-- >"You have the wrong number." >"Eh? Isn't that the Odeon?" >"No, this is the Great Theater of Life. Admission is free, but the >taxation is mortal. You come when you can, and leave when you must. The >show is continuous. Good-night." -- Robertson Davies, "The Cunning Man" > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sun Jan 18 19:20:11 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors References: <3.0.1.16.19980118015508.39773ba2@pop3.demon.co.uk> <3.0.1.16.19980118122247.1fff0990@pop3.demon.co.uk> Message-ID: <34C255B7.69F4EC@technologist.com> Peter Murray-Rust wrote: > Exactly. And I hope that the community is able to develop them. [I am sure > all the functionality is present already in SP, but I confess that as a > novice to SGML I didn't find it easy to find my way around when I first > looked at it. Treat that as a reflection on me.] I believe that James Clark has already done most of this work in his XML tokenizer (which is distinct from SP). I think that we have different ideas about what normalized will look like. This is what you are thinking of: > > > O > H > H > O1 H2 > O1 H3 > Doe > This is what I am thinking of: O H< /ATOM> H O1 H2 O1 H3 Doe In other words, I am thinking about a subset of XML so simple that it is trivial to parse and so annoying that no human being would ever want to type it directly except for testing out their "reader". I would explicitly disallow the magical incantation to discourage people from piping in ordinary XML documents (and thus from thinking that this reader is making any attempt to be an XML processor). > Essentially such a file is a subset of the ESIS information (no attribute > typing, no entities, no notation) and uses no CDATA or entity references. 
> It is my contention that there will be many people (some will be DPHs) who > will be quite happy to create XML files no more sophisticated than this and > will want *tools* to *operate on* them. Right, I don't think that these tools should be constructed except as a stopgap. There is no good reason that these tools should not support all of XML. When people write these simple XML documents and find that their tools will not support more, they will inevitably get confused (just as most people do with C++) about exactly what XML *is*. I proposed a processor in Fortran that only accepts the output of a normalizer, but I do not think that it should be billed as an XML processor, any more than a Fortran program that accepts ESIS would be called an SGML parser. The documentation should say: "This Fortran program accepts the output of xmlnorm" and leave it at that. In other words, xmlnorm becomes an implicit component in the system. Given these options, I'm not sure why users should accept any tools that claim partial support for XML...to put it another way, human beings should never have to worry about the limitations of their tools when they are typing XML. Paul Prescod From M.H.Kay at eng.icl.co.uk Mon Jan 19 00:00:54 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 16:59:56 2004 Subject: Partial XML Processors (was Re: JavaScript parser update and Questions) Message-ID: <01bd246d$5f3cd7a0$0f11e391@mhklaptop.bra01.icl.co.uk> Peter Murray-Rust wrote: >One design goal (4 in spec) is that it should be "easy to write programs >which process XML documents".
If that is interpreted that it is "easy to >write software that processes *all* XML documents, throwing errors wherever >one is required", then that goal is already lost. For example, James Clark >has come up with about 140 carefully incorrect XML documents for testing >parsers. DavidM has said that AElfred spots 80% of them, but that the other >20% would increase AElfred's size and decrease its speed. To be pedantic, this shows only that it is difficult to write software that correctly processes all non-XML documents. For a great many purposes, I don't care what the software does with a non-XML document because I don't intend to generate such things in the first place. But I do think it is important that correct XML documents are processed correctly. Mike Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Mon Jan 19 00:52:57 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors References: <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> Message-ID: <34C1C1B3.A8F798D7@jclark.com> Tim Bray wrote: > No. The spec is clear; a non-validating processor is required to > do internal entities and default attribute values. Nobody should > expect one to do anything with notations or unparsed entities or > anything else. You want that, get a validating processor. I think I would expect a non-validating processor to inform the application about any unparsed entities that have been declared in the internal subset. A non-validating processor has to keep track of this, because it has to report an error if any of them are referenced. 
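[Editorial illustration, not part of the original message.] The bookkeeping described in the preceding sentences can be sketched roughly as follows. This is a minimal sketch, assuming the internal subset and the document content are already available as separate strings; the function names, the regular expressions and the sample declarations are all hypothetical, and a real processor would tokenise the subset properly rather than pattern-match it.

```python
import re

# An unparsed entity declaration: external identifier followed by NDATA.
NDATA_DECL = re.compile(
    r"""<!ENTITY\s+([\w.:-]+)\s+"""
    r"""(?:SYSTEM\s+(?:"[^"]*"|'[^']*')"""
    r"""|PUBLIC\s+(?:"[^"]*"|'[^']*')\s+(?:"[^"]*"|'[^']*'))"""
    r"""\s+NDATA\s+([\w.:-]+)\s*>"""
)
# A general entity reference such as &logo; (character references excluded).
ENTITY_REF = re.compile(r"&([A-Za-z_:][\w.:-]*);")


def unparsed_entities(internal_subset):
    """Map each declared unparsed entity name to its notation name."""
    return dict(NDATA_DECL.findall(internal_subset))


def check_references(internal_subset, content):
    """Return the declared unparsed entities and any illegal references.

    The table is what could be passed on to the application; the error
    list covers references in content to an unparsed entity, which is a
    well-formedness error.
    """
    declared = unparsed_entities(internal_subset)
    errors = [
        f"reference to unparsed entity &{name};"
        for name in ENTITY_REF.findall(content)
        if name in declared
    ]
    return declared, errors
```

Since the table has to exist anyway for the error check, handing it to the application costs the parser very little extra work.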
I don't see that passing this information on to the application has anything to do with validation. The only things I expect of a validating processor that I don't expect of a non-validating processor are: - processing the external DTD subset - processing parameter entity references - reporting violations of validity constraints - reporting that character data is ignorable white space James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Jan 19 01:36:13 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors In-Reply-To: <34C1C1B3.A8F798D7@jclark.com> References: <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> <34C1C1B3.A8F798D7@jclark.com> Message-ID: <199801190134.UAA00339@unready.microstar.com> James Clark writes: > The only things I expect of a validating processor that I don't expect > of a non-validating processor are: > > - processing the external DTD subset > > - processing parameter entity references > > - reporting violations of validity constraints > > - reporting that character data is ignorable white space I think that James would have to add "processing external text entities" to that list, since the PR labels them as "included if validating," implying that non-validating parsers need not include them. In other words, to turn your list (with my addition) on its head, a conforming, non-validating (or well-formed) XML parser is required to support nearly everything in the PR, with only the following exceptions: [Information Returned] 1. It is not required to process the external DTD subset. 2. 
It is not required to process parameter entity references. 3. It is not required to report that character data is ignorable white space. 4. It is not required to process external text entities. [Error Reporting] 5. It is not required to report violations of validity constraints. The validity constraints -- that it is not required to report -- relate largely to the DTD, especially (but not exclusively) to attribute declarations and content models. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Mon Jan 19 03:48:33 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors References: <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> <34C1C1B3.A8F798D7@jclark.com> <199801190134.UAA00339@unready.microstar.com> Message-ID: <34C2C7B7.A1BE8D41@jclark.com> David Megginson wrote: > I think that James would have to add "processing external text > entities" to that list, since the PR labels them as "included if > validating," implying that non-validating parsers need not include > them. Yes, you're right: the PR doesn't appear to require this. I am a bit surprised. I thought at one stage the spec said that the parser had to be able to do this if requested by the application/user, and I would expect this capability of any general purpose XML processor. James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Jan 19 14:01:50 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors In-Reply-To: <34C2C7B7.A1BE8D41@jclark.com> References: <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> <34C1C1B3.A8F798D7@jclark.com> <199801190134.UAA00339@unready.microstar.com> <34C2C7B7.A1BE8D41@jclark.com> Message-ID: <199801191359.IAA00400@unready.microstar.com> James Clark writes: > > I think that James would have to add "processing external text > > entities" to that list, since the PR labels them as "included if > > validating," implying that non-validating parsers need not include > > them. > > Yes, you're right: the PR doesn't appear to require this. I am a bit > surprised. I thought at one stage the spec said that the parser had to > be able to do this if requested by the application/user, and I would > expect this capability of any general purpose XML processor. This is really part of a broader issue. There is a set of XML features -- external text entities, NDATA entities, notations, and ID/IDREF -- that are absolutely basic for typical SGML documents. 
When people suggest that these are 'advanced' features in XML, then one of the following two statements must apply: 1) they expect that XML documents will not typically include internal cross-references; that they will not include graphics, sound, video, or other non-XML material; and that they will consist of only a single physical file; or 2) they believe that ID/IDREF, NDATA entities, notations, and external text entities are overly-complicated SGML relics left in to satisfy a few pedants on the WG, and that XML and XLL provide other, simpler mechanisms with the same functionality. Clarification from the WG would be helpful here. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Jan 19 15:44:06 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:59:56 2004 Subject: Conformance in XML processors References: <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> <34C1C1B3.A8F798D7@jclark.com> <199801190134.UAA00339@unready.microstar.com> <34C2C7B7.A1BE8D41@jclark.com> <199801191359.IAA00400@unready.microstar.com> Message-ID: <34C373EE.DE54BCCB@technologist.com> David Megginson wrote: > one of the following two statements must apply: > > 1) they expect that XML documents will not typically include internal > cross-references; that they will not include graphics, sound, > video, or other non-XML material; and that they will consist of > only a single physical file; or > > 2) they believe that ID/IDREF, NDATA entities, notations, and 
> external text entities are overly-complicated SGML relics left in to satisfy
> a few pedants on the WG, and that XML and XLL provide other, simpler
> mechanisms with the same functionality.
>
> Clarification from the WG would be helpful here.

I don't think you can expect a coherent response from a diverse group of people on a family of features. Also, the former statement is ridiculous and the latter actually implies that the WG is divided on the issue.

There is a third way to read the situation: the optionality of the features works to reassure people that XML processing is simple, but the usefulness of them will encourage users to request them. (idea: one easy way to encourage vendors to implement them is to depend upon them in XLL) For instance XLL depends on ID/IDREF.

My personal feeling is:

> external text entities,

Absolutely vital. Expansion should be expected of all processors. XML-DEV members should apply pressure on processor vendors to handle this, despite the laxity of the PR.

> NDATA entities,

I suspect that the web market will be more amenable to embedded URLs.

> notations,

MIME/HTTP can handle this.

> ID/IDREF

Others seem to prefer to push this onto the application side. I don't feel strongly enough about it to argue with them. It is clearly a semantic restriction and there are various reasons that those are often better handled on the application side.

Paul Prescod

--
"You have the wrong number."
"Eh? Isn't that the Odeon?"
"No, this is the Great Theater of Life. Admission is free, but the
taxation is mortal. You come when you can, and leave when you must. The
show is continuous. Good-night."
-- Robertson Davies, "The Cunning Man"
From ak117 at freenet.carleton.ca Mon Jan 19 16:12:53 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:56 2004
Subject: Conformance in XML processors
In-Reply-To: <34C373EE.DE54BCCB@technologist.com>
References: <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> <34C1C1B3.A8F798D7@jclark.com> <199801190134.UAA00339@unready.microstar.com> <34C2C7B7.A1BE8D41@jclark.com> <199801191359.IAA00400@unready.microstar.com> <34C373EE.DE54BCCB@technologist.com>
Message-ID: <199801191609.LAA01328@unready.microstar.com>

Paul Prescod writes:

> There is a third way to read the situation: the optionality of the
> features works to reassure people that XML processing is simple, but the
> usefulness of them will encourage users to request them. (idea: one easy
> way to encourage vendors to implement them is to depend upon them in
> XLL) For instance XLL depends on ID/IDREF.

This doesn't really address the point, though. If notations and data attributes are optional, then either support for embedding non-XML objects is also optional, or notations and data attributes are not the preferred way of embedding non-XML objects. If they are not the preferred way (you probably rightly suggest that embedded URLs and MIME/HTTP will be more popular), then why does the spec include them at all, and cause so much unnecessary confusion among non-SGML people?

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From matthewg at poet.de Mon Jan 19 16:20:29 1998
From: matthewg at poet.de (Matthew Gertner)
Date: Mon Jun 7 16:59:56 2004
Subject: Conformance in XML processors
Message-ID: <01bd24f5$c6862730$a00b0ac0@pharcyde.poetsoftware.xo.com>

David Megginson wrote:

>This doesn't really address the point, though. If notations and data
>attributes are optional, then either support for embedding non-XML
>objects is also optional, or notations and data attributes are not the
>preferred way of embedding non-XML objects. If they are not the
>preferred way (you probably rightly suggest that embedded URLs and
>MIME/HTTP will be more popular), then why does the spec include them
>at all, and cause so much unnecessary confusion among non-SGML people?

I wonder about this as well. The only explanation I have heard is that people should be forced to include a list of external entities in the document prolog, since they will regret it 20 years down the road if they don't. I must say, this sounds a tad prescriptive to me...

Matthew
From ak117 at freenet.carleton.ca Mon Jan 19 17:27:26 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:57 2004
Subject: Conformance in XML processors
In-Reply-To: <01bd24f5$c6862730$a00b0ac0@pharcyde.poetsoftware.xo.com>
References: <01bd24f5$c6862730$a00b0ac0@pharcyde.poetsoftware.xo.com>
Message-ID: <199801191724.MAA01638@unready.microstar.com>

Matthew Gertner writes:

> >This doesn't really address the point, though. If notations and data
> >attributes are optional, then either support for embedding non-XML
> >objects is also optional, or notations and data attributes are not the
> >preferred way of embedding non-XML objects. If they are not the
> >preferred way (you probably rightly suggest that embedded URLs and
> >MIME/HTTP will be more popular), then why does the spec include them
> >at all, and cause so much unnecessary confusion among non-SGML people?
>
> I wonder about this as well. The only explanation I have heard is that
> people should be forced to include a list of external entities in the
> document prolog, since they will regret it 20 years down the road if they
> don't. I must say, this sounds a tad prescriptive to me...

Actually, given the instability of links on the WWW, you will likely regret it 20 days down the road if you do not collect references in one place (preferably a separate physical file), but there are ways to do so using either approach. Personally, I am comfortable with NDATA entities and notations, but I get the impression that people are (informally) deprecating them.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd.
dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From elm at arbortext.com Mon Jan 19 17:41:51 1998
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 16:59:57 2004
Subject: Conformance in XML processors
In-Reply-To: <98Jan18.203749est.18817@thicket.arbortext.com>
References: <34C1C1B3.A8F798D7@jclark.com> <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> <34C1C1B3.A8F798D7@jclark.com>
Message-ID: <3.0.5.32.19980119124143.00b1c5c0@village.doctools.com>

At 08:34 PM 1/18/98 -0500, David Megginson wrote:
...
>In other words, to turn your list (with my addition) on its head, a
>conforming, non-validating (or well-formed) XML
>parser is required to support nearly everything in the PR, with only
>the following exceptions:
...

I'm struck by the persistence of the usage of "well-formed processor" in this forum. (I realize David is merely quoting the familiar form for those who think of it this way...) Obviously, it's not very comfortable to name something (a "non-validating XML processor") by the *absence* of a behavior.

Would it make sense to work the terminology as follows, for clarity's sake?

o XML processor (or XML parser): A software component that parses and checks for well-formedness. The minimum of what a "non-validating XML processor" is supposed to do today.

o XML validator: A software component that checks only for validity, using input from an XML processor. What a "validating XML processor" is supposed to do today.

Eve
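The split Eve proposes, a processor that only checks well-formedness feeding a separate validator, can be sketched in Python. This is a minimal illustration and nothing from the thread itself: `parse_well_formed`, `validate`, and the `RULES` table are hypothetical names, and the "validator" here is a toy check of permitted child element names standing in for real DTD validation, which the standard library's `xml.etree.ElementTree` does not perform.

```python
import xml.etree.ElementTree as ET

# Hypothetical content rules: element name -> set of permitted child names.
# A toy stand-in for a DTD, used only to illustrate the division of labour.
RULES = {
    "memo": {"to", "body"},
    "to": set(),
    "body": set(),
}


def parse_well_formed(text):
    """'XML processor' role: return the root element, or None if the
    input is not well-formed."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None


def validate(root, rules):
    """'XML validator' role: consume the processor's output and check
    only validity (here, declared elements and permitted children)."""
    for elem in root.iter():
        allowed = rules.get(elem.tag)
        if allowed is None:
            return False  # element not declared in the rules
        if any(child.tag not in allowed for child in elem):
            return False  # child not permitted inside this element
    return True


if __name__ == "__main__":
    samples = [
        "<memo><to>a</to><body>b</body></memo>",
        "<memo><cc>a</cc></memo>",
        "<memo><to>a</memo>",
    ]
    for text in samples:
        root = parse_well_formed(text)
        if root is None:
            print("not well-formed:", text)
        else:
            print("valid" if validate(root, RULES) else "invalid", ":", text)
```

Keeping the two functions separate mirrors the proposed terminology: the validator never sees raw text, only what the processor hands it.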
From dvp4c at jefferson.village.virginia.edu Mon Jan 19 17:59:59 1998
From: dvp4c at jefferson.village.virginia.edu (Daniel Pitti)
Date: Mon Jun 7 16:59:57 2004
Subject: XSL
In-Reply-To: <3.0.5.32.19980119124143.00b1c5c0@village.doctools.com>
References: <98Jan18.203749est.18817@thicket.arbortext.com> <34C1C1B3.A8F798D7@jclark.com> <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> <34C1C1B3.A8F798D7@jclark.com>
Message-ID: <3.0.1.32.19980119131208.006f74d8@jefferson.village.virginia.edu>

Is this list a legitimate place to discuss MSXSL? Or is such being directed elsewhere?

Daniel V. Pitti
Project Director
Institute for Advanced Technology in the Humanities
Alderman Library
University of Virginia
Charlottesville, Virginia 22903
Phone: 804 924-6594
Fax: 804 982-2363
Email: dpitti@Virginia.edu

From davidsch at microsoft.com Mon Jan 19 19:17:39 1998
From: davidsch at microsoft.com (David Schach)
Date: Mon Jun 7 16:59:57 2004
Subject: XSL
Message-ID: <5CEA8663F24DD111A96100805FFE658702FCCB0B@red-msg-51.dns.microsoft.com>

> Is this list a legitimate place to discuss MSXSL? Or is such being
> directed
> elsewhere?
>

Yes. You can also send feedback to msxsl@microsoft.com.
From davidsch at microsoft.com Mon Jan 19 19:19:47 1998
From: davidsch at microsoft.com (David Schach)
Date: Mon Jun 7 16:59:57 2004
Subject: XSL
Message-ID: <5CEA8663F24DD111A96100805FFE658702FCCB1A@red-msg-51.dns.microsoft.com>

> Is this list a legitimate place to discuss MSXSL? Or is such being
> directed
> elsewhere?
>

Yes, but you can also send feedback to msxsl@microsoft.com.

From papresco at technologist.com Mon Jan 19 20:12:22 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:57 2004
Subject: Conformance in XML processors
References: <3.0.32.19980117082505.00ae332c@pop.intergate.bc.ca> <34C1C1B3.A8F798D7@jclark.com> <199801190134.UAA00339@unready.microstar.com> <34C2C7B7.A1BE8D41@jclark.com> <199801191359.IAA00400@unready.microstar.com> <34C373EE.DE54BCCB@technologist.com> <199801191609.LAA01328@unready.microstar.com>
Message-ID: <34C3B3BF.CD918BF5@technologist.com>

David Megginson wrote:
>
> This doesn't really address the point, though. If notations and data
> attributes are optional, then either support for embedding non-XML
> objects is also optional, or notations and data attributes are not the
> preferred way of embedding non-XML objects.

The former is officially the case.
After all, XML presents no other mechanism for embedding non-XML objects. XLL does not yet exist, and when it does, it will not be part of XML and is thus optional *by definition*. So object embedding is definitely optional.

OTOH, I am confident that the latter is also a factor (probably the dominant factor) in some people's minds. I think that these features are optional for all of the usual reasons features are made optional in any language:

a) not everyone needs them
b) some people say they do need them
c) some people don't want to implement them
d) we aren't confident that they are actually appropriate
e) we aren't confident that they aren't.

Optionality gives the market a chance to decide. Of course XML's design documents say it shouldn't have optional features, but IMHO that criterion was shot when the distinction between well-formedness parsers and validating parsers was invented.

Paul Prescod

--
"You have the wrong number."
"Eh? Isn't that the Odeon?"
"No, this is the Great Theater of Life. Admission is free, but the
taxation is mortal. You come when you can, and leave when you must. The
show is continuous. Good-night."
-- Robertson Davies, "The Cunning Man"
From matthewg at poet.de Mon Jan 19 21:00:17 1998
From: matthewg at poet.de (Matthew Gertner)
Date: Mon Jun 7 16:59:57 2004
Subject: Conformance in XML processors
Message-ID: <01bd2518$73aaad10$0100007f@pharcyde.poetsoftware.xo.com>

David Megginson wrote:

>Actually, given the instability of links on the WWW, you will likely
>regret it 20 days down the road if you do not collect references in
>one place (preferably a separate physical file), but there are ways to
>do so using either approach. Personally, I am comfortable with NDATA
>entities and notations, but I get the impression that people are
>(informally) deprecating them.

Perhaps I exaggerated somewhat. ;-) In any case, I think your conclusion is right: there is some merit to keeping references in one place, particularly for documents of a non-transient nature, but people simply don't do it in most cases. It would be trivial to create tags for "reference aliases" if this is desired, so why do we need another syntax? Anyway, I don't see this changing, so I guess we will have to live with two ways of doing the same thing.

Matthew
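Matthew's remark that "reference aliases" could be done with ordinary tags can be made concrete with a small Python sketch. Everything here is an assumption for illustration: the `alias` and `ref` names, the sample document, and the `example.org` URLs are invented, and no such convention is proposed anywhere in the thread. The point is only that a table of named references kept in one place can be resolved by the application with a few lines of code.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every external reference is declared once in
# <aliases>, and the body refers to it by name through a "ref" attribute.
DOC = """\
<report>
  <aliases>
    <alias name="logo" href="http://example.org/images/logo.gif"/>
    <alias name="spec" href="http://example.org/docs/spec.html"/>
  </aliases>
  <body>
    <figure ref="logo"/>
    <cite ref="spec"/>
    <cite ref="missing"/>
  </body>
</report>
"""


def resolve_aliases(xml_text):
    """Return (tag, alias name, href or None) for each element that
    carries a ref attribute, in document order."""
    root = ET.fromstring(xml_text)
    # Build the alias table from the single place where links are kept.
    table = {a.get("name"): a.get("href") for a in root.iter("alias")}
    resolved = []
    for elem in root.iter():
        ref = elem.get("ref")
        if ref is not None:
            resolved.append((elem.tag, ref, table.get(ref)))
    return resolved


if __name__ == "__main__":
    for tag, ref, href in resolve_aliases(DOC):
        print(tag, ref, href or "(unresolved)")
```

An unresolved name comes back as `None`, which is the application-side equivalent of a reference to an undeclared entity.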
From M.H.Kay at eng.icl.co.uk Tue Jan 20 11:14:29 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 16:59:57 2004
Subject: Conformance in XML processors
Message-ID: <01bd2594$a4cc93a0$1e09e391@mhklaptop.bra01.icl.co.uk>

Eve Maler wrote:

>I'm struck by the persistence of the usage of "well-formed processor" in
>this forum. (I realize David is merely quoting the familiar form for those
>who think of it this way...) Obviously, it's not very comfortable to name
>something (a "non-validating XML processor") by the *absence* of a behavior.
>
>Would it make sense to work the terminology as follows, for clarity's sake?
>
>o XML processor (or XML parser): A software component that parses and
>checks for well-formedness. The minimum of what a "non-validating XML
>processor" is supposed to do today.
>
>o XML validator: A software component that checks only for validity, using
>input from an XML processor. What a "validating XML processor" is supposed
>to do today.
>

Given that there are three kinds of document:

- incorrect (E)
- well-formed but invalid (W)
- well-formed and valid (V)

I can identify at least the following kinds of program:

- accepts all W and V, rejects all E ("XML syntax parser")
- accepts all V, rejects all E and W ("XML validating parser")
- accepts all W and V, rejects some E ("liberal XML syntax parser")
- accepts all V, rejects some E and W ("liberal XML validating parser")
- accepts some W and V but rejects others ("subset XML parser")

A useful subcategory of "subset XML parser" is one that accepts all W and/or V documents that use a particular DTD (and has undefined behaviour on those that do not).
We could call this an "XML application".

Mike Kay

From h.rzepa at ic.ac.uk Tue Jan 20 11:27:57 1998
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 16:59:57 2004
Subject: ADMIN: xml list, Lyris alternative to Majordomo.
Message-ID:

Dear all,

As you might have noticed, list admin software needs to be increasingly sophisticated, to cope with the problems that might occur, whether accidental, or deliberately malicious. We have a new listserver called Lyris (http://www.lyris.com/) running in trial mode, which might allow a better service for xml-dev.

If anyone has any experiences with this listserver which you feel we might benefit from, please send either to me or to Martyn Hampson (m.hampson@ic.ac.uk). In particular, we would be interested in the migration protocol from say Majordomo to Lyris, and any difficulties in that area.

Many thanks.
From Arjan.Loeffen at let.ruu.nl Tue Jan 20 12:37:07 1998
From: Arjan.Loeffen at let.ruu.nl (Arjan Loeffen)
Date: Mon Jun 7 16:59:57 2004
Subject: white space
Message-ID: <34C48DAE.B9AA3F19@let.ruu.nl>

Dear reader,

am I right in assuming that white space in element content should be ignored when XML validation is required (whatever the setting for xml:space), and that the xml:space attribute applies to whitespace in mixed content when validating, and to *all* whitespace when not validating (i.e. testing well-formedness only)?

Thanks in advance,

Arjan.

--
Arjan Loeffen
Computer & Arts, Faculty of Arts, Utrecht University
Arjan.Loeffen@let.ruu.nl
http://CandL.let.ruu.nl

From murata at apsdc.ksp.fujixerox.co.jp Tue Jan 20 12:54:41 1998
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 16:59:57 2004
Subject: white space
In-Reply-To: <34C48DAE.B9AA3F19@let.ruu.nl>
Message-ID: <9801201254.AA03341@lute.apsdc.ksp.fujixerox.co.jp>

Arjan Loeffen writes:

>am I right in assuming that white space in element content should be
>ignored when XML validation is required (whatever the setting for
>xml:space), and that the xml:space attribute applies to whitespace in
>mixed content when validating, and to *all* whitespace when not
>validating (i.e.
>testing well-formedness only)?

Actually, white space in element content is always passed to applications. xml:space has nothing to do with the behaviour of processors. xml:space merely provides a hint to applications. The word "significant" in the third para of 2.10 of the XML PR has to be changed.

Validating processors must distinguish white space in element content from other white space and must signal to the application that white space appeared in element content. Applications will probably ignore such white space, but XPointer will not.

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

From vortura at yahoo.com Tue Jan 20 16:56:25 1998
From: vortura at yahoo.com (Tim Keefe)
Date: Mon Jun 7 16:59:57 2004
Subject: XML Help
Message-ID: <19980120165556.17488.rocketmail@send1b.yahoomail.com>

Hi everyone,

I currently code in SGML (although I haven't written DTD's or anything like that) and have done so for years and I also know HTML but not as well as SGML. So, I have a general question, how does everyone suggest that I get in on the ground floor of using XML? It's a bit hard to keep up on where things are at with all of this, so does anyone have any suggestions for being one of the first people to really use XML?

Any and all help would be greatly appreciated. For once I'd like to get in on the beginning of something.

Thanks
From cbullard at hiwaay.net Tue Jan 20 21:28:13 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:59:57 2004
Subject: VRML and XML
References: <01bd2518$73aaad10$0100007f@pharcyde.poetsoftware.xo.com>
Message-ID: <34C516A0.33A6@hiwaay.net>

Something for the DOM experts to chew on:

Does DOM cover or consider requirements for multi-user (simultaneous users) document objects? VRML groups (eg Living Worlds) have some interesting proposals as to requirements for multi-user worlds. These may be generalizable.

I am thinking in terms of collaborative apps here where pilot/drone pairs are manipulating a shared description.

len bullard

From peter at ursus.demon.co.uk Tue Jan 20 22:27:05 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:59:57 2004
Subject: Conformance in XML processors
In-Reply-To: <34C255B7.69F4EC@technologist.com>
References: <3.0.1.16.19980118015508.39773ba2@pop3.demon.co.uk> <3.0.1.16.19980118122247.1fff0990@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980118231824.22ffce06@pop3.demon.co.uk>

Thanks Paul - you have put it very clearly and it sounds exactly what I was after.

At 14:19 18/01/98 -0500, Paul Prescod wrote:
>Peter Murray-Rust wrote:
>> Exactly. And I hope that the community is able to develop them.
>> [I am sure all the functionality is present already in SP, but I confess that as a
>> novice to SGML I didn't find it easy to find my way around when I first
>> looked at it. Treat that as a reflection on me.]
>
>I believe that James Clark has already done most of this work in his XML
>tokenizer (which is distinct from SP).

Better and better.

>
>I think that we have different ideas about what normalized will look
>like. This is what you are thinking of:
>
>> >> >> O >> H >> H >> O1 H2 >> O1 H3 >> Doe >>
>
>This is what I am thinking of:
>
> > > >O > > >H< >/ATOM> > >H > > >O1 H2 > > >O1 H3 > > >Doe > >

I am happier with yours :-) [You seem to have newlines in some tags and not others, is this intended?]

>
>In other words, I am thinking about a subset of XML so simple that it is
>trivial to parse and so annoying that no human being would ever want to
>type it directly except for testing out their "reader". I would

Exactly. Most of the stuff I am concerned about will be generated by tools.

>explicitly disallow the magical incantation to discourage people from
>piping in ordinary XML documents (and thus from thinking that this
>reader is making any attempt to be an XML processor).
>
>> Essentially such a file is a subset of the ESIS information (no attribute
>> typing, no entities, no notation) and uses no CDATA or entity references.
>> It is my contention that there will be many people (some will be DPHs) who
>> will be quite happy to create XML files no more sophisticated than this and
>> will want *tools* to *operate on* them.
>
>Right, I don't think that these tools should be constructed except as a
>stopgap. There is no good reason that these tools should not support all
>of XML. When people write these simple XML documents and find that their
>tools will not support more, they will inevitably get confused (just as
>most people do with C++) about exactly what XML *is*.
The only reason - and it's probably not "good" - is that the effort to create or install a solution is too great for the problem at hand. And it costs money and time.

>
>I proposed a processor in Fortran that only accepts the output of a
>normalizer, but I do not think that it should be billed as an XML
>processor, any more than a Fortran program that accepts ESIS would be
>called an SGML parser. The documentation should say: "This Fortran
>program accepts the output of xmlnorm" and leave it at that. In other
>words, xmlnorm becomes an implicit component in the system.

Yes - I like this. Is your use of 'xmlnorm' fictitious, or is such a beast emerging from the current tools?

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg

From cbullard at hiwaay.net Tue Jan 20 22:37:26 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:59:57 2004
Subject: VRML and XML
References: <01bd2518$73aaad10$0100007f@pharcyde.poetsoftware.xo.com> <34C516A0.33A6@hiwaay.net>
Message-ID: <34C526D8.3084@hiwaay.net>

Another area that XML is eminently suitable for in the VRML community is in the avatar specification, Universal Avatars (distinct from H-Anim). This proposal requires a User Profile instance to be referenced by URL. Last I looked, the user profile was in HTML although the developers question the suitability of HTML for this and for good reasons since in HTML, this is primarily a set of value pairs enclosed in comments. This is an ideal application for XML.
There are lots of these small descriptive documents that are not meant to be display-rendered, but serve essentially as data lumps for some process.

XML: Lots of Little DTDs.

There are other over the wire data protocols that could use XML.

len bullard

From papresco at technologist.com Wed Jan 21 13:44:06 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:59:57 2004
Subject: Conformance in XML processors
References: <3.0.1.16.19980118015508.39773ba2@pop3.demon.co.uk> <3.0.1.16.19980118122247.1fff0990@pop3.demon.co.uk> <3.0.1.16.19980118231824.22ffce06@pop3.demon.co.uk>
Message-ID: <34C5F51E.B2C4C2AF@technologist.com>

Peter Murray-Rust wrote:
> I am happier with yours :-) [You seem to have newlines in some tags and not
> others, is this intended?]

Actually, that was just a typo. The idea should be one "concept" per line. Even attributes might go on separate lines. But you've reminded me that my normalization adds whitespace, which in XML is equivalent to adding content. Perhaps tags *would* have to be broken across lines.

> The only reason - and it's probably not "good" - is that the effort to
> create or install a solution is too great for the problem at hand. And it
> costs money and time.
That makes sense. Sometimes XML is too expensive altogether. I think that we are coming to agree that if you invent something that looks like XML, but is simpler, you should just not call it XML. Call it XML-like.

> Yes - I like this. Is your use of 'xmlnorm' fictitious, or is such a beast
> emerging from the current tools?

I wasn't thinking about any particular software when I wrote it, but it was so easy to develop that I guessed it existed somewhere. Check the message from Alex Milowski in comp.text.sgml that describes how he converts SGML to XML. Also Jade can easily be tricked into converting arbitrary SGML (and thus XML) into a standardized XML format. I'll bet it would only be a few hours' work to get Jumbo to do the same.

Paul Prescod

--
"You have the wrong number."
"Eh? Isn't that the Odeon?"
"No, this is the Great Theater of Life. Admission is free, but the
taxation is mortal. You come when you can, and leave when you must. The
show is continuous. Good-night."
-- Robertson Davies, "The Cunning Man"

From donpark at quake.net Wed Jan 21 20:51:00 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 16:59:57 2004
Subject: SAXDOM 0.1 Release Announcement
Message-ID: <000301bd26ad$8d7cdc90$2ee044c6@donpark>

Fellow XML Developers,

In the course of changing my application to use SAX (Simple API for XML), I went overboard and ended up writing a SAX to W3C DOM bridge which I thought I would share with you.
I called the bridge SAXDOM (please no jokes!;-) and it can be found at: http://www.quake.net/~donpark/saxdom.html Have fun, Don Park donpark@quake.net Come visit my XML Example Catalog at http://www.quake.net/~donpark/xmlcat.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From grk at arlut.utexas.edu Wed Jan 21 23:13:35 1998 From: grk at arlut.utexas.edu (Glenn R. Kronschnabl) Date: Mon Jun 7 16:59:57 2004 Subject: PSGML, XML, and sgml-set-face Message-ID: <199801212313.RAA15825@mail-firewall.arlut.utexas.edu> Anyone using faces with DM's XML mods to PSGML? I can get hilighting to work with .sgml files ok, but not with .xml files. If someone has it working, can they be kind of enough to e-mail or post? Thanks. Cheers, Glenn -------------------- Glenn R. Kronschnabl Applied Research Laboratories | grk@arlut.utexas.edu (PGP/MIME ok) The University of Texas at Austin | http://www.arlut.utexas.edu/~grk PO Box 8029, Austin, TX 78713-8029 | (Ph) 512.835.3642 (FAX) 512.835.3808 10,000 Burnet Road, Austin, TX 78758 | ... but an Aggie at heart! xml-dev: A list for W3C XML Developers. 

From ak117 at freenet.carleton.ca Thu Jan 22 01:02:02 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:57 2004
Subject: PSGML, XML, and sgml-set-face
In-Reply-To: <199801212313.RAA15825@mail-firewall.arlut.utexas.edu>
References: <199801212313.RAA15825@mail-firewall.arlut.utexas.edu>
Message-ID: <199801220059.TAA00370@unready.microstar.com>

Glenn R. Kronschnabl writes:

 > Anyone using faces with DM's XML mods to PSGML?
 >
 > I can get hilighting to work with .sgml files ok,
 > but not with .xml files.
 >
 > If someone has it working, can they be kind of
 > enough to e-mail or post?

Do you use the sgml-markup-faces variable? Here's how I set it:

  (setq-default sgml-markup-faces
    '((comment . sgml-comment-face)
      (doctype . sgml-doctype-face)
      (end-tag . sgml-end-tag-face)
      (entity . sgml-entity-face)
      (ignored . sgml-ignored-face)
      (ms-end . sgml-ms-end-face)
      (ms-start . sgml-ms-start-face)
      (pi . sgml-pi-face)
      (sgml . sgml-sgml-face)
      (short-ref . sgml-short-ref-face)
      (start-tag . sgml-start-tag-face)))

This works with XML as well as SGML, at least under Unix: of course, you have to create the individual faces. If anyone is interested in the complete code (including the face definitions), I'll be happy to e-mail privately.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Thu Jan 22 03:36:45 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:57 2004 Subject: PSGML, XML, and sgml-set-face In-Reply-To: <199801220248.CAA05889@nathaniel.eps.inso.com> References: <199801220059.TAA00370@unready.microstar.com> <199801220248.CAA05889@nathaniel.eps.inso.com> Message-ID: <199801220333.WAA00661@unready.microstar.com> Gavin Nicol writes: > I did something like this, but I still found that highlighting > didn't work. Finally, I had to hack psgml-parse.el in a couple of > places because it had something like (if xml-mode ....) and this > forced the logic to branch, skipping a couple of places that had > some effect on faces (type tagging I think). Thanks for the note -- I'd appreciate seeing the patches. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Thu Jan 22 05:23:43 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:59:57 2004 Subject: SAXDOM 0.1 Release Announcement In-Reply-To: <000301bd26ad$8d7cdc90$2ee044c6@donpark> Message-ID: <3.0.1.16.19980122044756.30b720cc@pop3.demon.co.uk> At 12:45 21/01/98 -0800, Don Park wrote: >Fellow XML Developers, > >In the course of changing my application to use SAX (Simple API for XML), I >went overboard and ended up writing SAX to W3C DOM bridge which I thought I >would share with you. > >I called the bridge SAXDOM (please no jokes!;-) and it can be found at: > >http://www.quake.net/~donpark/saxdom.html Thanks very much indeed Don, this sounds wonderful. [I haven't been able to have a look yet.] I have been converting Jumbo to SAX - not yet finished. Not because it's difficult, but because I have upgraded to the latest Lark and AElfred as part of it and that has caused other things that I've had to hack through the software. SAX looks wonderful to me and I congratulate David on it and the implementations that he has provided. I have two questions: - can SAX be used to work with 'sysin' input? At present (rightly) everything is in terms of java.net.URL or the equivalent SysIDs. However it is possible that we may come across chunks of 'raw' XML being emitted by tools which don't have a URL address, like (UNIX-like): ls | ls2xml | mysaxapp (where ls2xml is a fictitious tool that takes the output of ls and emits WF XML and mysaxapp is an XML application that takes XML and (say) draws a GUI representation of it. - is it still possible to process non-SAX events from Lark, AElfred, etc. 
Does one hack LarkDriver, etc? [This may be trivially obvious when I get that far...] P. > >Come visit my XML Example Catalog at >http://www.quake.net/~donpark/xmlcat.html > I will :-) Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Thu Jan 22 13:39:32 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:57 2004 Subject: SAX Requests In-Reply-To: <3.0.1.16.19980122044756.30b720cc@pop3.demon.co.uk> References: <000301bd26ad$8d7cdc90$2ee044c6@donpark> <3.0.1.16.19980122044756.30b720cc@pop3.demon.co.uk> Message-ID: <199801221336.IAA00274@unready.microstar.com> Peter Murray-Rust writes: > - can SAX be used to work with 'sysin' input? At present (rightly) > everything is in terms of java.net.URL or the equivalent SysIDs. However it > is possible that we may come across chunks of 'raw' XML being emitted by > tools which don't have a URL address, like (UNIX-like): > ls | ls2xml | mysaxapp > (where ls2xml is a fictitious tool that takes the output of ls and emits WF > XML and mysaxapp is an XML application that takes XML and (say) draws a > GUI representation of it. You could do it now in Java by setting up a custom URI protocol, but that's messy. This is near the top of my TODO list for the next couple of weeks, when I have time to return to SAX. > - is it still possible to process non-SAX events from Lark, AElfred, etc. > Does one hack LarkDriver, etc? [This may be trivially obvious when I get > that far...] 
I'd be interested in suggestions for this one -- it would have to be enabled on a driver-by-driver basis, and wouldn't be part of the SAX spec. On the same note, I should add constructors for the drivers that take an existing instance of an AElfred, MSXML, Lark, or NXP parser. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Thu Jan 22 14:23:17 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 16:59:57 2004 Subject: Stream-based SAX and other issues Message-ID: <005101bd2740$95410b20$2ee044c6@donpark> > can SAX be used to work with 'sysin' input? At present (rightly) >everything is in terms of java.net.URL or the equivalent SysIDs. However it >is possible that we may come across chunks of 'raw' XML being emitted by >tools which don't have a URL address, like (UNIX-like): > ls | ls2xml | mysaxapp >(where ls2xml is a fictitious tool that takes the output of ls and emits WF >XML and mysaxapp is an XML application that takes XML and (say) draws a >GUI representation of it. This is very interesting. Recently, I had to face the exact opposite of this issue while adding JAF (Java Activation Framework) support in my app. Basically, JAF allows mapping between MIME/File types and 'commands' so that applications can view/edit/dice just about any data given its MIME type: instantiate MIME handler and drop it into the viewer if it is a GUI component. 
The problem was that commands dealt with only streams so that commands can act upon only one stream of data and not on multi-stream data. For example, my HTML didn't know how to find images and my XML viewer couldn't read the DTDs. The solution I worked out with JavaSoft is not very appealing but practical. Data source context information (i.e. base URL) are obtained by casting up (instanceof mambo). For SAX, I think we need to model the data source so that we can do the same and provide File and URL data source implementations in SAX. BTW, W3C DOM provides DocumentContext interface for applications to retrieve data source infomation (well, there aren't any methods yet, but I hear it is on the way). > - is it still possible to process non-SAX events from Lark, AElfred, etc. >Does one hack LarkDriver, etc? [This may be trivially obvious when I get >that far...] Yes, it is possible but its quite messy. SaxDocumentParser (probably should be renamed to SaxDocumentManager), just takes a driver class name along with a URL and returns a DocumentContext object. In order for SAXDOM to handle each driver specially, I could add a 'driver-driver' (i.e. your significant other sitting on the passenger side and telling you how to drive) so that each driver is automatically mapped to a DriverHandler interface. Given that each DriverHandler is driver specific, they can do whatever they want with the underlying driver implementation. Don Park BTW, I forgot to mention in my SAXDOM announcement that SAXDOM has been tested with AElfred and MSXML SAX drivers. It works great. Great job, David! I am planning to put together a JavaScript-based demo for SAXDOM similar to MSXML demos. It will be interesting! SAXDOM documentation (the source code ;-) will be improved sometime after that. xml-dev: A list for W3C XML Developers. 
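
One way to read Don's suggestion above (model the data source, and recover context such as a base URL with an instanceof check) is sketched below in Java. This is only an illustration under assumptions: DataSource, BaseAware, UriDataSource, RawStreamDataSource and Resolver are hypothetical names, not part of SAX, JAF, SAXDOM or the W3C DOM, and java.net.URI stands in for the base URL so the example needs no network access.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

// Hypothetical types for illustration only; none of them exist in SAX, JAF or SAXDOM.

/** Minimal view of a data source: something that can hand out a byte stream. */
interface DataSource {
    InputStream open() throws IOException;
}

/** Optional capability: a source that knows where it came from. */
interface BaseAware {
    URI getBase();
}

/** A source backed by an address, so relative references can be resolved. */
class UriDataSource implements DataSource, BaseAware {
    private final URI uri;
    UriDataSource(URI uri) { this.uri = uri; }
    public InputStream open() throws IOException { return uri.toURL().openStream(); }
    public URI getBase() { return uri; }
}

/** A bare stream, e.g. the tail of "ls | ls2xml | mysaxapp": no address at all. */
class RawStreamDataSource implements DataSource {
    private final InputStream in;
    RawStreamDataSource(InputStream in) { this.in = in; }
    public InputStream open() { return in; }
}

class Resolver {
    /** Resolves a relative reference if the source carries context; otherwise returns null. */
    static URI resolve(DataSource source, String relative) {
        if (source instanceof BaseAware) {            // the "casting up" step
            return ((BaseAware) source).getBase().resolve(relative);
        }
        return null;                                  // stream-only source: nothing to resolve against
    }
}

class DataSourceDemo {
    public static void main(String[] args) {
        DataSource web = new UriDataSource(URI.create("http://example.org/docs/doc.xml"));
        DataSource piped = new RawStreamDataSource(System.in);
        System.out.println(Resolver.resolve(web, "doc.dtd"));   // http://example.org/docs/doc.dtd
        System.out.println(Resolver.resolve(piped, "doc.dtd")); // null
    }
}
```

The point of the split is that a stream-only consumer keeps working with either source, while a consumer that needs to find a DTD or an image can ask for the extra capability and degrade gracefully when it is absent.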
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Jan 22 16:39:43 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:57 2004 Subject: SAXDOM 0.1 Release Announcement Message-ID: <3.0.32.19980122083929.0094a2c0@pop.intergate.bc.ca> At 04:47 AM 22/01/98, Peter Murray-Rust wrote: > - is it still possible to process non-SAX events from Lark, AElfred, etc. >Does one hack LarkDriver, etc? [This may be trivially obvious when I get >that far...] Um, once I'm convinced that SAX is stabilized, I'll remove the need for LarkDriver by producing a Lark.java that does SAX natively; I can get away with this because I use a preprocessor. So if you want SAX + some extra that comes with your parser, this will be straightforward. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Thu Jan 22 18:03:24 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 16:59:57 2004 Subject: SAX: org.xml.sax.Parser needs getXXXHandler methods Message-ID: <002b01bd275f$6801b280$2ee044c6@donpark> I just realized that current version of org.xml.sax.Parser does not allow handler chaining. An implementation of the Parser might have some default handlers installed which will get booted when setXXXHandler methods are called. Also some post processor might want to add some handlers on top of application's own handlers. 
I would like to see the following three methods added to the API.

  public EntityHandler getEntityHandler ();
  public DocumentHandler getDocumentHandler ();
  public ErrorHandler getErrorHandler ();

In addition, I would like to have the following two methods added to the Parser API for driver-specific operations:

  public Object getDriverProperty(String name);
  public Object setDriverProperty(String name, Object value);

Property names should be prefixed with some unique values to avoid confusing other drivers. Note that the above methods can be invoked without knowing which driver is actually being used. For example:

  parser.setDriverProperty("SuperDriver.lowercaseElements", Boolean.TRUE);
  parser.setDriverProperty("HungryDriver.cacheSize", new Integer(100000));

Having fun,

Don "SAX Machine" Park
donpark@quake.net

Come visit my XML Example Catalog at
http://www.quake.net/~donpark/xmlcat.html

From ak117 at freenet.carleton.ca Thu Jan 22 18:23:33 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:57 2004
Subject: SAXDOM 0.1 Release Announcement
In-Reply-To: <3.0.32.19980122083929.0094a2c0@pop.intergate.bc.ca>
References: <3.0.32.19980122083929.0094a2c0@pop.intergate.bc.ca>
Message-ID: <199801221820.NAA00324@unready.microstar.com>

Tim Bray writes:

 > At 04:47 AM 22/01/98, Peter Murray-Rust wrote:
 > > - is it still possible to process non-SAX events from Lark, AElfred, etc.
 > >Does one hack LarkDriver, etc? [This may be trivially obvious when I get
 > >that far...]
> > Um, once I'm convinced that SAX is stabilized, I'll remove the need > for LarkDriver by producing a Lark.java that does SAX natively; I > can get away with this because I use a > preprocessor. So if you want SAX + some extra that > comes with your parser, this will be straightforward. -Tim Yes, I expect that this will be a common approach. Parser authors in Java can either extend org.xml.sax.DocumentHandler, or (better) they can add additional interfaces. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jamsden at us.ibm.com Thu Jan 22 18:49:44 1998 From: jamsden at us.ibm.com (Jim Amsden) Date: Mon Jun 7 16:59:57 2004 Subject: Content roles in XML Message-ID: <5040100013910127000002L072*@MHS> I'm very new to XML, and in developing a DTD for JAR files, JavaBeans, and Rational Rose petal files, I experienced a recurring problem. The EventSet element of the JavaBeans DTD is exemplary. Here's a fragment of the JavaBeans DTD I came up with: The content of an Event set includes two required methods, and a collection of other methods. In the DTD, there's no way that I know of to indicate the roles these methods play in the EventSet. I would like to say something like: where addListenerMethod, removeListenerMethod, and eventMethod are all Method elements. This more clearly describes the content of an EventSet and avoids using positioning only to capture the meaning of element content. I could use parameter entities to achieve this effect as in: Is this reasonable? Good XML DTD style? 
Not too much of a runtime overhead? A common practice? Note that this probably wouldn't help with the parsed XML as there would be a Method element for each method. You couldn't ask an EntitySet element for it's addListenerMethod content like you could ask it for it's isUnicast attribute. You'd have to know to get the first Method in the content. Of course an extensible parser with factory methods for constructing parse tree nodes could hide the position dependence and provide more meaningful accessors. I guess what I'm looking for is a way to capture (using UML terms) the association roles between the EventSet Class and the Method Class. There are 3 associations between these two classes, and I need a way to distinguish them. Anyone have any other ideas? Has anyone else experienced this situation? xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at light.demon.co.uk Thu Jan 22 19:27:42 1998 From: richard at light.demon.co.uk (Richard Light) Date: Mon Jun 7 16:59:57 2004 Subject: Content roles in XML In-Reply-To: <5040100013910127000002L072*@MHS> Message-ID: In message <5040100013910127000002L072*@MHS>, Jim Amsden writes >The content of an Event set includes two required methods, and a collection of >other methods. In the DTD, there's no way that I know of to indicate the roles >these methods play in the EventSet. I would like to say something like: > >eventMethod+)> > %FeatureDescriptor; > > listenerType CDATA #REQUIRED > isInDefaultEventSet (true | false) "false" > isUnicast (true | false) "false" >> > >where addListenerMethod, removeListenerMethod, and eventMethod are all Method >elements. 
This more clearly describes the content of an EventSet and avoids >using positioning only to capture the meaning of element content. I could use >parameter entities to achieve this effect as in: > > > > > >%eventMethod;+)> > %FeatureDescriptor; > > listenerType CDATA #REQUIRED > isInDefaultEventSet (true | false) "false" > isUnicast (true | false) "false" >Is this reasonable? Good XML DTD style? Not too much of a runtime overhead? A >common practice? Note that this probably wouldn't help with the parsed XML as >there would be a Method element for each method. You couldn't ask an EntitySet >element for it's addListenerMethod content like you could ask it for it's >isUnicast attribute. You'd have to know to get the first Method in the content. >Of course an extensible parser with factory methods for constructing parse tree >nodes could hide the position dependence and provide more meaningful accessors. Can't you just declare an attribute list for Method which includes a MethodRole attribute? That way, the information is still available in the parsed document. Using parameter entities in the way you suggest is really not a good idea for the reasons you outline - your intent is clear to a human reader looking at your DTD, but the subtle distinction is long gone by the time software gets to look at your instances! Richard Light. Richard Light SGML/XML and Museum Information Consultancy richard@light.demon.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elm at arbortext.com Thu Jan 22 19:32:24 1998 From: elm at arbortext.com (Eve L. 
Maler) Date: Mon Jun 7 16:59:57 2004 Subject: Content roles in XML In-Reply-To: <98Jan22.135826est.18817@thicket.arbortext.com> Message-ID: <3.0.5.32.19980122142718.00a38dd0@village.doctools.com> At 01:50 PM 1/22/98 -0500, Jim Amsden wrote: >I'm very new to XML, and in developing a DTD for JAR files, JavaBeans, and >Rational Rose petal files, I experienced a recurring problem. The EventSet >element of the JavaBeans DTD is exemplary. Here's a fragment of the JavaBeans >DTD I came up with: > > > %FeatureDescriptor; > > listenerType CDATA #REQUIRED > isInDefaultEventSet (true | false) "false" > isUnicast (true | false) "false" >> Even before I continued reading, my reaction to the Method, Method, Method+ part of your content model was that "There have got to be some additional semantics here"... >The content of an Event set includes two required methods, and a collection of >other methods. In the DTD, there's no way that I know of to indicate the roles >these methods play in the EventSet. I would like to say something like: > >eventMethod+)> > %FeatureDescriptor; > > listenerType CDATA #REQUIRED > isInDefaultEventSet (true | false) "false" > isUnicast (true | false) "false" >> > >where addListenerMethod, removeListenerMethod, and eventMethod are all Method >elements. This more clearly describes the content of an EventSet and avoids >using positioning only to capture the meaning of element content. I believe this is the ideal solution. It doesn't matter if addListenerMethod, removeListenerMethod, and eventMethod all share identical content models and even attribute lists; they are obviously still three different-enough things to get their own element types. (Note that if you just had Method with an attribute indicating which of the three types any one element is, you'd get the same processing power but not the same validation power -- that is, you couldn't use the DTD to check that at least three Methods are present.) 
>I could use >parameter entities to achieve this effect as in: > > > > > >%eventMethod;+)> > %FeatureDescriptor; > > listenerType CDATA #REQUIRED > isInDefaultEventSet (true | false) "false" > isUnicast (true | false) "false" Using parameter entities would clarify for you, the DTD writer and maintainer, what's going on. However, it doesn't expose the semantics to application software. Parameter entities can be thought of as just "macros," as far as your purpose is concerned. So having three different element types, while seemingly similar to the parameter entity solution, is radically more powerful. >Is this reasonable? Good XML DTD style? Not too much of a runtime overhead? A >common practice? Note that this probably wouldn't help with the parsed XML as >there would be a Method element for each method. You couldn't ask an EntitySet >element for it's addListenerMethod content like you could ask it for it's >isUnicast attribute. You'd have to know to get the first Method in the content. >Of course an extensible parser with factory methods for constructing parse tree >nodes could hide the position dependence and provide more meaningful accessors. > >I guess what I'm looking for is a way to capture (using UML terms) the >association roles between the EventSet Class and the Method Class. There are 3 >associations between these two classes, and I need a way to distinguish them. > >Anyone have any other ideas? Has anyone else experienced this situation? This is a common problem in SGML/XML modeling. DTD designers are often reluctant to invent new element types if the structure would be identical to element types the designer has already "bought." However, I believe this is false economy. Your application software will still have to treat the first Method, the second Method, and the third-through-nth Methods differently, so it sure smells like you've got three different things. :-) By creating unique element types, you expose the meaning to both software and humans. 
This isn't to say that it's not useful to use context to treat an element differently; deciding these points is a matter of "feel" sometimes. But, overall, I'd rather use parentage than linear-order context to determine fundamental processing of elements. Eve xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jamsden at us.ibm.com Thu Jan 22 20:06:30 1998 From: jamsden at us.ibm.com (Jim Amsden) Date: Mon Jun 7 16:59:57 2004 Subject: Content roles in XML Message-ID: <5040100013917021000002L012*@MHS> I just thought of a simple mechanism for describing the role an element plays in the content model for an element. What's needed is a way to give names for content elements (which currently only give the content type). How about something like roleName=contentElement (e.g., addListenerMethod=Method). This is similar to how attribute names define the role a value plays in describing a characteristic of an element. Default roles could be the same as the contentElement name, the same as in UML. So, this could be a simple upwardly compatible extension to XML. It's also simpler and more regular than using parameter entities. The example could then look like: Notice that I used the default for all the event methods because that's a reasonable role name. The DOM could retain the role name for the content and allow queries based on this name the same way it does for attributes. This would make the DOM much more semantically rich and would reduce the need to create a lot of customizable nodes. xml-dev: A list for W3C XML Developers. 
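
The difference between the two modelling styles discussed in this thread (a distinct element type per role, versus several Method elements told apart only by position) can be seen from the consuming side. The Java sketch below is only illustrative: the two sample instances, the name attribute and the class EventSetRoles are assumptions made for the example, not Jim's actual DTD, and it uses the standard javax.xml.parsers and org.w3c.dom APIs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class EventSetRoles {
    // Hypothetical instance where each role has its own element type.
    static final String BY_TYPE =
        "<EventSet listenerType='ActionListener'>"
      + "<addListenerMethod name='addActionListener'/>"
      + "<removeListenerMethod name='removeActionListener'/>"
      + "<eventMethod name='actionPerformed'/>"
      + "</EventSet>";

    // Hypothetical instance where the role is implied by position only.
    static final String BY_POSITION =
        "<EventSet listenerType='ActionListener'>"
      + "<Method name='addActionListener'/>"
      + "<Method name='removeActionListener'/>"
      + "<Method name='actionPerformed'/>"
      + "</EventSet>";

    /** Parses a well-formed document held in a string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("sample document failed to parse", e);
        }
    }

    /** Role carried by the element type: ask for it by name. */
    static String addListenerByType(Element eventSet) {
        Element e = (Element) eventSet.getElementsByTagName("addListenerMethod").item(0);
        return e.getAttribute("name");
    }

    /** Role carried by position: the caller must know the first Method is the add-listener. */
    static String addListenerByPosition(Element eventSet) {
        for (Node n = eventSet.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return ((Element) n).getAttribute("name");
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(addListenerByType(parse(BY_TYPE)));
        System.out.println(addListenerByPosition(parse(BY_POSITION)));
    }
}
```

Both lookups return the same method name here, but only the first one survives a reordering of the children, which is roughly the argument for exposing the role in the markup itself.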
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Thu Jan 22 21:01:06 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:57 2004 Subject: Architectural Forms (was Re: Content roles in XML) In-Reply-To: References: <5040100013910127000002L072*@MHS> Message-ID: <199801222057.PAA01543@unready.microstar.com> Richard Light writes: > Can't you just declare an attribute list for Method which includes > a MethodRole attribute? That way, the information is still > available in the parsed document. Using parameter entities in the > way you suggest is really not a good idea for the reasons you > outline - your intent is clear to a human reader looking at your > DTD, but the subtle distinction is long gone by the time software > gets to look at your instances! The opposite solution would be to allow all of the different element types, but to derive them all from the same architectural form in a base architecture (say, "java-generic"): [...] Now, during processing, you can look at the value of the "java-generic" attribute to see what generic class any element belongs to (authors don't need to specify the attribute value, because it's already provided in the DTD). Architectural forms are defined in annex A to ISO 10744: see http://www.ornl.gov/sgml/wg8/document/1957.htm for the XML-specific syntax, and http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.3.html for all the gory details. Eliot Kimber has posted the URL for a simpler tutorial, but I don't have it on hand right now. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. 
dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Fri Jan 23 01:34:11 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:59:58 2004
Subject: Netscape Navigator 5.0
Message-ID: <199801230131.UAA00320@unready.microstar.com>

Although it's always best to wait for the fine print, the announcement that Netscape will release the source code for 5.0 and will allow redistribution of modified versions is very interesting for XML developers.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From smith at interlog.com Fri Jan 23 08:35:14 1998
From: smith at interlog.com (Chris Smith)
Date: Mon Jun 7 16:59:58 2004
Subject: Announcements (was Netscape Navigator 5.0)
In-Reply-To: <199801230131.UAA00320@unready.microstar.com>
Message-ID:

On Thu, 22 Jan 1998, David Megginson wrote:

> Although it's always best to wait for the fine print, the
> announcement that Netscape will release the source code for 5.0 and
> will allow redistribution of modified versions is very interesting for
> XML developers.

As I understand it, it's now official.
But speaking of announcements - the voting period on the PR for XML expired on Jan 5, and the status was supposed to be announced within 14 days after that - Jan 19. Did I miss something?

---------------------------------------------------------------------------
Chris Smith

From murata at apsdc.ksp.fujixerox.co.jp Fri Jan 23 08:49:14 1998
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 16:59:58 2004
Subject: Announcements (was Netscape Navigator 5.0)
In-Reply-To:
Message-ID: <9801230849.AA03439@lute.apsdc.ksp.fujixerox.co.jp>

Chris Smith writes:
>But speaking of announcements - the voting period on the PR for XML
>expired on Jan 5, and the status was supposed to be announced within 14
>days after that - Jan 19. Did I miss something?

The voting period was extended to 20 Jan 1998.

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

From jcupp at essc.psu.edu Fri Jan 23 15:22:03 1998
From: jcupp at essc.psu.edu (Jason R. Cupp)
Date: Mon Jun 7 16:59:58 2004
Subject: Serving pages with LtXML
Message-ID: <34C8707D.D89AC1D2@essc.psu.edu>

Is anyone using LtXML to serve web pages?
I'm converting most everything on the website I manage into custom XML applications. I've found a neat trick to do it and was wondering if anyone had similiar experiences. When a document is requested, say "http://dork.net/faq". This actually points to "../faq/index.shtml". This file contains a server-side include that executes a script: " ]> Or, using a DTD-less document (my general preference for XML): The only thing the DOCTYPE declaration provides is the convenience of default attribute values--it doesn't affect the interpretation of the mapping. This is cool because it means you can completely avoid per-document declarations while still having the option of validating against the architectural declarations, if provided. In addition, the architectural declaration makes it clear what the governing semantic definition(s) are. Cheers, Eliot --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Mon Jan 26 05:35:15 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:59:58 2004 Subject: XML parser, test cases available Message-ID: <34CC1EE0.B4A8FF02@jclark.com> I am happy to announce the availability of a new XML parser in Java, which I'm tentatively calling XP, along with an expanded collection of test cases. More information is available at http://www.jclark.com/xml/. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From terje at in-progress.com Mon Jan 26 20:12:22 1998 From: terje at in-progress.com (terje@in-progress.com) Date: Mon Jun 7 16:59:58 2004 Subject: XML Serving Solutions Message-ID: At 11:08 AM 1/23/98, David Megginson wrote: >Jeremie Miller writes: > > > I'm wondering what everyone else thinks about this issue. When a > > server-side solution is used to dynamically modify XML content into HTML so > > existing browsers can render it appropriately, what happens to a browser > > that _can_ deal with the XML or XML + XSL? Or what happens to an > > intelligent spider that understands XML? As far as I can tell, right now > > nothing happens, they get HTML just like anyone else. But so much is lost > > and it nullifies much of the power of XML and the meta information it > > contains. 
A server can negotiate the served content to minimize loss of meta information. It doesn't yet make sense to serve XML directly, but trust me, as soon as it does you'll get a solution that maintain the highest possible encoding level. >This should not really be a problem -- the link for the rendered HTML >will be different (it will point to a CGI or servlet, usually), while >there can be a direct link to the XML if someone wants to make it >available. No, the link to a HTML page generated from XML can be exactly the same as if the page was a static file. It all depends on the flexibility of the server and CGI. For example, our Mac web server companion "Interaction" generates HTML pages on the fly from XML, but can easily be set up so that any URL of a previous site can be served by the application. -- Terje | Media Design in*Progress C a s c a d e... a comprehensive Cascading Style Sheets editor for Mac XPublish - for efficient website publishing with XML Make your Web Site a Social Place with Interaction! Check out our web tools at xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jcupp at essc.psu.edu Mon Jan 26 20:14:35 1998 From: jcupp at essc.psu.edu (Jason R. Cupp) Date: Mon Jun 7 16:59:58 2004 Subject: vCard DTD Message-ID: <34CCA9D0.70D2E0EA@essc.psu.edu> Is there an XML or SGML application to describe the vCard format? A quick scan of the usenet and web didn't turn up anything, probably because there are a lot of vCards published out there appended to posts. If not, I'm going to take a crack at it--any one want to help? 
I've a long list of people, organizations, and other contact information, and it would be nice to work within a 'standard'. The vCard XML document could be easily transformed into a "text/x-vcard" document-link or included in an HTML page. Jason Cupp -------------- next part -------------- A non-text attachment was scrubbed... Name: vcard.vcf Type: text/x-vcard Size: 354 bytes Desc: Card for Cupp, Jason Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980126/f13c04d1/vcard.vcf From larsga at ifi.uio.no Mon Jan 26 20:36:03 1998 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 16:59:58 2004 Subject: vCard DTD In-Reply-To: <34CCA9D0.70D2E0EA@essc.psu.edu> References: <34CCA9D0.70D2E0EA@essc.psu.edu> Message-ID: * Jason R. Cupp | | Is there an XML or SGML application to describe the vCard format? A | quick scan of the usenet and web didn't turn up anything, probably | because there are a lot of vCards published out there appended to | posts. vCard was originally created by the Versit Consortium, but has been turned over to the Internet Mail Consortium, which has developed it into an IETF draft. You can find more info at A quick search of the text draft did not show the string "XML" anywhere, so there may not be any such efforts afoot. An email to the consortium should set the record straight. -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock http://www.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rdaniel at lanl.gov Mon Jan 26 20:52:30 1998 From: rdaniel at lanl.gov (Ron Daniel Jr.) Date: Mon Jun 7 16:59:58 2004 Subject: vCard DTD Message-ID: <3.0.32.19980126134536.009c84e0@cic-mail.lanl.gov> At 07:20 AM 1/26/98 -0800, Jason R. Cupp wrote: >Is there an XML or SGML application to describe the vCard format? A >quick scan of the usenet and web didn't turn up anything, probably >because there are a lot of vCards published out there appended to posts. As a matter of fact, I talked with Paul Hoffman (Director of the Internet Mail Consortium, which owns the rights to vCard) about this just last week. There is no XML encoding that he knows of, but he encouraged me to define one. The only constraint is that we don't call it "vCard" since that is a trademark they don't want to dilute. He suggested calling it something like "vCard-XML". He also suggested that it be based on the IETF version of the vCard schema, which offers a couple of improvements over the 2.1 spec that is current. After the IETF issues the "vCad MIME Directory Profile" as an RFC, IMC is going to call it vCard 3.0. I took a quick look at the Internet draft he was talking about, an XML version of that info does not look too hard. However, since I'm involved with the RDF effort, I was thinking about what an RDF-compliant version of it might look like. I've gotten distracted by real work in the meantime, but I'd be happy to help with the effort. Ron Daniel Jr. 
voice:+1 505 665 0597 Advanced Computing Lab fax:+1 505 665 4939 MS B287 email:rdaniel@lanl.gov Los Alamos National Lab http://www.acl.lanl.gov/~rdaniel Los Alamos, NM, USA, 87545 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jcupp at essc.psu.edu Tue Jan 27 00:37:36 1998 From: jcupp at essc.psu.edu (Jason R. Cupp) Date: Mon Jun 7 16:59:58 2004 Subject: vCard DTD References: <3.0.32.19980126134536.009c84e0@cic-mail.lanl.gov> Message-ID: <34CCE788.F31B3166@essc.psu.edu> Ron Daniel Jr. wrote: > > There is no XML encoding that he knows of, but he encouraged me to > define one. The only constraint is that we don't call it "vCard" > since that is a trademark they don't want to dilute. He suggested > calling it something like "vCard-XML". He also suggested that it be > based on the IETF version of the vCard schema, which offers a couple > of improvements over the 2.1 spec that is current. After the IETF > issues the "vCad MIME Directory Profile" as an RFC, IMC is going to > call it vCard 3.0. > > I took a quick look at the Internet draft he was talking about, > an XML version of that info does not look too hard. I'm new to DTDs (markup seems to always come first), but how does this look for a VERY basic start? I used the vCard specification and the FGDC (Federal Geographic Data Committee "http://www.fgdc.gov") metadata standard as a model. If there are more than two people willing to pitch in, I'd be willing to host a web-page to keep everyone up-to-date. ]> -- Jason R. Cupp (jcupp@essc.psu.edu) Deasy GeoGraphics The Pennsylvania State University xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at ora.com Tue Jan 27 00:42:15 1998 From: crism at ora.com (Chris Maden) Date: Mon Jun 7 16:59:58 2004 Subject: vCard DTD In-Reply-To: <34CCE788.F31B3166@essc.psu.edu> (jcupp@essc.psu.edu) Message-ID: <199801270046.TAA08441@geode.ora.com> [Jason R. Cupp] > This content model is not legal. Any mixed content must be of the forms: (#PCDATA) (#PCDATA | element1 | element2 | ... | elementN)* I'd recommend just leaving the #PCDATA out of the name, but that will make migration from existing vcards more difficult, since they don't distinguish the fields of the name, do they? -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jcupp at essc.psu.edu Tue Jan 27 00:59:27 1998 From: jcupp at essc.psu.edu (Jason R. Cupp) Date: Mon Jun 7 16:59:58 2004 Subject: vCard DTD References: <199801270046.TAA08441@geode.ora.com> Message-ID: <34CCECBF.4BC1F7C7@essc.psu.edu> Chris Maden wrote: > > [Jason R. Cupp] > > > > This content model is not legal. Any mixed content must be of the > forms: > > (#PCDATA) > (#PCDATA | element1 | element2 | ... 
| elementN)* > > I'd recommend just leaving the #PCDATA out of the name, but that will > make migration from existing vcards more difficult, since they don't > distinguish the fields of the name, do they? Do the content models in XML and SGML differ here? I've seen the first example around somewhere. Vcards offer two items: formatted name for looks and a structured name. Replicating this in XML could look like this: Santa Clause ... ... ... ... Which would translate more directly into Vcard, but have extra information for a straight HTML,TEXT conversion. Or, SantaClause Conversion to HTML or TEXT would either ignore the tags inside or use them to generate an index. Would it be trickier to get this into Vcard? -- Jason R. Cupp (jcupp@essc.psu.edu) Deasy GeoGraphics The Pennsylvania State University
From crism at ora.com Tue Jan 27 01:08:25 1998 From: crism at ora.com (Chris Maden) Date: Mon Jun 7 16:59:58 2004 Subject: vCard DTD In-Reply-To: <34CCECBF.4BC1F7C7@essc.psu.edu> (jcupp@essc.psu.edu) Message-ID: <199801270112.UAA15449@geode.ora.com> [Jason R. Cupp] > Chris Maden wrote: > > [Jason R. Cupp] > > > > > > > This content model is not legal. Any mixed content must be of the > > forms: > > > > (#PCDATA) > > (#PCDATA | element1 | element2 | ... | elementN)* > > Do the content models in XML and SGML differ here? I've seen the > first example around somewhere. Yes. Your example is legal SGML, but not XML. It's what's called "pernicious mixed content", and though legal in SGML, it is a very bad idea.[1] XML banned it intentionally.
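The two legal forms quoted above can be sketched as a small check. This is only an illustration in Python, assuming a simplified ASCII-only notion of an XML name; `is_legal_mixed`, `NAME`, `MIXED` and the sample content models (including the `first` and `last` element names) are hypothetical and are not taken from the DTD under discussion, which did not survive in this archive.

```python
import re

# Simplified, ASCII-only stand-in for the XML Name production (an assumption;
# the real production allows many more characters).
NAME = r"[A-Za-z_:][A-Za-z0-9._:-]*"

# The two mixed-content forms quoted in the thread:
#   (#PCDATA)
#   (#PCDATA | element1 | ... | elementN)*
# The XML grammar also admits (#PCDATA)* as the N = 0 case of the second form.
MIXED = re.compile(
    rf"\(\s*#PCDATA\s*\)\*?"
    rf"|\(\s*#PCDATA(?:\s*\|\s*{NAME})+\s*\)\*"
)


def is_legal_mixed(model: str) -> bool:
    """Return True if a content model using #PCDATA has one of the legal forms."""
    return MIXED.fullmatch(model.strip()) is not None


if __name__ == "__main__":
    samples = (
        "(#PCDATA)",
        "(#PCDATA | first | last)*",
        "(#PCDATA, first, last)",    # SGML-style sequence with #PCDATA: rejected
        "(#PCDATA | first | last)",  # missing the trailing '*': rejected
    )
    for model in samples:
        print(model, is_legal_mixed(model))
```

The check is purely syntactic; it says nothing about whether the element names are declared, which a validating parser would also verify.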
> Vcards offer two items: formatted name for looks and a structured > name. Then I think I'd recommend two different kinds of names, one with content model (#PCDATA) and the other with (firstname,surname,...). Then use (struct-name|free-name) in the parent content model. -Chris [1] If you really want to know why, search comp.text.sgml on DejaNews or e-mail me. -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wendling at ganymede.isdn.uiuc.edu Tue Jan 27 02:20:10 1998 From: wendling at ganymede.isdn.uiuc.edu (Bill Wendling) Date: Mon Jun 7 16:59:58 2004 Subject: Empty Tags Message-ID: Hello, Is there a way in XML to make a tag conditionally empty? That is, if you have this declaration: if the attrset attribute of syntax has a value, could the user type it as: instead of ? || Bill Wendling wendling@ncsa.uiuc.edu xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Per-Ake.Ling at uab.ericsson.se Tue Jan 27 05:26:50 1998 From: Per-Ake.Ling at uab.ericsson.se (Per-Ake Ling) Date: Mon Jun 7 16:59:59 2004 Subject: Empty Tags Message-ID: <199801270526.GAA21774@uabs19c27.eua.ericsson.se> > From wendling@ganymede.isdn.uiuc.edu Tue Jan 27 03:20:37 1998 ...[snip] > Is there a way in XML to make a tag conditionally empty? That is, if you > have this declaration: > > > > > > if the attrset attribute of syntax has a value, could the user type it as: > > > > instead of > > > ...[snip] There is an excellent way of doing it in SGML, but no way of doing it in XML (other than having a well-formed document with no DTD). In SGML: will give you the behaviour you ask for, but this feature was one of the "bothersome" ones that where removed from XML. Our own DTDs unfortunately rely on CONREF and it will take some thought to rewrite them in a sensible manner without compromising the documents too much. CONREF is one of the features I miss most in XML. Per-Åke -- Per-Åke Ling (note: Per-Åke, transliteration Per-Ake) email: Per-Ake.Ling@uab.ericsson.se phone: +46 8 727 5674 Ericsson Utvecklings AB mobile: +46 70 790 2446 AXE Research and Development fax: +46 8 727 3463 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Tue Jan 27 09:25:38 1998 From: ht at cogsci.ed.ac.uk (Henry Thompson) Date: Mon Jun 7 16:59:59 2004 Subject: Empty Tags In-Reply-To: Bill Wendling's message of Mon, 26 Jan 1998 20:19:57 -0600 (CST) References: Message-ID: Bill Wendling writes: > Is there a way in XML to make a tag conditionally empty? That is, if you > have this declaration: > > > > > > if the attrset attribute of syntax has a value, could the user type it as: > > > > instead of > > > Both of those are valid (and I use the word carefully) XML regardless of the declaration of the attribute and almost* regardless of the content model of the element. The proposed recommendation text reads "Empty-element tags may be used for any element which has no content, whether or not it is declared using the keyword EMPTY." So you can write what you wanted, but you can't get a guarantee that you've only used the empty tag when the attribute is present. *The only case either would be invalid is when both would be invalid, e.g. because the content model for 'syntax' and some REQUIRED content. ht -- Henry S. Thompson, Human Communication Research Centre, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440 Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk URL: http://www.cogsci.ed.ac.uk/~ht/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Tue Jan 27 11:57:48 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:59 2004 Subject: SAX Java Implementation: Bug Fixes Message-ID: <199801271152.GAA00931@unready.microstar.com> I have implemented several bug fixes in the SAX Java materials at http://www.microstar.com/XML/SAX/ None of these changes affects the SAX interface itself. I have also updated the list of supported parsers and applications, and hope to add more applications soon (there are a couple just about ready for public announcements). 1) SAX Drivers - fixed bug in AElfred driver that caused driver to crash with #IMPLIED attributes - fixed attribute reporting for MSXML driver, so that all specified attributes are reported (MSXML seems not to report defaulted attributes) - fixed empty-element reporting for Lark driver 2) SAXDemo - fixed to construct better absolute URLs under Windows (thanks to James Clark for the fix) 3) org.xml.sax.HandlerBase - fixed so that derived classes can still throw exceptions from the handlers All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Arjan.Loeffen at let.ruu.nl Tue Jan 27 12:20:58 1998 From: Arjan.Loeffen at let.ruu.nl (Arjan Loeffen) Date: Mon Jun 7 16:59:59 2004 Subject: SDATA or UNICODE Message-ID: <34CBB145.BB14D0D9@let.ruu.nl> Dear reader, As far as I can see there is no character (Dutch) FLORIN SIGN or EURO SIGN in unicode (yet). Suppose I need to specify that character; in the old SGML days I simply referenced system data SDATA for euro or florin; the receiving system should be able to cope with that, and it was true data supplied by the system, i.e. could be interpreted as a character (sequence). It seems incorrect to replace it by a data entity or special element, as in both cases the result would not be interpreted as a character, but of a data object/element. In other words, how do I specify my small salary in euro's? Arjan -- Arjan Loeffen Computer & Arts, Faculty of Arts, Utrecht University Arjan.Loeffen@let.ruu.nl http://CandL.let.ruu.nl xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sm at irb.informatik.uni-dortmund.de Tue Jan 27 13:40:46 1998 From: sm at irb.informatik.uni-dortmund.de (Stefan Mintert) Date: Mon Jun 7 16:59:59 2004 Subject: Empty AttlistDecl Message-ID: <199801271340.OAA09592@brown.informatik.uni-dortmund.de> Hi! 
Production [52] of the XML spec reads AttlistDecl ::= '' A legal AttlistDecl would be What's the purpose of an empty AttlistDecl? Shouldn't production [52] read instead as follows? AttlistDecl ::= '' (+ instead of *) Why was a "Possibly empty attribute definition list" (K.4.4.1, [142], ISO 8879 TC 2, http://www.sgmlsource.com/8879rev/n1955.htm) introduced in SGML with the latest revision? Thanks in advance for any answer! Bye, Stefan. +-----------------------------------------------------------+ Stefan Mintert UniDo: mintert@irb.informatik.uni-dortmund.de private: stefan@mintert.com WWW: http://www.informatik.uni-dortmund.de/~sm/ +-----------------------------------------------------------+ "let the music keep our spirits high..." (Jackson Browne) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Tue Jan 27 13:49:57 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:59 2004 Subject: Empty AttlistDecl In-Reply-To: <199801271340.OAA09592@brown.informatik.uni-dortmund.de> References: <199801271340.OAA09592@brown.informatik.uni-dortmund.de> Message-ID: <199801271344.IAA00314@unready.microstar.com> Stefan Mintert writes: > What's the purpose of an empty AttlistDecl? > Shouldn't production [52] read instead as follows? > > AttlistDecl ::= '' > > (+ instead of *) Actually, it turns out to be remarkably useful for external DTD subsets. 
Consider this: ]]> Now, if the %security; parameter entity is set to "INCLUDE" (to enable security features), you will get If the %security; parameter entity is set to "IGNORE" (to disable security features), you will get Fortunately, this is still legal in XML. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Tue Jan 27 14:50:20 1998 From: ht at cogsci.ed.ac.uk (Henry Thompson) Date: Mon Jun 7 16:59:59 2004 Subject: SDATA or UNICODE In-Reply-To: Arjan Loeffen's message of Sun, 25 Jan 1998 22:40:22 +0100 References: <34CBB145.BB14D0D9@let.ruu.nl> Message-ID: Arjan Loeffen writes: > As far as I can see there is no character (Dutch) FLORIN SIGN or EURO > SIGN in unicode (yet). Suppose I need to specify that character; in the Misha Wolf passed on the following some time back: > There is a lot of hot air on some of the lists about this. ISO 8859-1 > will not be changed for the Euro, nor for any other reason. > The Unicode Technical Committee has decided to allocate a new character > to be called EURO SIGN to the position 20AC. ISO/IEC JTC1/SC2/WG2 is, > I believe, yet to discuss this proposal. ht -- Henry S. Thompson, Human Communication Research Centre, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440 Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk URL: http://www.cogsci.ed.ac.uk/~ht/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at ora.com Tue Jan 27 15:12:59 1998 From: crism at ora.com (Chris Maden) Date: Mon Jun 7 16:59:59 2004 Subject: SDATA or UNICODE In-Reply-To: <34CBB145.BB14D0D9@let.ruu.nl> (message from Arjan Loeffen on Sun, 25 Jan 1998 22:40:22 +0100) Message-ID: <199801271516.KAA16644@geode.ora.com> [Arjan Loeffen] > As far as I can see there is no character (Dutch) FLORIN SIGN or > EURO SIGN in unicode (yet). Suppose I need to specify that > character; in the old SGML days I simply referenced system data > SDATA for euro or florin; the receiving system should be able to > cope with that, and it was true data supplied by the system, > i.e. could be interpreted as a character (sequence). It seems > incorrect to replace it by a data entity or special element, as in > both cases the result would not be interpreted as a character, but > of a data object/element. > > In other words, how do I specify my small salary in euro's? Most listings of the Mac and Windows character sets, both of which include the florin sign, use /x01/x92 LATIN SMALL LETTER F WITH HOOK for the florin sign. As Henry Thompson has pointed out, /x20/xac has been allocated for the new Euro glyph[1]. But currency is only a small part of your worries. There are a lot of symbols, especially scientific and technical ones, that aren't represented in Unicode. I'm not particularly sanguine about this, but I've accepted a minimal XML for now. We can add SDATA or something similarly useful later. People who have characters beyond it will have to use SGML for now, and come up with some conversion mechanism (probably involving private use areas - yuck). -Chris [1] Yes, glyph. 
The EC has seen fit to mandate a glyph for the new currency, like they did for the little 'e' on beverage bottles. -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>
From mtbryan at sgml.u-net.com Tue Jan 27 15:32:02 1998 From: mtbryan at sgml.u-net.com (Martin Bryan) Date: Mon Jun 7 16:59:59 2004 Subject: SDATA or UNICODE Message-ID: <01bd2b2a$a95b5dc0$LocalHost@sgml> Arjan >As far as I can see there is no character (Dutch) FLORIN SIGN or EURO >SIGN in unicode (yet). Suppose I need to specify that character; in the >old SGML days I simply referenced system data SDATA for euro or florin; >the receiving system should be able to cope with that, and it was true >data supplied by the system, i.e. could be interpreted as a character >(sequence). It seems incorrect to replace it by a data entity or special >element, as in both cases the result would not be interpreted as a >character, but of a data object/element. > >In other words, how do I specify my small salary in euro's? The official entity name is &euro; - its definition will have the form: if running on a Microsoft system as Microsoft have assigned Hex 80 as the codepoint.
TC304 have assigned it to Hexadecimal B1 in ISO 8859, and its ISO 10646 code point is 20AC, so the following should be the formal XML reference to it: ----------------------------------------------------------------- Martin Bryan, 29 Oldbury Orchard, Churchdown, Glos GL3 2PU, UK Phone/Fax: +44 1452 714029 E-mail: mtbryan@sgml.u-net.com For more information about The SGML Centre contact http://www.sgml.u-net.com For more information about the European Commission's Open Information Interchange (OII) initiative contact http://www.echo.lu/oii/en/oiistand.html
From crism at ora.com Tue Jan 27 15:42:46 1998 From: crism at ora.com (Chris Maden) Date: Mon Jun 7 16:59:59 2004 Subject: SDATA or UNICODE In-Reply-To: <01bd2b2a$a95b5dc0$LocalHost@sgml> (mtbryan@sgml.u-net.com) Message-ID: <199801271546.KAA17049@geode.ora.com> [Martin Bryan] > The official entity name is &euro; - its definition will have the > form: > > > > if running on a Microsoft system as Microsoft have assigned Hex 80 > as the codepoint. NO NO NO NO NO NO NO NO! Martin, you should know better! XML DECLARATIONS DO NOT DEPEND ON THE PLATFORM. XML DECLARATIONS DO NOT DEPEND ON THE PLATFORM. XML DECLARATIONS DO NOT DEPEND ON THE PLATFORM. XML DECLARATIONS DO NOT DEPEND ON THE PLATFORM. XML DECLARATIONS DO NOT DEPEND ON THE PLATFORM. XML DECLARATIONS DO NOT DEPEND ON THE PLATFORM. This is *why* ISO/IEC 10646 is the document character set. The number in a numeric character reference is ALWAYS ALWAYS ALWAYS to 10646. NEVER TO THE PLATFORM. This has caused me enough headaches with bad HTML browser implementations, and I have *no* care to repeat it.
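The point that a numeric character reference always names an ISO/IEC 10646 (Unicode) code point, never a platform code, can be checked with a few lines of Python. This is a sketch using the standard-library parser, not something posted to the list; the `salary` element name and the variable names are made up for illustration, and Windows-1252 is assumed as the "Microsoft" code page in question.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hexadecimal and decimal references to the same 10646 code point (20AC).
hex_ref = ET.fromstring("<salary>&#x20AC;</salary>").text
dec_ref = ET.fromstring("<salary>&#8364;</salary>").text

# Byte 0x80 means "euro" only under a platform code page such as Windows-1252.
cp1252_euro = bytes([0x80]).decode("cp1252")

# Code point 128 (U+0080) itself is a control character, not a currency sign.
category_of_128 = unicodedata.category(chr(128))

if __name__ == "__main__":
    print(hex_ref == dec_ref, unicodedata.name(hex_ref))
    print(cp1252_euro == hex_ref, category_of_128)
```

Under these assumptions, a reference to 128 only looks like a euro where software wrongly treats the number as a Windows-1252 byte; a conforming XML processor resolves it to U+0080.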
> TC304 have assigned it to Hexadecimal B1 in ISO 8859, and its ISO > 10646 code point is 20AC, so the following should be the formal XML > reference to it: > > This is (almost) correct (needs quotes). Also correct would be (if I converted correctly). -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Tue Jan 27 16:41:07 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:59:59 2004 Subject: SDATA or UNICODE Message-ID: <3.0.32.19980127083624.00a82500@pop.intergate.bc.ca> At 02:50 PM 27/01/98 +0000, Henry Thompson wrote: >> The Unicode Technical Committee has decided to allocate a new character >> to be called EURO SIGN to the position 20AC. ISO/IEC JTC1/SC2/WG2 is, >> I believe, yet to discuss this proposal. This is in the HTML 4.0 DTD, there is an entity € which is (in decimal) what XML would call € -Tim xml-dev: A list for W3C XML Developers. 
From tbray at textuality.com Tue Jan 27 16:43:18 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:59:59 2004
Subject: SDATA or UNICODE
Message-ID: <3.0.32.19980127083813.00a82830@pop.intergate.bc.ca>

At 01:51 PM 27/01/98 -0000, Martin Bryan wrote:
>The official entity name is &euro; - its definition will have the form:
>
>
>
>if running on a Microsoft system

Not in XML it won't. The standard is crystal clear; numeric character refs are to Unicode values only. Another SGML interoperability rathole sealed up.

-T.

From mtbryan at sgml.u-net.com Tue Jan 27 16:45:32 1998
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun 7 16:59:59 2004
Subject: SDATA or UNICODE
Message-ID: <01bd2b42$e5ed1e00$LocalHost@sgml>

>> The official entity name is &euro; - its definition will have the
>> form:
>>
>>
>>
>> if running on a Microsoft system as Microsoft have assigned Hex 80
>> as the codepoint.
>
>NO NO NO NO NO NO NO NO! Martin, you should know better!
>
>XML DECLARATIONS DO NOT DEPEND ON THE PLATFORM.

Chris, show me one product that supports &#x20ac; - then try using &#128; on latest releases of Microsoft products. You'll then see why I made the distinction.

>>
>
>This is (almost) correct (needs quotes).
Also correct would be

OK, doing things too fast again. I agree XML needs to use the 10646 value once it becomes officially approved, but does anyone have the faintest when that will be?

Martin

From crism at ora.com Tue Jan 27 16:51:59 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun 7 16:59:59 2004
Subject: SDATA or UNICODE
In-Reply-To: <01bd2b42$e5ed1e00$LocalHost@sgml> (mtbryan@sgml.u-net.com)
Message-ID: <199801271655.LAA21550@geode.ora.com>

[Martin Bryan]
> Chris, show me one product that supports &#x20ac; - then try using
> &#128; on latest releases of Microsoft products. You'll then see why
> I made the distinction.

Show me one product that complies fully with the XML specification and I'll show you one that supports &#x20AC;. I fully understand that &#128; will work on platform-dependent systems like broken HTML browsers. But this is the XML developers' list, and I don't want anyone without your history of experience to get confused by your unqualified statement.

For the same reason, I wince whenever anyone mentions CONREF, SDATA, or minimization on this list. I gave a talk last April to the SGML Forum of New York on what XML wasn't, from an SGML point of view. Most of the people in the room were happy. There was one poor fellow who was new to SGML, who was terrified by how complex XML seemed. If we can keep discussions to what XML *is*, or at least carefully qualify any statements about what it isn't, and only use those for contrast, there will be much less danger of confusion for newcomers.
-Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Tue Jan 27 16:52:05 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 16:59:59 2004 Subject: Euro Symbol (was SDATA or UNICODE) Message-ID: <01bd2b43$f4d5d6e0$1e09e391@mhklaptop.bra01.icl.co.uk> Re: the Euro currency symbol: >> TC304 have assigned it to Hexadecimal B1 in ISO 8859 See below > The Unicode Technical Committee has decided to allocate a new character > to be called EURO SIGN to the position 20AC. ISO/IEC JTC1/SC2/WG2 is, > I believe, yet to discuss this proposal. I believe it is approved. The following is from http://www2.echo.lu/oii/en/oiinov97.html#Euro dated November 1997: "The Euro symbol has been assigned hexadecimal value 20AC in the ISO 10646 BMP/Unicode 2 code set. In Microsoft's 1251 Cyrillic code page it will have a hexadecimal value of 88, but in the 1250 Eastern European, 1252 Western European, 1253 Greek, 1254 Turkish and 1257 Baltic code pages it will be assigned hexadecimal value 80. A proposed new part for ISO 8859, known tentatively as Latin 0, suggests assigning 12/01 (hexadecimal B1) to the Euro symbol, replacing the plus and minus sign. (Hexadecimal 80 is a reserved postion in ISO 8859.) " Mike Kay xml-dev: A list for W3C XML Developers. 
From tbray at textuality.com Tue Jan 27 17:13:45 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:59:59 2004
Subject: SDATA or UNICODE
Message-ID: <3.0.32.19980127091130.00a8475c@pop.intergate.bc.ca>

At 04:44 PM 27/01/98 -0000, Martin Bryan wrote:
>Chris, show me one product that supports &#x20ac; - then try using
>&#128; on latest releases of Microsoft products. You'll then see why I made
>the distinction.

OK, I can bring up Navigator (I assume that IE can do this too) with a Unicode font, e.g. Cyberbit, and insert the value &#x20AC; and the correct character will display. Today. Try putting in &#128; and see what happens.

In XML, ONLY UNICODE VALUES ARE CONFORMANT - vendors will soon learn this and guess what, for the first time we'll have interoperable documents; anyone writing a display driver for naive MS platforms will soon learn that in Europe, they'd better map XML &#x20AC; to 128 for display.

Martin, I am dismayed that you of all people are counselling egregious non-conformance in this manner on this forum.

-Tim
From ricko at allette.com.au Tue Jan 27 17:44:58 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:59:59 2004
Subject: Specifying custom characters in XML (was Re: SDATA or UNICODE)
Message-ID: <199801271750.EAA28159@jawa.chilli.net.au>

> From: Tim Bray
> Martin, I am dismayed that you of all people are counselling egregious
> non-conformance in this manner on this forum. -Tim

I don't think he was...in his original post he said that the formal XML reference was &#x20AC; but that on Microsoft systems the code 128 worked. This was in reply to a question "In other words, how do I specify my small salary in euro's?" which I took to be a request for a workaround.

Maybe we also need to figure out when we need to specify characters and when we just need glyphs (pictures). I think the thing that a character has that a glyph may not have is that it is interesting for searching, indexing and collation.

If you need a "character" that is not in ISO 10646 but it will not be interesting for searching, indexing or sorting, then you really just want a glyph, and an embedded bitmap or a reference to a particular font may be fine for you, if you can get it looking OK.

If you do need to stick in an actual character, then you can make use of the user-defined codepoints available in ISO 10646. Then, as a next step, you need to provide a mechanism in your markup to map the code point to some element or entity or processing instruction. SGML has a mechanism called short-references you can use for this, but you will still have to build it into your software, since it is not part of XML.
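One way to picture the two steps just described, as a rough sketch only: take a code point from the ISO 10646 private use area and keep a table that maps it to the markup it stands for. This uses a plain lookup table in application code, not SGML short references. The table USER_DEFINED, the element name "glyph" and its "name" attribute are invented for illustration and belong to no standard.

```python
import unicodedata

# Hypothetical table: private-use code point -> markup it should expand to.
USER_DEFINED = {
    "\uE000": '<glyph name="florin-variant"/>',
}


def is_private_use(ch: str) -> bool:
    """True if the character is in a private use area (general category 'Co')."""
    return unicodedata.category(ch) == "Co"


def expand(text: str) -> str:
    """Replace each mapped private-use character with its markup."""
    pieces = []
    for ch in text:
        if is_private_use(ch) and ch in USER_DEFINED:
            pieces.append(USER_DEFINED[ch])
        else:
            pieces.append(ch)
    return "".join(pieces)


print(expand("salary: 100 \uE000"))
```

The document itself carries only the private-use character; everything a reader needs in order to interpret it lives in the table, which is why the table has to travel with the document or be agreed between sender and receiver.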
You can use PIs instead of markup declarations: instead of SGML's in XML you could just use ISO 8879 (i.e. the SGML standard) as the PI target.

So ISO 10646 gives us free code points we can use. ISO 8879 gives us the short reference mechanism to let us map a code point to any kind of markup we need. XML lets us use ISO 10646 and provides PI targets so we can use any ISO 8879 declaration (that we care to implement) in the body of the instance.

So all that is needed is to name the character and give its characteristics as far as collation etc. goes, and to point to a glyph. For this the TEI Writing System Declaration may provide some useful conventions.

Rick Jelliffe
ICQ#7587145

From paul at arbortext.com Tue Jan 27 17:55:38 1998
From: paul at arbortext.com (Paul Grosso)
Date: Mon Jun 7 16:59:59 2004
Subject: How to create URIs out of system ids
Message-ID: <98Jan27.125307est.18819@thicket.arbortext.com>

The XML 1.0 spec indicates that:

"[t]he SystemLiteral that follows the keyword SYSTEM [which] is called the entity's system identifier is a URI, which may be used to retrieve the entity." [1]

The questions I have are less issues with the spec but more issues of practical implementation, hence this posting.

I am trying to grapple with the following question(s):

Given an SGML external identifier with a possibly omitted system identifier, how would an application most appropriately generate the system id part for a valid XML ExternalID?
The basic scenario is that I'm starting with something that is not necessarily XML (perhaps it's SGML), and I'm trying to automate the process of producing XML, specifically in this case, valid XML ExternalIDs.

Here's my cut of the issues. Consider the following cases all allowed by SGML:

1.
2.
3.
4.
5.

In cases 1 and 5, we can assume that the application has some way to determine an implicit system id at least some of the time.

Note that the relevant section in the XML PR [1] goes on to say:

Unless otherwise provided by information outside the scope of this specification..., relative URIs are relative to the location of the resource within which the entity declaration occurs.

A system identifier in general could be:

a. a file pathname relative to the location of the resource within which the entity declaration occurs;
b. a file pathname relative to something else (e.g., the catalog in which the sysid was found as a result of the public id lookup);
c. an absolute file pathname on the local computer's file system;
d. a URL relative to the encapsulating entity;
e. a URL relative to some other base URL somehow specified;
f. an absolute URL;
g. empty;
h. something else (e.g., "this is garbage").

The basic question is what SystemLiteral to generate to create the most appropriate valid XML ExternalID in each case. Below I'm using the term "sysid" to refer to the system id as specified in the external id or in the catalog, and "URL" to refer to the SystemLiteral that will get put into the XML ExternalID.

For a, the relative file pathname would get converted to the equivalent relative URL just by converting the syntax; on Unix and NT, this would consist just of escaping characters not allowed in URLs whereas on DOS-based machines and Macs, etc., it is also the case that the path separator character (\ or :, etc.) would get converted to /.
Alternatively, the application could make the sysid absolute and then handle it as case c which would make the document more likely to work if it were moved elsewhere. Thoughts?

For b, either the application could try to get fancy and translate the sysid that is relative to something else into one that is relative to the containing document and then handle as case a; otherwise, it could make the sysid absolute and then handle it as case c.

For case d, there is nothing to do. Alternatively, it could make the URL absolute and then handle it as case f.

For case e, either the application could try to get fancy and translate the URL that is relative to something else into one that is relative to the containing document and then handle as case d; otherwise, it could make the URL absolute and then handle it as case f.

For case f, there is nothing to do.

For g, the application could leave it empty since that is a valid URL, though probably not what's intended. Or it could write some URL such as "http://unknown.netloc/unknown.url". Any other ideas?

For case h, the application could leave it alone and just pass on the "garbage" or it could handle it as case g. Thoughts?

For c, I'm not sure what makes the most sense. Presumably, the application could try to get fancy and, if the referenced file is in fact accessible via some http-URL, make the conversion, but this seems tricky and questionable and certainly can't work in all cases. That leaves writing out the absolute file name as a file-scheme URL. (Am I missing some other alternative?)

My reading of RFC1738 seems to indicate that, for a file path name of c:\pbg\webpages\pbghome.htm on my local machine, the file-scheme URL could be either:

file://localhost/c:/pbg/webpages/pbghome.htm

or

file:///c:/pbg/webpages/pbghome.htm

The latter works in NS3.0 and IE3.0 on my W95 machine (the former works in NS3.0 but not in IE3.0 per my experiments--I think I've heard from others that "localhost" does work now in IE4.0).
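The case c conversion for a DOS-style path can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name dos_path_to_file_url is made up, the input is assumed to be an absolute path with a drive letter, and the three-slash (empty host) form is chosen because it is the one reported above to work in both browsers.

```python
from urllib.parse import quote


def dos_path_to_file_url(path: str) -> str:
    """Convert an absolute DOS/Windows path (drive letter included) to a file URL."""
    # Syntax conversion as in case a: backslash separators become slashes.
    slashed = path.replace("\\", "/")
    # Escape characters not allowed in URLs; keep "/" and the drive colon as-is.
    escaped = quote(slashed, safe="/:")
    # An empty host part gives the three-slash form.
    return "file:///" + escaped


print(dos_path_to_file_url(r"c:\pbg\webpages\pbghome.htm"))
print(dos_path_to_file_url(r"c:\my docs\entity one.xml"))
```

For the path quoted above this yields file:///c:/pbg/webpages/pbghome.htm; a space in a path comes out as %20.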
So it sounds like what I'd do in case c is do the syntax conversion as in case a (e.g., \ to / and escape characters as necessary), then prepend "file:///" to the result. Is that reasonable?

Another angle I've heard is that user-specified sysids (cases 2-4 above) should be left untouched since that's what the user said and only sysids that the application must intuit (cases 1 and 5) should be subject to any of the massaging I've discussed in a-h above. If you subscribe to "my gun, my bullet, my foot, my health insurance", then I suppose I can see that point. If you subscribe to "do what I mean, not what I say, I'd prefer you made my life smoother despite myself because all this technical stuff shouldn't be so hard to figure out in the first place", then I can see arguments for trying to turn all sysids that aren't already absolute URIs into absolute URIs for maximal portability.

I'd be interested in hearing others' thoughts on this.

paul

[1] http://www.w3.org/TR/PR-xml#sec-external-ent
[2] http://www.w3.org/TR/PR-xml

Other sources include RFC1738 and RFC1808.

From aaron at mantiscorp.com Tue Jan 27 18:16:16 1998
From: aaron at mantiscorp.com (Aaron E. Walsh)
Date: Mon Jun 7 16:59:59 2004
Subject: Netscape Navigator 5.0
References: <199801230131.UAA00320@unready.microstar.com>
Message-ID: <34CE2431.EC58ED2D@mantiscorp.com>

David Megginson wrote:
>
> Although it's always best to wait for the fine print, the
> announcement that Netscape will release the source code for 5.0 and
> will allow redistribution of modified versions is very interesting for
> XML developers.
Hi David -- A similar post regarding NN5 was made on the main VRML list (the subject is "Imagine") to which a few responses were made. I'll paste that discussion here since it feeds off your original message: Subject: imagine .... Date: Thu, 22 Jan 1998 16:49:50 -0500 From: Michael Ware Organization: Magnet To: www-vrml@vrml.org Jaw-dropping news... imagine the possibilities for intranet applications; need to make a change to the browser? Find a JavaScript bug and want to fix it? [... complete NN5 press release cut ... ] --- Subject: Re: imagine .... Date: Fri, 23 Jan 1998 12:44:10 -0500 From: "Aaron E. Walsh" Organization: Mantis Development Corp. To: Michael Ware CC: vrml list Michael Ware wrote: > Jaw-dropping news... imagine the possibilities for intranet > applications; need to make a change to the browser? Find a JavaScript > bug and want to fix it? Jaw-dropping, indeed, with astonishing potential. If the good folks at Netscape manage to control the product release process, rather than allowing it to splinter into different flavors, it's a wonderful move. If not, I'd hate to see the end game. I can't imagine that they'll let it get away from them, though, since there's plenty of history to draw on & this is a strategic move designed to keep their product line competitive with the ever present MS contingency. I'll be keeping my eyes wide open either way :) Regards, Aaron --- Subject: Re: imagine .... Date: Fri, 23 Jan 1998 10:26:02 -1000 From: "Philip A. Bralich, Ph.D." To: aaron@mantiscorp.com, Michael Ware CC: vrml list At 07:44 AM 1/23/98 -1000, Aaron E. Walsh wrote: >Michael Ware wrote: >> Jaw-dropping news... imagine the possibilities for intranet >> applications; need to make a change to the browser? Find a JavaScript >> bug and want to fix it? > >Jaw-dropping, indeed, with astonishing potential. 
If the good folks at >Netscape manage to control the product release process, rather than >allowing it to splinter into different flavors, it's a wonderful move. It is indeed an interesting and dramatic move. To control the product release process all they need do is be sure that only the most stable most crucial applications be included with their release while others that are less widely needed or less stable can exist as independent software developers. They might even get some of their money back from offering this for free by investing in companies who are likely to make a profit but unlikely to be "crucial" enough for the main release. I am already looking forward to browsing the Netscape on-line malls for software, shareware, and free ware tailored specifically for my browser needs. As a developer I am already thinking of products I can sell to Netscape users as well as those to offer to Netscape itself. Phil Bralich Philip A. Bralich, Ph.D. President and CEO Ergo Linguistic Technologies 2800 Woodlawn Drive, Suite 175 Honolulu, HI 96822 Tel: (808)539-3920 Fax: (808)5393924 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Tue Jan 27 20:11:52 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:59:59 2004 Subject: AElfred 1998-01-27 Release (bug fixes only) Message-ID: <199801272007.PAA00904@unready.microstar.com> I have fixed two more bugs in AElfred: 1) For XML documents over 32K using the DOS CR/LF convention, AElfred would crash if a CR appeared exactly on a 32K boundary (!!!) 
2) Conditional sections were not being parsed correctly (I must have accidentally deleted a couple of lines in an earlier revision).

There are no interface changes. You can download the newest release from

http://www.microstar.com/XML/

Thank you to everyone for the very careful and detailed bug reports.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From norbert at datachannel.com Wed Jan 28 03:04:55 1998
From: norbert at datachannel.com (Norbert Mikula)
Date: Mon Jun 7 17:00:00 2004
Subject: Microsoft, ArborText, DataChannel and Inso Submit XML-Data Specification to W3C
Message-ID: <037c01bd2b99$5b9c9120$9954ccd0@norbert.DataChannel.com>

From mtbryan at sgml.u-net.com Wed Jan 28 08:23:12 1998
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun 7 17:00:00 2004
Subject: SDATA or UNICODE
Message-ID: <01bd2bbf$259f80c0$LocalHost@sgml>

Tim

>Martin, I am dismayed that you of all people are counselling egregious
>non-conformance in this manner on this forum. -Tim

You're right, I was so tired last night I wasn't thinking straight, just copying the reports I did for OII from TC304 without thinking of the consequences.

Sorry:-(

Martin
From mtbryan at sgml.u-net.com Wed Jan 28 08:23:32 1998
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun 7 17:00:00 2004
Subject: SDATA or UNICODE
Message-ID: <01bd2bbe$7435b7a0$LocalHost@sgml>

>Show me one product that complies fully with the XML specification and
>I'll show you one that supports &#x20AC;. I fully understand that
>&#128; will work on platform-dependent systems like broken HTML
>browsers. But this is the XML developers' list, and I don't want
>anyone without your history of experience to get confused by your
>unqualified statement.

My statement was very much qualified, hence your justifiable complaint about my suggesting platform dependent solutions.

>
>For the same reason, I wince whenever anyone mentions CONREF, SDATA,
>or minimization on this list. I gave a talk last April to the SGML
>Forum of New York on what XML wasn't, from an SGML point of view.
>Most of the people in the room were happy. There was one poor fellow
>who was new to SGML, who was terrified by how complex XML seemed. If
>we can keep discussions to what XML *is*, or at least carefully
>qualify any statements about what it isn't, and only use those for
>contrast, there will be much less danger of confusion for newcomers.

Agreed, I apologise for muddying the waters unnecessarily. Let's stick to even if this hasn't yet been formally approved by the relevant ISO committee as the code point for the Euro (developers be warned)

Martin Bryan
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Arjan.Loeffen at let.ruu.nl Wed Jan 28 11:51:47 1998 From: Arjan.Loeffen at let.ruu.nl (Arjan Loeffen) Date: Mon Jun 7 17:00:00 2004 Subject: SDATA or UNICODE References: <01bd2b2a$a95b5dc0$LocalHost@sgml> Message-ID: <34CF1C74.C267A0E8@let.ruu.nl> I'd like to thank the readers for their replies. I'll try to formulate my thoughts on this issue. The original EURO/FLORIN question raised an interesting discussion -to me, anyway- on the status of doing XML 'standardization' work under W3C. In general, for any new model to be defined, we are forced to build the specs (at least in part) in a bottom-up fashion. Notably, for XML we must assume there is a solid way of defining characters, and this must be supported by our software tools. That way we can focus on information structuring and interchange rather than the technical issues that *underly* (but are not essential to) any project for building software. I'd say character specification is an underlying feature. If we are not certain that we have a *complete* way of defining and referencing characters (technically, but also socially as is the case for UNICODE assignment of characters by their external form and use), and if we still want to work on XML, we must find a 'way out' for those cases where standard character description fails. *If* there is no character known as, say, FLORIN, in UNICODE, we must have a way out and be able to represent (reference) that character. Chris Maden: "There are a lot of symbols, especially scientific and technical ones, that aren't represented in Unicode." 
Martin Bryan's remark: "I agree XML needs to use the 10646 value once it becomes officially approved, but does anyone have the faintest when that will be?" seems somewhat beside the point; I'd say we can be sure UNICODE will never be able to represent *all* characters (I may create a new character every morning at breakfast). When we *know* that there will be characters such as FLORIN in some knowledge domain/language, and if we assume there will *always* be such characters that simply have not been recorded before and/or officially, we must at the very offset of XML standardisation support an escape route, an alternative. That's SDATA in SGML. If we all agree that we can never place all distinguishable characters in one world-wide standard (such as UNICODE), we must introduce the escape route, and not postpone that to a 'future version' of XML. So, in my opinion, following the replies to my earlier question, I think XML should start to support SDATA (as in SGML) and a way of allowing a system to determine what character the replacement text represents (which is 'covered' by public character sets in SGML). PS. Note that SDATA is not a character referencing format: it is a form of allowing the system to insert *any* data when and where the SDATA entity is referenced. Notably, this could be a complete character sequence, or even different data for each reference to the same SDATA entity. Thanks again for the replies. Arjan xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Wed Jan 28 12:21:15 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:00:00 2004 Subject: SDATA or UNICODE Message-ID: <199801281228.XAA00154@jawa.chilli.net.au> > From: Arjan Loeffen > So, in my opinion, following the replies to my earlier question, I think XML > should start to support SDATA (as in SGML) and a way of allowing a system to > determine what character the replacement text represents (which is 'covered' by > public character sets in SGML). Does anyone know what cunning plans the W3C font service people are up to? >From my glance they seem to be looking for a way to distribute (= sell?) fonts over the web, while we are looking at how to get a single glyph. Anything XML does should fit in with them, I suppose. Gavin Nicol started a group on this issue a little while back, but I think it was derailed by more immediate issues. Certainly I think a PI system that "shortrefs" user-defined characters to Xpointers that index into a font held remotely would be the best way. Anyone like that idea? Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Jan 28 12:26:46 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:00 2004 Subject: Conditional actions in XSL? 
Message-ID: <3.0.1.16.19980128075314.31f79096@pop3.demon.co.uk> [Forwarded to the list from Dr. Zheng Min] >> >>>I played with XSL and MSXSL in the last few days. One thing I'm trying to >>do >>>is to drop all the elements with date attribute < a specified date. >>However, >>>with my understanding of XSL, I couldn't figure out a good way to do this. >>> >>>The normal ways, I can think of, to do this are either: >>> >>>1. XSL supports conditional actions, such as: >>> >>> if condition >>> >>> else >>> do nothing >>> end >>>(I don't believe XSL has such a feature, unless I overlooked something. >The >>>only thing I see similar is the attribute rules but it can only be >>>'=Value'), or >>> >>>2. in the scripts, one can write: >>> function filter (e) { >>> if calc(attribute) then >>> do something >>> >>> else >>> do something else >>> end >>> and in XSL body, one can use the following rules to filter output >>>without skipping (or including all) its descendants: >>> >>> filter(this) >>> >>>There is a "hacking" way I can think of, but unfortunately, I can't test >it >>>because MSXSL doesn't seem to support 'mode' yet. Anyways, the 'hacking' >>I'm >>>trying to do it to use something like: >>> function check(e) { >>> ...... >>> } /* return true or false */ >>> ....... >>> >>> >>> >>> >>> >>> >>> ......... >>> >>> >>> >>> ...... >>> >>> >>>I'm not sure if this works or not (it doesn't work with current MSXSL, at >>>least). >>> >>>Can any knowledgeable people help me clarify the followings: >>> >>>1. Does XSL supports either the above 2 scenarios (i.e. condition in >action >>>rules or processing descendants in scripts). I may overlook something. >>> >>>2. If neither of above is available in XSL, what are the good work around? >>>Will my 'hacking' work? >>> >>>Conditional action is a common feature. Very often an SGML/XML application >>>needs to filter certain elements based on calculation (of attribute >>values). 
>>>IF XSL can't support this, I would say something essential is missing from >>>XSL. >>> >>>Any comments? >>> >>>Thanks, >>>Min >>> >> >> >> > > Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murata at apsdc.ksp.fujixerox.co.jp Wed Jan 28 14:06:58 1998 From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto) Date: Mon Jun 7 17:00:00 2004 Subject: SDATA or UNICODE In-Reply-To: <199801281228.XAA00154@jawa.chilli.net.au> Message-ID: <199801281405.AA00008@murata.apsdc.ksp.fujixerox.co.jp> Rick Jelliffe writes: >Does anyone know what cunning plans the W3C font service people are up to? >>From my glance they seem to be looking for a way to distribute (= sell?) >fonts over the web, while we are looking at how to get a single glyph. >Anything XML does should fit in with them, I suppose. > >Gavin Nicol started a group on this issue a little while back, but I think >it was derailed by more immediate issues. > >Certainly I think a PI system that "shortrefs" user-defined characters >to Xpointers that index into a font held remotely would be the best way. >Anyone like that idea? If you create a gaiji (non-standard "character") and separate the declaration data of your gaiji from your document, your document will eventually become undisplayable. The declaration data is very likely to be lost. In a Japanese project called Tron, the declaration data is directly embedded within the document that references to it. Probably, this is the safest approach. 
Your document is guaranteed to be displayable. Moreover, this approach does not lead to serious disputes. If some mechanism for registering "characters" is made publicly available, I am sure that some character-policemen will strongly oppose it.

In the context of XML, we probably need a mechanism to reference some portion of an external parsed entity, which provides a collection of fonts. This is very close to what you proposed, if not identical. I do not know if there is a good match between this approach and WebFont of W3C.

Cheers,
Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

From ht at cogsci.ed.ac.uk Wed Jan 28 14:12:04 1998
From: ht at cogsci.ed.ac.uk (Henry Thompson)
Date: Mon Jun 7 17:00:00 2004
Subject: Conditional actions in XSL?
In-Reply-To: Peter Murray-Rust's message of Wed, 28 Jan 1998 07:53:14
References: <3.0.1.16.19980128075314.31f79096@pop3.demon.co.uk>
Message-ID:

How to filter children by processing in XSL: It's not as clean as we would like as things stand. The following (untested) is messy but should work with XSLJ:

// Elements not dated after this threshold are dropped.
var thresholdDate="1/1/94";

// Predicate: keep a node only if its entry date is later than the threshold.
function myFilterPred(node) {
  return dateGreater(node.entrydate,thresholdDate);
};

function dateGreater(d1,d2) { . . . }

. . . some pattern . . .

// Action: process the children of accepted nodes; emit nothing otherwise.
if (myFilterPred(this)) {
  return withMode(OK,processNodeList(this));
} else {
  return ;
};

Message-ID:

In message <3.0.1.16.19980128075314.31f79096@pop3.demon.co.uk>, Peter Murray-Rust writes
>>>>Can any knowledgeable people help me clarify the followings:
>>>>
>>>>1. Does XSL supports either the above 2 scenarios (i.e.
condition in >>action >>>>rules or processing descendants in scripts). I may overlook something. >>>> >>>>2. If neither of above is available in XSL, what are the good work around? >>>>Will my 'hacking' work? I don't know if a knowledgeable person can help, but meanwhile I'll have a go ... The conditions for XSL are set by the structure that you specify, e.g.: defines the target elements for a rule as being those elements which are children of an "object" element, and which have a value set for their "REND" attribute. If the condition matches for an element, the rule fires. This is a pretty powerful mechanism, although it doesn't do the 'else' case you mention - that has to be another rule (perhaps a more general rule - only the most 'specific' rule (3.2.6) applies to each element). I'm not sure about the ability to invoke a bit of script when testing the attribute value. The XSL spec allows you to invoke ECMAscript when _setting_ attribute values on (output) flow objects - there is no reason at all why it couldn't allow the same feature when _testing_ attribute values on source elements. (Obviously in this case the script would have to return true/false rather than a string.) When trying out a bit of code to show how easy this all is, I couldn't get MSXSL to take any notice of an statement inside a as per the above example. But I think it should have done! One point is that the spec uses both 'name="XXX"' and 'type="XXX"' to qualify in its examples in 3.2.3. I think, though, that the 'type="XXX"' is an error. (Neither worked with MSXSL.) Any more knowledgeable people out there? Richard. Richard Light SGML/XML and Museum Information Consultancy richard@light.demon.co.uk xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From murata at apsdc.ksp.fujixerox.co.jp Wed Jan 28 15:08:41 1998
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:00:00 2004
Subject: SDATA or UNICODE
In-Reply-To: <199801281445.OAA10471@nathaniel.eps.inso.com>
Message-ID: <199801281508.AA00010@murata.apsdc.ksp.fujixerox.co.jp>

In message "Re: SDATA or UNICODE", Gavin Nicol wrote...
> I was hoping to create a repository of characters and glyphs, with
> mapping between them so that we could guarantee interoperability.

Characters are hard. The committee for JIS X0208:1997 spent an enormous amount of time on each character in JIS X0208. They apparently believe that the registration of new "characters" for different appearances is harmful.

[Thu, 29 Jan 1998 00:01:51 +0900]

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

From ht at cogsci.ed.ac.uk Wed Jan 28 15:11:28 1998
From: ht at cogsci.ed.ac.uk (Henry Thompson)
Date: Mon Jun 7 17:00:00 2004
Subject: Conditional actions in XSL?
In-Reply-To: Richard Light's message of Wed, 28 Jan 1998 14:27:18 +0000
References:
Message-ID:

Richard Light writes:
> The conditions for XSL are set by the structure that you specify, e.g.:
> > > > > > > > . . .
> > I'm not sure about the ability to invoke a bit of script when testing > the attribute value. The XSL spec allows you to invoke ECMAscript when > _setting_ attribute values on (output) flow objects - there is no reason > at all why it couldn't allow the same feature when _testing_ attribute > values on source elements. (Obviously in this case the script would > have to return true/false rather than a string.) You're right, in XSL as currently proposed you can't do that, but it is consistent with the existing (but unimplemented) part of the DSSSL model of construction rules via the 'query' construction rule, so not out of the question in principle. > When trying out a bit of code to show how easy this all is, I couldn't > get MSXSL to take any notice of an statement inside a > as per the above example. But I think it should have > done! > > One point is that the spec uses both 'name="XXX"' and 'type="XXX"' to > qualify in its examples in 3.2.3. I think, though, that the > 'type="XXX"' is an error. (Neither worked with MSXSL.) 'name=...' is correct for the 'attribute' element type. I haven't tried MSXSL much, but their examples show patterns which use it, so I'm surprised, because I can't make it work either. ht -- Henry S. Thompson, Human Communication Research Centre, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440 Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk URL: http://www.cogsci.ed.ac.uk/~ht/ xml-dev: A list for W3C XML Developers. 
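For readers following the "Conditional actions in XSL?" thread, the predicate-based filtering discussed above can be sketched outside XSL in plain JavaScript (ECMAScript). This is only an illustration under stated assumptions, not XSL, XSLJ, or MSXSL code: elements are modelled as plain objects, and parseDate, filterTree, keepRecent and the sample doc are hypothetical names introduced here. The names dateGreater, thresholdDate and entrydate come from Henry Thompson's fragment, but the body given for dateGreater is a guess at its intent (month/day/two-digit-year dates, with years assumed to be 19xx).

```javascript
// Hypothetical sketch: plain objects stand in for source elements.
// None of these helpers come from the XSL draft, XSLJ, or MSXSL.

// Parse "m/d/yy" into a Date. ASSUMPTION: month comes first and
// two-digit years mean 19xx; the thread does not say either way.
function parseDate(text) {
  const [month, day, year] = text.split("/").map(Number);
  const fullYear = year < 100 ? 1900 + year : year;
  return new Date(Date.UTC(fullYear, month - 1, day));
}

// True when d1 is strictly later than d2.
function dateGreater(d1, d2) {
  return parseDate(d1).getTime() > parseDate(d2).getTime();
}

// Return a filtered copy of the tree. A node that fails the predicate
// is dropped together with its descendants; the input is not modified.
function filterTree(node, keep) {
  if (!keep(node)) {
    return null;
  }
  const children = (node.children || [])
    .map((child) => filterTree(child, keep))
    .filter((child) => child !== null);
  return { ...node, children };
}

const thresholdDate = "1/1/94";

// Nodes without an entrydate (such as the container) are kept.
const keepRecent = (node) =>
  node.entrydate === undefined || dateGreater(node.entrydate, thresholdDate);

const doc = {
  name: "log",
  children: [
    { name: "entry", entrydate: "6/15/93", children: [] },
    { name: "entry", entrydate: "3/2/95", children: [] },
  ],
};

const filtered = filterTree(doc, keepRecent);
console.log(filtered.children.map((child) => child.entrydate));
```

As in the fragment quoted in the thread, a rejected element takes its descendants with it; keeping the children of a rejected element would need a different filterTree that splices them into the parent instead of returning null.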
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From pierlou at CAM.ORG Wed Jan 28 15:21:58 1998
From: pierlou at CAM.ORG (Pierre Morel)
Date: Mon Jun 7 17:00:00 2004
Subject: XML as a Programming Language
Message-ID: <34CF4BB3.70C112AC@cam.org>

I am happy to share my work based on XML as a programming language. For now, the model can support events attached to almost any element.

If you are interested in this kind of development, there is more info available at http://www.cam.org/~pierlou/prototype/

Pierre Morel

From peter at ursus.demon.co.uk Wed Jan 28 15:52:32 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:00 2004
Subject: JUMBO9801 alpha release
Message-ID: <3.0.1.16.19980128130656.2ecfa172@pop3.demon.co.uk>

An alpha release of JUMBO9801-core is available; details are in the copy of index.html reproduced below. JUMBO is SAX-compliant and I would be particularly interested in your experiences in installing it in that manner. I have deliberately not included any SAX material or other parser-writer's material [1]. Documentation is sparse, but being produced (in XML, of course) to be fully integrated with the menu system.
One major change is the componentisation of JUMBO - this should aid in downloading (which used to be a problem) - the tgz is now 350 Kbytes which I hope is manageable. I would - as always - be very grateful to anyone who wishes to mirror JUMBO resources.

Please note that JUMBO is NOT a specialist browser for Chemical Markup Language [I'd be grateful for changes in current pointers which imply this :-)]. It is a generic element-based XML tool, and will make a readable, if not beautiful, job of rendering Lark.xml, Julius Caesar, etc. It can be extended via JUMBO-FOOML to support FOOML. As one example of FOOML, JUMBO-CML will be released in a few days (see parallel message) which DOES provide specialist support for Chemical Markup Language :-)

I'd be extremely grateful for feedback - this is "alpha". There is a list of known deficiencies in the distribution. The important thing at this stage is not that it runs well, but that it runs at all... For example I know that refreshing graphics on a Sirius Cybernetics machine is *extremely* slow

P.

-------------------------------JUMBO9801a index.html-----------------------

Jumbo9801a

This document describes the alpha "snapshot" (i.e. release) of JUMBO in Jan 1998.

Description

JUMBO is an element-oriented system for processing XML documents. It can read and parse (with/without additional parsers, with/without the SAX interface). It creates a tree of elements and attributes with various types of content. It also supports processing instructions (PIs) in a generic manner. There is support for namespaces and XSL stylesheets, though JUMBO does not have sophisticated rendering. It has a browsing model based on a tree/TOC model, event streams or customised element display. It supports (SIMPLE) XLL navigation including NEW and REPLACE and most Xpointer syntax. It extends the latter to provide sophisticated search and navigation tools for the document.
JUMBO also provides authoring and editing facilities, driven by DTD information where possible. These can be customised to provide novel types of data input other than text.

JUMBO is designed to be extended, especially through subclassing of elements, and I hope that a collaborative community (cf. tcl/tk, LaTeX, Linux) will develop for its future support. Offers are very welcome here.

Main Features

JUMBO is 100% pure Java (1.02) and runs as an applet or application.
JUMBO does not knowingly deviate from the X*L specs, apart from known limitations.
JUMBO has an elementary XML parser, sufficient for its own configuration files.
JUMBO has been developed to be used with the SAX API so that any SAX-J-compliant parser (1998-01-28: AElfred, Lark, MSXML, NXP, (XP not yet done)) can be used at runtime. Parsers can be selected in the commandline or through menus.
The parse result is treated as a tree and displayed on a tableOfContents (TOC). This allows access to all main components (elements, attributes, content, PIs).
Components are rendered as: subtrees/TOCs; event streams (text, tagged text and others); and individual objects. Fonts can be selected.
JUMBO menus are driven by (internal) XML documents which include HTML-based help on a per-item basis.
The JUMBO GUI has several components allowing assessment of the document and its processing including error announcement (Draconian).
Xpointers (XLL) are implemented for: linking into subcomponents of XML documents; searching XML documents; internal management of XML documents (e.g. menus, stylesheets, namespaces).
User-based searching is through an interface which allows boolean combination of strict XPointer addressing. Hits are highlighted in the TOC. X*L-specific tools (Find IDs, NAMEs) are included.
XLL is implemented for SIMPLE. NEW and REPLACE are implemented; EMBED is on a per-application basis. AUTO and USER are implemented. (JUMBO extensions can link into non-XML documents).
New XML (and non-XML files) can be read into JUMBO under menu control. The current tree (possibly modified) can be saved as XML. Window components can be saved as GIFs There are a variety of options for browsing elements, attributes, PCDATA and whitespace. Two display options (TOC and TOC+object) can be chosen - more will follow. Objects can be displayed in individual Frames. JUMBO allows import of non-XML documents by setting MIME types and requiring per-MIME conversion code. The conversion is done on-the-fly. JUMBO supports some non-SAX information on a per-parser basis. This includes DTD components such as ATTLISTs and ELEMENT contentDeclarations JUMBO can be used for editing existing documents, sometimes with primitive DTD or schema-based control. It can also be used for creating new documents. JUMBO has an experimental approach towards namespaces Stylesheets: JUMBO is tracking the public XSL spec and can read XSL documents JUMBO is easily extended to provide support for Java-based applications on a package/namespace basis. The following are currently available: jumbo.sgml.html (HTML V2.0). This supports well-formed HTML at about V2.0 level (but no tables or forms). Rendering is readable but not optimised for performance or beauty. JUMBO-HTML is included in the alpha distribution. jumbo.tecml (Technical Markup Language). This is aimed at technical and scientific applications provides strong data typing (FLOAT, DATE, etc.) with UNITS and structuring (ARRAY and LIST). Some commonly used data types are also included: BIB, PERSON, FIGURE, etc. NOT included in alpha distribution jumbo.cml (Chemical Markup Language). This provides support for molecular applications. NOT included in alpha distribution jumbo.chemime (Chemical MIME). Classes to convert non-XML files (chemical/x-*) into XML trees on the fly. NOT included in alpha distribution jumbo.vhg (Virtual HyperGlossary). Support for XML-based terminology including ISO12620 terms. 
NOT included in alpha distribution Installation JUMBO9801a is available at http://www.vsms.nottingham.ac.uk/vsms/java/jumbo/jan9801. Details of installation are available; it will be useful to install one or more SAX-compliant (http://www.microstar.com/xml/sax) parsers. Copyright, Collaboration, Source, Warranty JUMBO is copyright Peter Murray-Rust. It is available without fee, but may not be redistributed or used for commercial purposes or teaching without permission. It is my intention that JUMBO is freely available for personal use by individuals and for personal use within organisations at present. Class libraries will be available on the WWW. I hope to develop a LaTeX/tcl-like club of collaborators and the precise nature of future copyright will depend on that; I would like to relax the restrictions above. I am reluctant at present to make source freely available except to collaborators since (through experience) I fear the distribution of mutants and the misappropriation of authorship. Constructive suggestions would be welcomed here. If a stable core can be communally developed (like tcl) I would feel more relaxed. So, if you are seriously interested in helping give me a mail with details. No guarantee is made of JUMBO's fitness for any purpose and the author is not responsible for any damage caused by whatever means. Copyright Peter Murray-Rust, 1996, 1997, 1998 [1] I have included Lark.xml and saxpec.html in the distribution - I hope the authors don't mind :-) Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Jan 28 16:16:25 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:00 2004 Subject: XML and the launch of Chemical Markup Language Message-ID: <3.0.1.16.19980128150718.36a792b8@pop3.demon.co.uk> I have been invited to give a virtual lecture by VEI Ltd and Chemweb Ltd and I have taken the opportunity to "launch" Chemical Markup Language and also to promote the use of XML. Details are at: http://chemweb.vei.co.uk [Time: 1998-02-04:16:00 UTC. Free registration. Technology required: any HTML browser, preferably with ECMAScript (= JavaScript). No real-time audio/video required; all text-based like MOOs] The first lecture in this series was given by Henry Rzepa and attracted several molecular scientists worldwide. I have taken this as a timely opportunity to promote XML, given that (hopefully) there should be a positive public announcement at about that time :-) The format of Henry's lecture was that he presented a number of slides and "spoke" to them. A number of panellists were present to comment. The session then developed to communal discussion on the issues raised, the whole proceedings taking about 2 hours. I have decided to present more slides, but would be grateful for authoritative support on XML. In general the message I want to promote is: - XML is going to happen in a big way - XML is the only reasonable way for documents/data over the WWW (OK, I also believe in CORBA but it's not so amenable) - other disciplines are making serious headway with XML; molecular science needs to catch up. 
- There will be lots of XML tools very shortly - XML can do things beyond our current imagining. (For example the virtual lecture could be given in XML). If any seasoned XMLers would like to 'come' then I would be grateful for contributions. Unless you are a pan-dimensional hyperbeing it probably takes most of your attention for that period, though some people can multitask. Outwards communication is through scrolling text; inwards is through HTML-based forms. There is a "room" for invitees who wish to make comments and I would be grateful for volunteers :-) Whatever else it's fun, and it's a step towards the future. I shall be releasing the "current" CML DTD, which is now fairly lean and clean and intentionally flexible. Also I plan to have snapshots of JUMBO and JUMBO-CML in action although I do NOT intend to run these as live applets :-) I have taken the XML-WG's goals as a starting point for CML's goals and I'd also value comments on them when they are released. If you cannot 'come' to the lecture, the transcript and slides will be available and there will be an ongoing virtual discussion P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Wed Jan 28 16:19:06 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:00 2004 Subject: SAX: Status, Parsers, and Clients Message-ID: <199801281614.LAA01371@unready.microstar.com> I've just added JUMBO to the list of clients supporting SAX. 
In Java, there are now five XML parsers with SAX support available and four publicly announced SAX clients (that makes twenty possible client-parser combinations, according to my arithmetic). For details, see

http://www.microstar.com/XML/SAX/applications.html

While I'm thrilled with the response, I also want to remind everyone that the SAX interface is not yet finalised, and all drivers and clients will need to be modified (hopefully only slightly) to support any changes that we agree on for the final interface. I hope to be through my crunch in a week or two, then I'll summarise all of the changes that have been proposed to start off the next discussion.

I've also been doing more work trying to fix the MSXML SAX driver further -- it's been by far the most difficult to implement (see my comments in the MSXMLDriver.java source file for details), and the one I released yesterday still has problems. I'll release my newest version in a day or two.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From murata at apsdc.ksp.fujixerox.co.jp Wed Jan 28 16:26:36 1998
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:00:00 2004
Subject: SDATA or UNICODE
In-Reply-To: <199801281612.QAA10546@nathaniel.eps.inso.com>
Message-ID: <199801281626.AA00011@murata.apsdc.ksp.fujixerox.co.jp>

In message "Re: SDATA or UNICODE", Gavin Nicol wrote...
> As such, you need some
> mechanism allowing people to use "non-standard" characters,
> such as Gaiji, mathematic symbols etc.
and to allow systems to
> process them in an interoperable manner.

Probably, it is safe to say that such mechanisms are not for characters and that they merely handle data for providing some physical appearance. (I am trying to avoid words such as glyph, font, zikkei, etc. They are dangerous here.)

Cheers,
Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

From tslee at sybase.com Wed Jan 28 19:10:54 1998
From: tslee at sybase.com (Tom Slee)
Date: Mon Jun 7 17:00:00 2004
Subject: Resource Description Format and XML-Data
Message-ID: <8525659A.00658BD5.00@gwwest.sybase.com>

I am confused about the intent and overlap of the XML-Data and RDF initiatives. It seems to me that these tackle very similar problems (that is, using XML to deliver structured data rather than to deliver documents), in a similar manner, but that they are independent initiatives and contradict each other.

Could somebody please clarify the role these two initiatives play? Are they complementary or are they competing initiatives?

Tom Slee
tslee@sybase.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Jan 28 19:20:32 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:00 2004 Subject: Resource Description Format and XML-Data Message-ID: <3.0.32.19980128112134.00aaef68@pop.intergate.bc.ca> At 01:36 PM 28/01/98 -0500, Tom Slee wrote: >I am confused about the intent and overlap of the XML-Data and RDF >initiatives. It seems to me that these tackle very similar problems (that >is, using XML to deliver structured data rather than to deliver documents), >in an similar manner, but that they are independent initiatives and >contradict each other. I think this perception is justified, and share in the confusion. Note that XML-Data contains a DTD replacement facility which seems oblivious to the existence of the RDF and, most surprisingly, the RDF Schema groups in the W3C Metadata activity. This is most surprising since the group which authored XML-Data is heavily represented in both these efforts. Cheers, Tim Bray tbray@textuality.com http://www.textuality.com/ +1-604-708-9592 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From serres-doug at usa.net Wed Jan 28 19:58:13 1998 From: serres-doug at usa.net (Doug Serres) Date: Mon Jun 7 17:00:00 2004 Subject: XML and the launch of Chemical Markup Language References: <3.0.1.16.19980128150718.36a792b8@pop3.demon.co.uk> Message-ID: <34CF8DE2.F594AE77@usa.net> Peter Murray-Rust wrote: > - XML is the only reasonable way for documents/data over the WWW (OK, I > also believe in CORBA but it's not so amenable) We at Andyne are using CORBA as the intercommunication protocol and XML to define the data! Peter Murray-Rust wrote: -- Doug Serres Junior Developer - R&D Andyne Computing Ltd. e-mail: dserres@andyne.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Jan 28 20:41:24 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:00 2004 Subject: First try with Jumbo In-Reply-To: <199801281732.MAA03693@unready.microstar.com> Message-ID: <3.0.1.16.19980128202649.4ba7dadc@pop3.demon.co.uk> At 12:32 28/01/98 -0500, [a correspondent] wrote [privately] : >Peter: > >I've just tried Jumbo (9801) on my system (Linux 2.0.x and JDK 1.1.3), >with the following command line from the installation directory: > > java jumbo.sgml.Jumbo scene1.sgm > >The program tries to load a few files from absolute paths, then fails >angrily. 
Here are the first three:
>
>- "/jumbo/mimetypes.xml" instead of "jumbo/mimetypes.xml"
>- "/jumbo/sgml/html/schema.xml" instead of
>  "jumbo/sgml/html/schema.xml"
>- "/scene1.sgm" instead of "scene1.sgm"
>

This is the sort of thing that saps enthusiasm in the middle of the night :-) (Actually it's still early evening.) I have had *awful* problems with files/URLs under 1.02 (and I don't know that it gets much better later). I thought I'd fixed this one.

The problem is that not all interpreters/virtual machines seem to behave consistently. I started development with (javac,java) under Solaris; then when I moved to W95 ((javac|jvc)|(java|jview)) problems arose. java/jview do not behave consistently - some have backslashes, others have slashes in files/URLs. Another problem is that java.net.URL(URL, String) seems to have bugs in some implementations when URL is the current directory (I think this is where the leading slash comes from). I *thought* I'd fixed it.

I'll try to revisit it, though I don't have a local UNIX system, which makes it tricky. I will have to wake one up. The number of possible combinations I have to test is:
(java95/jview95/Netscape95/MSIE95/javaUX/NetscapeUX/MSIEUX/javaMAC/NetscapeMAC)*(jvc/javac).
There are also different flavours of UNIX and some have very poor performance. This is a write-once, debug-many problem.

I suggest the following:
- if you have W95, then it should behave as I have suggested
- if you have UNIX, try:
  - using full URLs instead of scene1.sgm (I think it should work without the ancillary files)
  - or load jumbo without a file on the command line and use File | "Open XML" to read in scene1.sgm
  - or try from a different directory. ("." in the classpath may cause problems)

Performance note. JUMBO reads Lark.xml fine, but struggles on my PC with pr.xml (the PR in xml). This fails (StackOverflowError) with java, but works with jview. The navigation is then not too bad.
It takes about 15 secs to find all given element types in the document (i.e. tree traversal). None of this is optimised. So - at present - the effective limit of documents is about 100K I suspect (though it's probably more dependent on the node count). This is the price paid at present for making every node displayable, editable, reconfigurable, etc. On a constructive note - I suspect we shall need some sort of configuration file for this. JAR files should help with this sort of thing? P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Wed Jan 28 21:05:27 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:00:00 2004 Subject: Resource Description Format and XML-Data Message-ID: <006a01bd2c2f$c9d66d00$2ee044c6@donpark> >Note that XML-Data contains a DTD replacement facility which seems >oblivious to the existence of the RDF and, most surprisingly, the >RDF Schema groups in the W3C Metadata activity. This is most surprising >since the group which authored XML-Data is heavily represented in both >these efforts. This is not surprising if you assume that Microsoft invested lots of resources to make their entire database product line work with XML-Data. >From my past experiences with them, they will probably hit with XML-Data support in Microsoft Access and then move on to ODBC, OLE DB, and so on. It will hurt them to use RDF because of time and because competitors of MS are going through RDF route. 
If MS slows down RDF approval process, MS gains big and competitors will have wasted lots of money going after RDF. Anyway, I am not for MS nor against MS. I just think they are making a brilliant strategic move despite the fact that MS is moving right toward where I am (yes, I am in the XML-Database arena). You can be an ant and still enjoy the thrill of jousting with an elephant. I am looking forward to it, matter fact... Don Park Developer/Consultant http://www.quake.net/~donpark/index.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gmckenzi at JetForm.com Wed Jan 28 21:15:28 1998 From: gmckenzi at JetForm.com (Gavin McKenzie) Date: Mon Jun 7 17:00:00 2004 Subject: SDATA or UNICODE Message-ID: On this issue of accessing characters that aren't in Unicode... XML provides a way for specifying the encoding of an entity with the ?XML pi encoding declaration. Why wouldn't this be sufficient. If the euro or florin symbol is available in some non-Unicode character encoding scheme, isn't it sufficient to encode the text which requires the symbol in the appropriate scheme and use the encoding declaration? On a related note...I have felt that it should be possible to attach the encoding declaration to any element in a manner similar to xml:lang. Typically our customers (who often are not able to make use of Unicode) require the ability to switch from one character encoding scheme to another on the fly within the same physical document (e.g. switching from Shift-JIS to Latin-1 and back). Referencing an external entity makes it possible, but not acceptable for our customers. Methinks that xml:lang points out this need. 
Have I missed something in the spec that permits me to do this? Gavin. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Wed Jan 28 21:21:57 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:00:00 2004 Subject: Status, Parsers, and Clients Message-ID: <00a301bd2c32$1a0b7f70$2ee044c6@donpark> David, >While I'm thrilled with the response, I also want to remind everyone >that the SAX interface is not yet finalised, and all drivers and >clients will need to be modified (hopefully only slightly) to support >any changes that we agree on for the final interface. I hope to be >through my crunch in a week or two, then I'll summarise all of the >changes that have been proposed to start off the next discussion. I am prepare to promptly update SAXDOM whenever SAX or DOM spec changes. >I've also been doing more work trying to fix the MSXML SAX driver >further -- it's been by far the most difficult to implement (see my >comments in the MSXMLDriver.java source file for details), and the one >I released yesterday still has problems. I'll release my newest >version in a day or two. Since Chris seems to be buried in snow, I can take over maintenance of the MSXML driver after you release the newest version. When everything is stable, I'll ask Chris (nicely) to put it in the next MSXML parser package. Keep going David, we are right behind ya. Don Park http://www.quake.net/~donpark/index.html xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jcupp at essc.psu.edu Wed Jan 28 21:26:16 1998 From: jcupp at essc.psu.edu (Jason R. Cupp) Date: Mon Jun 7 17:00:00 2004 Subject: vCard DTD References: <199801270112.UAA15449@geode.ora.com> Message-ID: <34CF5DC6.676750F3@essc.psu.edu> Chris Maden wrote: > > > Then I think I'd recommend two different kinds of names, one with > content model (#PCDATA) and the other with (firstname,surname,...). > Then use (struct-name|free-name) in the parent content model. > > -Chris > I've a working page at "http://www.pasda.psu.edu/cgi-bin/pasda/xcards/contacts.cgi". I think this is going to work very well! Jason Cupp xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Jan 28 21:34:58 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:00 2004 Subject: XML and the launch of Chemical Markup Language In-Reply-To: <3.0.1.16.19980128150718.36a792b8@pop3.demon.co.uk> Message-ID: <3.0.1.16.19980128202822.4ba7bb9e@pop3.demon.co.uk> At 15:07 28/01/98, Peter Murray-Rust wrote: Oh dear! > >The first lecture in this series was given by Henry Rzepa and attracted >several molecular scientists worldwide. I have taken this as a timely This was meant to read "several hundred" :-) P. 
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rdaniel at lanl.gov Wed Jan 28 21:50:17 1998 From: rdaniel at lanl.gov (Ron Daniel Jr.) Date: Mon Jun 7 17:00:00 2004 Subject: Resource Description Format and XML-Data Message-ID: <3.0.32.19980128144227.00a5d580@cic-mail.lanl.gov> There certainly seems to be a lot of overlap. One of the problems of RDF is that the serialization syntax is rather verbose. What I would like to see is the W3C forming a group to specify XML-Data in such a way that schemas can define a mapping from compact XML syntax to RDF data models (and back?). Ron Daniel Jr. voice:+1 505 665 0597 Advanced Computing Lab fax:+1 505 665 4939 MS B287 email:rdaniel@lanl.gov Los Alamos National Lab http://www.acl.lanl.gov/~rdaniel Los Alamos, NM, USA, 87545 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Jan 28 22:11:46 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:00 2004 Subject: SDATA or UNICODE Message-ID: <3.0.32.19980128141305.00a873e0@pop.intergate.bc.ca> At 04:10 PM 28/01/98 -0500, Gavin McKenzie wrote: >On this issue of accessing characters that aren't in Unicode... 
> >XML provides a way for specifying the encoding of an entity with the >?XML pi encoding declaration. Why wouldn't this be sufficient. If the >euro or florin symbol is available in some non-Unicode character >encoding scheme Good idea, but it doesn't quite work. XML is very rigid in saying that all the characters have to be Unicode characters (which the Euro is quickly becoming). So let's take for an example the current identifier of The Artist Formerly Known As Prince. Even if I have an encoding in which this is available, say at code point 12352, that doesn't make it into a Unicode character, or usable in XML. Non-Unicode *encodings* are OK (e.g. ASCII). Non-Unicode *characters* aren't. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Wed Jan 28 22:14:08 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:00 2004 Subject: First try with Jumbo In-Reply-To: <3.0.1.16.19980128202649.4ba7dadc@pop3.demon.co.uk> References: <199801281732.MAA03693@unready.microstar.com> <3.0.1.16.19980128202649.4ba7dadc@pop3.demon.co.uk> Message-ID: <199801282209.RAA05998@unready.microstar.com> Peter Murray-Rust writes: > I have had had *awful* problems with files/URLs under 1.02 (and I don't > know it gets much better later). I thought I'd fixed this one. The problem > is that not all interpreters/virtual machines seem to behave consistently. > I started development with (javac,java) under Solaris; then when I moved to > W95 > ((javac|jvc)|(java|jview) problems arose. java/jview do not behave > consistently - some have backslashes, other have slashes in files/URLs. 
> Another problem is that java.net.URL(URL, String) seems to have bugs in
> some implementations when URL is the current directory (I think this is
> where the leading slash comes from). I *thought* I'd fixed it. I'll try to
> revisit it, though I don't have a local UNIX system which makes it tricky.
> I will have to wake one up.

You can pretty much always use forward slashes (solidus) in file URLs --
I've had success with both jview and the JDK. Here's the routine that
the latest version of SAXDemo uses to construct an absolute URL when
necessary (James helped out with some changes):

  /**
   * If a URL is relative, make it absolute against the current directory.
   */
  private static String makeAbsoluteURL (String url)
    throws java.net.MalformedURLException
  {
    URL baseURL;

    String currentDirectory = System.getProperty("user.dir");
    String fileSep = System.getProperty("file.separator");
    String file = currentDirectory.replace(fileSep.charAt(0), '/') + '/';
    if (file.charAt(0) != '/') {
      file = "/" + file;
    }
    baseURL = new URL("file", null, file);
    return new URL(baseURL, url).toString();
  }

> The number of possible combinations I have to test is:
> (java95/jview95/Netscape95/MSIE95/javaUX/NetscapeUX/MSIEUX/javaMAC/NetscapeMAC)*(jvc/javac).

I expected the same with AElfred, but was surprised to find that it
wasn't the case -- the difference, I think, is that I wasn't working
with a GUI.

> - using full URLs instead of scene1.sgm (I think it should work without
> the ancillary files)

I tried that, but without luck.

> - or load jumbo without a file on the command line and use File | "Open
> XML" to read in scene1.sgm

I get a null URL error in a popup dialog (but see below).

> - or try from a different directory. ("." in the classpath may cause
> problems)

The last two together work -- if I give an absolute file URL _and_ start
outside of JUMBO's home directory, I can view my document.

Nice browser. Just a minor note -- I get a class cast error when I try
to enable "Show #PCDATA".
All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Wed Jan 28 22:21:15 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:00 2004 Subject: SDATA or UNICODE In-Reply-To: References: Message-ID: <199801282217.RAA06080@unready.microstar.com> Gavin McKenzie writes: > On a related note...I have felt that it should be possible to attach the > encoding declaration to any element in a manner similar to xml:lang. > Typically our customers (who often are not able to make use of Unicode) > require the ability to switch from one character encoding scheme to > another on the fly within the same physical document (e.g. switching > from Shift-JIS to Latin-1 and back). Referencing an external entity > makes it possible, but not acceptable for our customers. NOTATION attributes could be used for transliteration schemes, but I don't know if they could/should be used to shift encoding. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Jan 28 22:28:31 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:00 2004 Subject: SDATA or UNICODE Message-ID: <3.0.32.19980128142928.00a86b98@pop.intergate.bc.ca> At 04:10 PM 28/01/98 -0500, Gavin McKenzie wrote: >On a related note...I have felt that it should be possible to attach the >encoding declaration to any element in a manner similar to xml:lang. >Typically our customers (who often are not able to make use of Unicode) >require the ability to switch from one character encoding scheme to >another on the fly within the same physical document (e.g. switching >from Shift-JIS to Latin-1 and back). Referencing an external entity >makes it possible, but not acceptable for our customers. That's a reasonably obvious idea, and we kicked it around. There are some real problems with it, including increased complexity in buffering schemes, and the fact that you can no longer generate an accurate MIME header. So for 1.0, this is just not on. >Methinks that xml:lang points out this need. Have I missed something in >the spec that permits me to do this? No. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed Jan 28 22:30:18 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:00 2004 Subject: SDATA or UNICODE In-Reply-To: Message-ID: On Wed, 28 Jan 1998, Gavin McKenzie wrote: > > XML provides a way for specifying the encoding of an entity with the > ?XML pi encoding declaration. Why wouldn't this be sufficient. If the > euro or florin symbol is available in some non-Unicode character > encoding scheme, isn't it sufficient to encode the text which requires > the symbol in the appropriate scheme and use the encoding declaration? No, for the reason Tim points out. On the other hand, you might be on the right track. A processing instruction would serve as a hack to tell the application where to insert the euro. > On a related note...I have felt that it should be possible to attach the > encoding declaration to any element in a manner similar to xml:lang. > Typically our customers (who often are not able to make use of Unicode) > require the ability to switch from one character encoding scheme to > another on the fly within the same physical document (e.g. switching > from Shift-JIS to Latin-1 and back). Referencing an external entity > makes it possible, but not acceptable for our customers. Egad. This is one of those things that is a good idea at the user level, but would make implementation prohibitive. Imagine the poor desperate Python hacker trying to "grep" through that! I think you should implement a language that allows this and is preprocessed into XML. If I were you I would use marked sections and not attributes to describe the boundaries. 
Marked sections are really easy to scan for.

 Paul Prescod

From donpark at quake.net Wed Jan 28 22:41:38 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:01 2004
Subject: First try with Jumbo -- creating URL to files
Message-ID: <000901bd2c3d$36289b60$2ee044c6@donpark>

Here is what I use:

package com.jstud.util;

import java.io.File;
import java.net.URL;
import java.net.MalformedURLException;

//
//
// NetUtil
//
//
public class NetUtil
{
    public static URL createFileURL (File file)
    {
        return createFileURL(file.getAbsolutePath());
    }

    public static URL createFileURL (String path)
    {
        URL url = null;
        try {
            // This is a bunch of weird code that is required to
            // make a valid URL on the Windows platform, due
            // to inconsistencies in what getAbsolutePath returns.
            String fs = File.separator;
            if (fs.length() == 1) {
                char sep = fs.charAt(0);
                if (sep != '/')
                    path = path.replace(sep, '/');
                if (path.charAt(0) != '/')
                    path = '/' + path;
            }
            path = "file://" + path;
            url = new URL(path);
        } catch (MalformedURLException e) {
        }
        return url;
    }
}

Don Park
http://www.quake.net/~donpark/index.html

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tony.stewart at rivcom.com Wed Jan 28 23:26:42 1998 From: tony.stewart at rivcom.com (Tony Stewart) Date: Mon Jun 7 17:00:01 2004 Subject: Conditional actions in XSL? Message-ID: <4955E202FE46D11195C500609712EB6B0574F7@FLPS-NTSERVER1> Henry Thompson writes: Richard Light writes: > The conditions for XSL are set by the structure that you specify, e.g.: > > > > > > > > . . . > > I'm not sure about the ability to invoke a bit of script when testing > the attribute value. The XSL spec allows you to invoke ECMAscript when > _setting_ attribute values on (output) flow objects - there is no reason > at all why it couldn't allow the same feature when _testing_ attribute > values on source elements. (Obviously in this case the script would > have to return true/false rather than a string.) You're right, in XSL as currently proposed you can't do that, but it is consistent with the existing (but unimplemented) part of the DSSSL model of construction rules via the 'query' construction rule, so not out of the question in principle. The more XSL allows us to call scripts at _any_ stage of the process, the better. Most of the action of delivering XML documents over the web will take place when you apply the style rules to the text. Many of the things our clients are asking us to do with style rules, such as applying formatting or triggering browser behavior based on combinations of document context and conditions outside of the document (who is the user? which option did she select two pages back? is the pump running hot right now?) 
require pretty serious use of scripts to fire external queries and set/get memory variables.* To the extent that XSL contains non-critical restrictions on when we can make those calls and what we can do with them, I'd like to see those restrictions removed. *(Though an alternative to maintaining traditional memory variables could be to manipulate attributes of the parsed tree, provided that the scripts were allowed to get at it... style rules manipulating the DOM... there's food for thought.) Btw, any chance of removing the requirement that we must use ECMAScript as the first layer we shell out to? I'm not clear what the standard gains by this. I'd rather see a mechanism by which we declare what language we're shelling out to, like declaring an encoding, or perhaps some form of standardized API. (I'm new to this thicket of standardization issues so don't want to push specific suggestions too hard; just looking for less restrictive options than always using ECMAScript.) Regards, Tony Stewart RivCom "Publishing Structured Information" tony.stewart@rivcom.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Thu Jan 29 00:35:28 1998 From: eliot at isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:00:01 2004 Subject: Conditional actions in XSL? Message-ID: <3.0.32.19980128183244.00b383bc@swbell.net> At 11:24 PM 1/28/98 -0000, Tony Stewart wrote: > ... require pretty serious use of scripts to fire external >queries and set/get memory variables.* To the extent that XSL contains >non-critical restrictions on when we can make those calls and what we >can do with them, I'd like to see those restrictions removed. 
>
>*(Though an alternative to maintaining traditional memory variables
>could be to manipulate attributes of the parsed tree, provided that the
>scripts were allowed to get at it... style rules manipulating the DOM...
>there's food for thought.)

In the grove-based world of HyTime and DSSSL, the outside is represented by
constructing a grove from it and then interrogating that grove. For
example, I could define a property set that describes the objects I care
about in my system, including the states of various dynamic properties
(hmmm, sounds like an Express data model....) and then write software that
constructs a grove from those objects. Given a grove, I can then use normal
grove processing to act on it (e.g., a DSSSL or XSL spec).

Queries are represented in the same way: the result of a query is always
nodes in a grove--any grove, it doesn't matter, as long as it's a grove.

For example, if I want to be able to query the mouse pointer, I might
define a property set like:

  X coordinate of mouse pointer
  Y coordinate of mouse pointer
  Up or down state of button 1. True=down, False=up
  Up or down state of button 2. True=down, False=up

Now when I query the mouse, I construct a grove of one node with the four
properties shown. I can then interrogate those properties using my DSSSL
spec:

(define query-mouse
  (external-procedure "UNREGISTERED::DRMACRO//Procedure::query-mouse"))

(define (mouse-b1-down?)
  (if (node-property 'button1 (query-mouse))
      (make paragraph (literal "Mouse button 1 is down"))
      (make paragraph (literal "Mouse button 1 is up"))))

Where "query-mouse" is my mouse state grove constructor.

Now, every time my master processor applies this style spec, the result
will reflect the state of the mouse. If the style is re-applied as quickly
as possible, then I have what appears to be an "interactive" document (when
what I really have is a very responsive presentation system).

Clearly you could model everything that user interface systems like VB and
PowerBuilder do using this approach, although I don't know if it would
provide any benefits (it might, who knows?).

Of course, how much difference is there between interpreting VB code and
applying style sheets like the above in a tight processing loop? None, I
would think, except that the grove-based approach may have an additional
layer of abstraction that either makes new things possible or slows the
system down to unacceptable levels (or both).

Note that in the grove formalism, there is no notion of "manipulating" a
grove in the abstract processing model. Rather, if something changes, you
construct a new grove. Of course, under the covers you probably aren't
really completely reconstructing the grove, but to outside observers (that
is, the programs operating on the groves), the grove just is. Thus, if your
grove reflects the state of the system, if the system state changes, you
construct a new grove to reflect the new system and reapply your
processing. Obviously, if a system anticipates reacting to change, there
has to be some sort of event loop that causes what was done before to be
redone.

Note that there's nothing that prevents some part of a style sheet from
constructing a new grove. This is what happens when you use the sgml-parse
function in DSSSL--you construct a new document grove and the function
returns its root. This is the functional equivalent of manipulating the
grove you've got.

That's why "dynamic HTML" is silly a-priori: you're not manipulating the
document, you're creating a new abstract object from it. Or else you've
got a script-driven editor. In either case, the dynamism isn't in the
HTML, it's in the presentation processor, which is the way the world has
always been.

Cheers,

Eliot
--
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
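
The "construct a new grove, then re-apply the style" model described in the message above can be sketched in Java. This is only an illustration under stated assumptions, not anything from the HyTime or DSSSL specifications: the class MouseGrove, the field names x, y and button2, and the method names queryMouse and applyStyle are all hypothetical (only the property name button1 and the two output strings come from the DSSSL example), and the mouse state is simulated rather than read from a real device.

```java
public class MouseGroveSketch {

    /** Hypothetical immutable one-node "grove" holding the four mouse properties. */
    static final class MouseGrove {
        final int x;            // X coordinate of mouse pointer (name assumed)
        final int y;            // Y coordinate of mouse pointer (name assumed)
        final boolean button1;  // true = down, false = up
        final boolean button2;  // true = down, false = up (name assumed)

        MouseGrove(int x, int y, boolean button1, boolean button2) {
            this.x = x;
            this.y = y;
            this.button1 = button1;
            this.button2 = button2;
        }
    }

    /** Stand-in for the query-mouse grove constructor: every call builds a fresh grove. */
    static MouseGrove queryMouse(int x, int y, boolean button1, boolean button2) {
        return new MouseGrove(x, y, button1, button2);
    }

    /** Counterpart of mouse-b1-down?: a pure function from a grove to its presentation. */
    static String applyStyle(MouseGrove grove) {
        return grove.button1 ? "Mouse button 1 is down" : "Mouse button 1 is up";
    }

    public static void main(String[] args) {
        // Simulated event loop: the grove is never modified in place.
        // Each state change yields a new grove, and the style is re-applied to it.
        boolean[] button1States = {false, true, false};
        for (boolean down : button1States) {
            MouseGrove grove = queryMouse(10, 20, down, false);
            System.out.println(applyStyle(grove));
        }
    }
}
```

Because MouseGrove has only final fields, nothing in applyStyle can "manipulate" the grove; any apparent interactivity comes from the loop that rebuilds the grove and re-runs the style function.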
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Thu Jan 29 03:30:35 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:00:01 2004 Subject: XSL/XML/XLL and VRML (was: Re: Conditional actions in XSL?) References: <3.0.32.19980128183244.00b383bc@swbell.net> Message-ID: <34CFF782.63FD@hiwaay.net> >W. Eliot Kimber wrote: > > If the style is re-applied as quickly as possible, then I > have what appears to be an "interactive" document (when what I really have > is a very responsive presentation system). > > Clearly you could model everything that user interface systems like VB and > PowerBuilder do using this approach, although I don't know if it would > provide any benefits (it might, who knows?). It does. It can do what DTDs do well: provide a precise description of the presentation style of the interface as a set of routed behaviors. Behaviors can be scripted or compiled, inlined or referenced. The best part is that at the authoring interface, the author sees the objects and simply connects them. Properties are initialized in dialogs, and voila, a Rapid IDE based on XML emerges. I was just looking at http://www.cam.org/~pierlou/prototype/ and Pierre Morrel's Prototype approach is wonderfully direct and easy to work with as a document or in created an authoring system for it. HTML is another example in how forms are done. The MID was built along the same design principles. > Of course, how much > difference is there between interpreting VB code and applying style sheets > like the above in a tight processing loop? 
None, I would think, except that
> the grove-based approach may have an additional layer of abstraction that
> either makes new things possible or slows the system down to unacceptable
> levels (or both).

Is that layer of abstraction necessary in a system where the XML/SGML is
used primarily for properties and relationships among properties and Java
is used for implementation?

I look at the design of VRML by comparison (if this is screwed up, Chris
Marin, feel free to pummel). It is a scene description abstraction for
animating a 3D space of presentation objects and property engines. The
engines that animate are timer-based. For example, a TouchSensor object
(interface for mouse or what have you) emits a touchTime value to a
TimeSensor object. The TimeSensor emits time value changes to a function
in a presentation object (eg a Transform that contains a box primitive)
which sets the value of a translation or rotation property and any children
of the Transform node are rotated or moved in 3D space. A tree of
transforms (group nodes really, but the transforms are the heart of the
system) is hierarchically organized so that the state space reflects the
spatial organization of presentation objects. That is, the transforms nest
such that what is done to a parent is done to the child.

Events are organized by routing among the named objects, eg, TitleTimer and
Box.

ROUTE BoxTimer.fraction_changed TO BoxPositionInterpolator.set_fraction
ROUTE BoxPositionInterpolator.value_changed TO Box.set_translation

So statically:

NamedTimer(eventOut) -> NamedPresentationType(eventIn)
NamedPresentationType(eventOut) -> NamedPresentationObject(EventIn).

Note the paired statements for completing a route. What would be the
restatement of this relationship in an independent link?

Presentations are typed. Eg, PositionInterpolator to translate,
OrientationInterpolator to rotate, and so on.
Values passed among the objects are typed (eg, SFFloat) and must match when
routed, but this is prototyped in the object specification (eg, the VRML
standard or a proto). I suppose one could use architectural forms to model
this.

The actual presentation engine is a method of the presentation processor.
What the author is actually creating is a set of time cycle properties
(fractions of the cycle) routed to a set of key frame properties (literally,
positions in vector space in the example) which are then routed to the
Transform properties. Moving the space moves the object in the space (eg,
the box). The box as a geometric object has properties which can also be
changed by routing (eg, changing color properties by routing a property set
of floating point typed values, etc.).

So, I *think* creating a VRML grove description is straightforward but I
haven't tried so the jury is out on that. EventIn and EventOut properties
have explicit directional definitions defined on the object/node in the
tree. It is the object that knows how it behaves, not the link. The link
or route statement connects functions to properties to organize state
changes of the presentation objects. Behaviors are organized by route
statements instead of function calls. In one sense, it is a syntax cheat
because it is still a value/function statement, but in a practical sense it
is easy to author and easy to use when extensible objects are added.

Protos and external protos are used to extend the language. The
application language (VRML97) provides the common format, and the
implementation language/object_framework provides the interoperability of
new objects by which the language is extended. Once an object is added, it
only needs to be routed. If processing of event values is required
enroute, it is routed through the script node object.

BTW, this is where I think we went wrong in MID.
We tried to define the script language as well as the presentation objects
and that proved to be ugly in SGML/HyTime and unsellable given the paucity
of available implementation language products for that design. Java would
have worked better because our requirements could have been met by dividing
the responsibilities of the description language and the implementation
language better.

> In either case, the dynamism isn't in the
> HTML, it's in the presentation processor, which is the way the world has
> always been.

Yes, in the display system. That is why scripting presentation widgets has
also always been the easiest way to author and a good DTD-centric design
for prototyping dynamic documents. Routing in a VRML object set and
selecting functions from a method list in an IDE of presentation objects
(eg, VB, Prototype) are very similar ways to represent the same problem.
The trick is the extensibility of the design.

VRML is extensible without breaking the browser. It is extensible,
however, not only because the presentation processor (eg, VRML plugin) is
dynamic, but because the presentation plugin is operating within a defined
set of services provided by the object framework (browser) for manipulating
the values of events in the plugin via the API of the plugin/presentation
processor. In this sense, DHTML is the shell language of a standard set of
portable operating system interface objects.

If one rewrites the VRML in XML, what feature of XML provides the
extensibility of VRML protos? Stylesheets? OK. If one uses DSSSL/XSL, how
good is that stylesheet language for describing the semantics of real time
dynamic 3D simulation objects? What advantage is there in using XSL over
Java to do this? Is XSL yet another scripting language?

len bullard

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Thu Jan 29 04:07:36 1998 From: eliot at isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:00:01 2004 Subject: XSL/XML/XLL and VRML (was: Re: Conditional actions in XSL?) Message-ID: <3.0.32.19980128215032.00b01af0@swbell.net> At 09:29 PM 1/28/98 -0600, len bullard wrote: > Is XSL yet another scripting language? Yes (except that by allowing you to embed scripts, they sort of break the purity of the DSSSL/grove model). A programming language is a programming language. They only differ in what they are optimized to represent and what particular performance optimizations they allow. But Turing complete is Turing complete. DSSSL is a programming language optimized for programming the association of style with nodes in groves, but it can be used for any programming task, including driving interactive 3-D viewers, if you want to make it do that (which you very well might). Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
From ricko at allette.com.au Thu Jan 29 05:13:12 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:01 2004
Subject: SDATA or UNICODE
Message-ID: <199801290515.QAA30567@jawa.chilli.net.au>

> From: Paul Prescod
> On Wed, 28 Jan 1998, Gavin McKenzie wrote:
> >
> > XML provides a way for specifying the encoding of an entity with the
> > ?XML pi encoding declaration. Why wouldn't this be sufficient. If the
> > euro or florin symbol is available in some non-Unicode character
> > encoding scheme, isn't it sufficient to encode the text which requires
> > the symbol in the appropriate scheme and use the encoding declaration?
>
> No, for the reason Tim points out. On the other hand, you might be on the
> right track. A processing instruction would serve as a hack to tell the
> application where to insert the euro.

XML has, underlying its decisions, the SGML model which separates the encoding of data (i.e. "storage management") from their logical representation as streams of characters in a single character set (i.e. "entity management"). This is a very flexible model, since it allows any system of encoding that anyone can dream up to be used without having to alter XML/SGML: an entity can be sourced from files, multipart MIME, data base, random number generators, standard input, anything.

To allow multiple encodings within an XML file, delimited using PIs or elements or internal entities, would violate this model, and I would strongly recommend against it. If your customers require multiple encodings, then they have to source each one from a separate external entity.
These entities can be bundled up or interleaved in any fashion you like, but this is a *PRE* XML storage management issue, not an XML issue.

I think there is a great desire that XML will be a Trojan horse to force the development of wide-character applications, and Universal Character Set-using ones (UCS = ISO 10646 ~= Unicode) in particular. I, for one, hope that by disconnecting encoding and character "repertoire", XML will marginalise the character encoding issue to the extent that it will become easier to use Unicode than to use a regional encoding, in the long run.

> I think you should implement a language that allows this and is preprocessed
> into XML. If I were you I would use marked sections and not attributes to
> describe the boundaries. Marked sections are really easy to scan for.

But once you have changed encodings, do you scan for the end of the marked section using the old or the new encoding? These kinds of ISO 2022 mode changing are what we are trying to get rid of from XML (and from SGML).

So you can have multiple encodings before the parser, but not being presented to the parser. The other choice is multiple encodings after the parser: e.g. embedding the SJIS-encoded text in a latin-1-safe way. This is the same as Dave's comment about transliteration using notation. You can have a document like

]> ... ...

(You cannot do the same thing using internal entities in XML, since you cannot put a notation on an internal entity declaration.)

Rick Jelliffe
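A minimal sketch of the "multiple encodings before the parser" model described in the message above, in Python. The entity contents, the element names and the helper assemble_document are all hypothetical; the only point is that each external entity is decoded with its own encoding during storage management, so the parser receives one stream of Unicode characters and never sees a mode switch.

```python
import xml.etree.ElementTree as ET


def assemble_document(entities):
    """Decode each (raw_bytes, encoding) entity on its own, then join the text.

    This stands in for the storage-management layer; it is an assumption
    made for illustration, not part of any XML specification.
    """
    return "".join(raw.decode(encoding) for raw, encoding in entities)


# Hypothetical entities: two Latin-1 chunks around one Shift_JIS chunk.
entities = [
    ("<doc><p>".encode("latin-1"), "latin-1"),
    ("日本語".encode("shift_jis"), "shift_jis"),
    ("</p><p>café</p></doc>".encode("latin-1"), "latin-1"),
]

# The parser is handed a single Unicode string.
root = ET.fromstring(assemble_document(entities))
print([p.text for p in root])
```

The design choice mirrors the argument in the message: the encoding problem is solved once, per entity, outside the parser, rather than by scanning for delimiters whose own encoding is in doubt.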
From papresco at technologist.com Thu Jan 29 05:31:23 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:01 2004
Subject: SDATA or UNICODE
In-Reply-To: <199801290515.QAA30567@jawa.chilli.net.au>
Message-ID:

On Thu, 29 Jan 1998, Rick Jelliffe wrote:
> > No, for the reason Tim points out. On the other hand, you might be on the
> > right track. A processing instruction would serve as a hack to tell the
> > application where to insert the euro.
>
> XML has, underlying its decisions, the SGML model which separates the
> encoding of data (i.e. "storage management") from their logical representation
> as streams of characters in a single character set (i.e. "entity management").

I'm not sure how your observation argues against my proposed hack to insert a non-Unicode character into a Unicode document. This is not an issue of encodings, but of character sets.

> If your customers
> require multiple encodings, then they have to source each one from a separate
> external entity. These entities can be bundled up or interleaved in any
> fashion you like, but this is a *PRE* XML storage management issue, not
....
> But once you have changed encodings, do you scan for the end of the
> marked section using the old or the new encoding? These kinds of ISO 2022
> mode changing are what we are trying to get rid of from XML (and from
> SGML).

It is exactly *because* the issues do not belong in XML, and are "*PRE* XML" that I advised a preprocessor. I don't see anything that argues against that here. As far as the signalling of mode switches -- it depends on the encodings in question.
> So you can have multiple encodings before the parser, but not being presented
> to the parser. The other choice is multiple encodings after the parser: e.g.
> embedded the SJIS encoded in a latin-1-safe way. This is the same as Dave's
> comment about transliteration using notation. You can have a document like
> > > [ > > > I-need-decoding NOTATION ( sjis-Qencoded ) > > ]> > > ... > > smdkfjhhjwfnnweofijslkdm > ]]> > ... >

You had better hope that CDEnd does not appear in the encoded data!

Paul Prescod

From peter at ursus.demon.co.uk Thu Jan 29 09:22:50 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:01 2004
Subject: "Name Spaces in XML" is released
Message-ID: <3.0.1.16.19980129084153.4bb7ac34@pop3.demon.co.uk>

Forwarded from Dan Connolly of the W3C about namespaces. It's very nice to have this publicly. Note that this is a NOTE; there are formal discussions taking place elsewhere as part of the W3C process. If XML-DEV members have views on the final form of XML-name (or whatever the final syntax is), they should be directed to the appropriate organs; XML-DEV is not an appropriate forum. We should confine ourselves to implementation issues (i.e. "in names-0119 it says X. How do I implement this?" is appropriate ("I have built a freely available prototype" is also good :-), but "I think it would be better if they did Y" is likely to lead to prolonged discussions unrelated to implementation).

There are now several publicly announced drafts (XLL, XSL, XML-data, XML-names) all of which are of great interest to members of this list.
It would be very useful if any member of appropriate WGs could let us know the timescales, if any, associated with these projects.

P.

>From: Dan Connolly
>Organization: World Wide Web Consortium (http://www.w3.org/)
>To: w3c-xml-sig@w3.org
>CC: swick@w3.org
>Subject: "Name Spaces in XML" is released
>Resent-From: w3c-xml-sig@w3.org
>X-Mailing-List: archive/latest/2536
>X-Loop: w3c-xml-sig@w3.org
>Sender: w3c-xml-sig-request@w3.org
>Resent-Sender: w3c-xml-sig-request@w3.org
>
>We're in the last administrative throes of updating the
>http://www.w3.org/TR/ page, but the document
>is publicly available.
>
>Feel free to announce this on xml-dev, comp.text.sgml, and
>other public forums.
>
>Name Spaces in XML
>
> World Wide Web Consortium Note
> 19-January-1998
>
>This version:
> http://www.w3.org/TR/1998/NOTE-xml-names-0119
> http://www.w3.org/TR/1998/NOTE-xml-names-0119.xml
> http://www.w3.org/TR/1998/NOTE-xml-names-0119.html
>Latest Version:
> http://www.w3.org/TR/1998/NOTE-xml-names
>
>Editors:
> Tim Bray (Textuality and Netscape)
> Dave Hollander (Hewlett-Packard Company)
> Andrew Layman (Microsoft)
>
>Status of this document
>
>This document is a NOTE made available by the W3
>Consortium for discussion only. This indicates no
>endorsement of its content, nor that the Consortium
>has, is, or will be allocating any resources to the
>issues addressed by the NOTE.
>
>This work is part of the W3C XML Activity.
>
>The XML WG solicits comments from W3C member
>companies and W3C working groups that use the
>namespace mechanism described in this Note. In
>particular, comments on open issues are very
>welcome, and should be sent to the editors.
>
>Abstract
>
>XML Namespaces is a proposal for a simple method
>to be used for qualifying names used in Extensible
>Markup Language (XML) documents by associating
>them with schemas, identified by URI.
>
>--
>Dan
>http://www.w3.org/People/Connolly/

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From digitome at iol.ie Thu Jan 29 10:40:40 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 17:00:01 2004
Subject: Conditional actions in XSL?
Message-ID: <199801291040.KAA04831@mail.iol.ie>

[Tony Stewart]
>*(Though an alternative to maintaining traditional memory variables
>could be to manipulate attributes of the parsed tree, provided that the
>scripts were allowed to get at it... style rules manipulating the DOM...
>there's food for thought.)

This is kinda-sorta the territory of tree transformation. Conditions that, on the face of it, must be in the style language, such as:

if date > 1/1/1999

can be delegated to an earlier tree transformation that either establishes a TRUE/FALSE attribute on each element or renames the elements so that you can trigger different rules.

This is DOM territory since DOM is read/write. As a result this sort of transformation should be doable from whatever scripting environment we access DOM from.

Regards,
Sean Mc Grath
sean at digitome dot com
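The earlier tree transformation described in the message above can be sketched with Python's xml.dom.minidom. This is only an illustration under stated assumptions: the date attribute, the after-cutoff attribute name, the ISO date format and the helper mark_after are all hypothetical, chosen so that a later style rule could match on a plain TRUE/FALSE attribute instead of evaluating a condition.

```python
from datetime import date
from xml.dom import minidom


def mark_after(doc, cutoff, attr="after-cutoff"):
    """Stamp every element carrying a 'date' attribute with TRUE or FALSE.

    The attribute names are assumptions for illustration; elements
    without a 'date' attribute are left untouched.
    """
    for element in doc.getElementsByTagName("*"):
        raw = element.getAttribute("date")  # empty string when absent
        if raw:
            is_after = date.fromisoformat(raw) > cutoff
            element.setAttribute(attr, "TRUE" if is_after else "FALSE")
    return doc


doc = minidom.parseString(
    '<items><item date="1998-06-01"/><item date="1999-03-15"/><note/></items>'
)
mark_after(doc, date(1999, 1, 1))
items = doc.getElementsByTagName("item")
print([item.getAttribute("after-cutoff") for item in items])
```

Because the condition is evaluated once, in the pre-pass, the style rules themselves stay free of script logic; the alternative mentioned in the message, renaming elements, would work the same way with a different write to the tree.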
From jlapp at acm.org Thu Jan 29 11:40:54 1998
From: jlapp at acm.org (Joe Lapp)
Date: Mon Jun 7 17:00:01 2004
Subject: Resource Description Format and XML-Data
In-Reply-To: <8525659A.00658BD5.00@gwwest.sybase.com>
Message-ID: <199801291140.GAA07090@pony-1.mail.digex.net>

> I am confused about the intent and overlap of the XML-Data and RDF
> initiatives. [...]

My understanding is that the overlap is between XML-DATA and the RDF schema effort, rather than the whole of the RDF effort. XML-DATA is a way to express DTDs in XML as well as a way to represent types in general in XML (DTDs are not expressive enough for types in general).

The RDF schema effort only needs to be general enough to represent the narrow range of types needed to support the RDF syntax. XML-DATA is more ambitious, and so being, is out of the scope of RDF.

The question then becomes: Is there a place in the W3C for something as ambitious as XML-DATA, and if so, what impact does this have on the RDF schema effort (and vice versa)? I think this is one of the first questions that the W3C will need to answer. XML-DATA might spawn a superset of RDF schema, possibly based on RDF schema.

I agree that this is a confusing issue, but I'm not inclined to think of this as an affront to RDF. There may be a need for such generality.

--
Joe Lapp | Technology Analyst
jlapp@webmethods.com | webMethods, Inc.
jlapp@acm.org | http://www.webmethods.com
From ricko at allette.com.au Thu Jan 29 12:31:57 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:01 2004
Subject: Resource Description Format and XML-Data
Message-ID: <199801291238.XAA13209@jawa.chilli.net.au>

> From: Joe Lapp
> The RDF schema effort only needs to be general enough to represent
> the narrow range of types needed to support the RDF syntax. XML-DATA
> is more ambitious, and so being, is out of the scope of RDF.

In another forum the RDF people agreed that they *could* use the standard DTD syntax to markup the information they wanted. However, because they had a particular way they wanted the markup to look, they had to invent some alternative to the XML DTD syntax. (Their justification for this reinventing the wheel seems to be "because we want it this way" rather than any technical reason--I am not saying they are being willful in this or that they really don't like the idea of markup languages; there may well be good reasons that they have not been able to express yet.)

RDF does not need a new declaration syntax. They just don't want to be standard.

This being the case, XML-data should not use "XML DTDs cannot support the needs of super-hyper-important things like RDF" as a justification for what they are doing.

It may be that XML-data declarations can model RDF documents more the way RDF people want it to be done, but that is likely to be the case with every niche DTD: if you invent a new declaration syntax for every niche document type you just minimise how useable your data is by general purpose tools.
You lose many of the benefits of having standardized declarations: or worse, you discover that the reasons you disliked the XML DTD syntax are just as present in the new syntax.

Rick Jelliffe

From rdaniel at lanl.gov Thu Jan 29 16:56:24 1998
From: rdaniel at lanl.gov (Ron Daniel Jr.)
Date: Mon Jun 7 17:00:01 2004
Subject: Resource Description Format and XML-Data
Message-ID: <3.0.32.19980129094913.00ab8af0@cic-mail.lanl.gov>

At 11:31 PM 1/29/98 +1100, Rick Jelliffe wrote:
>In another forum the RDF people agreed that they *could* use the standard
>DTD syntax to markup the information they wanted. However, because they
>had a particular way they wanted the markup to look, they had to invent some
>alternative to the XML DTD syntax.

I'm sorry, but that is not correct. The RDF Schema group decided that they would specify the schema system using the nodes and arcs model from the RDF Model and Syntax group. This is *not* because we wanted the markup to look a particular way. It is because:

1) The node and arc model provides the precision for the typing system to build on
2) Defining the typing system in terms of the model lets alternative typing systems be developed should sophisticated applications wish to do so.
3) Types defined in such a way smoothly integrate with the models that use the types (it's just more nodes and arcs).
By defining the typing system in terms of the underlying model, we are independent of the syntax, and can come up with ways of mapping from almost any declaration syntax (DTDs or XML-Data schemas) and an XML instance to the nodes and arcs that are the real RDF representation of that instance. Just how we do this is something I am starting to look at now (as a personal effort, not an RDF group effort).

The current RDF syntax is just a first, easy, way of going between XML and the RDF model. We want to have 'better' ways as well, but other things have to come first.

>(Their justification for this reinventing
>the wheel seems to be "because we want it this way" rather than any technical
>reason--I am not saying they are being willful in this or that they really
>don't like the idea of markup languages; there may well be
>good reasons that they have not been able to express yet.)

I hope that the reasons above provide the technical justification you are seeking. If the typing system were not defined in terms of the model, we would have a real mess.

>RDF does not need a new declaration syntax. They just don't want to be
>standard.

Oh, please.

>This being the case,

It's not the case, so any arguments that depend on it being the case need other justifications.

> XML-data should not use "XML DTDs cannot support the
>needs of super-hyper-important things like RDF" as a justification for
>what they are doing.

I have not heard them use such a justification. Were they to do so, they might lock themselves into a stronger tie with RDF than they appear to want.

Ron Daniel Jr.              voice:+1 505 665 0597
Advanced Computing Lab      fax:+1 505 665 4939
MS B287                     email:rdaniel@lanl.gov
Los Alamos National Lab     http://www.acl.lanl.gov/~rdaniel
Los Alamos, NM, USA, 87545
From tbray at textuality.com Thu Jan 29 16:57:42 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:01 2004
Subject: Resource Description Format and XML-Data
Message-ID: <3.0.32.19980129085634.00a9d510@pop.intergate.bc.ca>

At 11:31 PM 29/01/98 +1100, Rick Jelliffe wrote:
>In another forum the RDF people agreed that they *could* use the standard
>DTD syntax to markup the information they wanted.

This statement is incorrect. In fact, RDF, like many other SGML applications, has syntactic restrictions on what you can do that go well beyond what's expressible in DTDs. The RDFers feel (rightly) that what they want to achieve ought to be within the reach of a declarative schema facility, and are concerned that none such exists. But I'm pretty sure that's not why XML-Data exists.

>RDF does not need a new declaration syntax. They just don't want to be
>standard.

Inflammatory statements like this are defensible only when they're accurate. This one is not.

>This being the case, XML-data should not use "XML DTDs cannot support the
>needs of super-hyper-important things like RDF" as a justification for
>what they are doing.

It is just as easy to conclude from the evidence that XML-Data is in direct competition with RDF and RDF-Schema. I honestly don't know what the understanding of the authors of XML-Data is as regards things like RDF, and so far, nobody has said.

-T.
From tbray at textuality.com Thu Jan 29 18:22:51 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:01 2004
Subject: While on the subject of W3C submissions
Message-ID: <3.0.32.19980129102253.00a19950@pop.intergate.bc.ca>

Check out

http://www.w3.org/TR/1998/NOTE-HTMLThreading-0105

Now there's a cool XML application, and one that'll touch us all, if y'ask me. -Tim

From dgd at cs.bu.edu Thu Jan 29 18:25:47 1998
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 17:00:01 2004
Subject: Resource Description Format and XML-Data
Message-ID: <199801291818.NAA02883@csb.bu.edu>

I agree with much of what Eliot said, most particularly that XML validation must continue to be the focus of XML. Replacement syntax and semantics for DTDs should be compiled to DTDs, and/or DTDs plus _additional_ validation processes. I also have seen no effective refutation of Eliot and Rick's analysis of how semantic association can be done _within_ XML as it stands.

On the other hand, it does no good to overplay your cards. Eliot's assertion that extended DTD syntaxes will _never_ solve the semantics problems is over-stated in two ways.

First, it's not true if we're talking about solving single, limited problems.
For instance, it's clearly true that solving _some_ semantics problems would be possible by adding to the power of DTDs. For instance, the addition of regexp mapping on content would _increase_ the range of validatable conditions on documents, and might even enable some applications to avoid referring to extended semantics. On the other hand, each such single extension solves only a single problem. Extensibility is still likely to be a requirement.

Even this limitation doesn't mean that we couldn't solve the schema problem once and for all. A Turing complete "semantics testing" language would allow the verification of all computable properties of XML documents. There are some serious disadvantages to this as an approach: Most interesting properties of arbitrary programs are uncomputable. Unlike the kinds of declarative conditions often mentioned for DTD extensions, arbitrary computations can be intractable. Further, it is only rarely that we need the ability to specify validity conditions like "this document is valid if and only if Goldbach's conjecture is true."

I think reactions to proposals for schema languages would be more generally positive if they concentrated on how to supplement DTDs and specify things that are impossible in XML (like content restrictions) or extremely difficult (like modular DTDs) rather than also duplicating DTD functionality. The hard thing is deciding what semantic features will be widely useful enough that falling back to arbitrary computations will be unnecessary, or at least extremely rare.

Making it clear that such languages are supplements to DTD specification techniques also removes the (valid) criticism that they are duplicating facilities already in DTDs, and the political suspicion that they are an attempt to make an end-run around a declaration syntax that is accepted and standardized, if not well beloved of all.

-- David
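As a rough illustration of the "regexp mapping on content" idea in the message above, treated as a supplement that runs after ordinary DTD validation rather than a replacement for it, here is a minimal Python sketch. The element names, the patterns in CONTENT_RULES and the helper check_content are all hypothetical; no existing schema proposal is being reproduced.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content rules: element name -> pattern the text must match.
CONTENT_RULES = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "qty": re.compile(r"[0-9]+"),
}


def check_content(root, rules):
    """Return (tag, text) pairs for elements whose text breaks their rule.

    Structure (which elements may appear where) is assumed to have been
    checked already by a validating parser against the DTD.
    """
    errors = []
    for element in root.iter():
        pattern = rules.get(element.tag)
        text = (element.text or "").strip()
        if pattern is not None and not pattern.fullmatch(text):
            errors.append((element.tag, element.text))
    return errors


order = ET.fromstring("<order><date>1998-01-29</date><qty>three</qty></order>")
errors = check_content(order, CONTENT_RULES)
print(errors)
```

Each rule here is declarative and cheap to evaluate, which is the contrast the message draws with a Turing complete testing language; it also solves exactly one kind of problem, as the message warns.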
From eliot at isogen.com Thu Jan 29 18:57:06 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:01 2004
Subject: Resource Description Format and XML-Data
Message-ID: <3.0.32.19980129124603.00b7dda4@swbell.net>

At 01:18 PM 1/29/98 -0500, David Durand wrote:
> On the other hand, it does no good to overplay your cards. Eliot's
>assertion that extended DTD syntaxes will _never_ solve the semantics
>problems is over-stated in two ways.

Good point. I didn't mean to imply that the problem can't be solved within defined constraints, only that it is, in the general case, unsolvable *by a single specification*. Certainly within defined scopes we can and will do all sorts of very useful things, and the XML Data spec is probably a good example of that.

I mostly don't want people to underestimate the scope and implications of the problems. There are also non-obvious pitfalls, such as the EDD trap. There are also a wide variety of potential candidates as methodologies and off-the-shelf technologies. It will not be possible to sort these out and arrive at any sort of wide-scope consensus in a short period of time. Trying to define a single schema specification language that satisfies all the requirements looks to me like the document analysis project from Hell.

That said, I think people should start hammering out schema mechanisms as fast as they can--the more ideas the better. But I think it would be irresponsible for the W3C or any other standardizing body to embrace any particular proposal too quickly. If, for example, XML Data is a good idea, let it prove it in the marketplace before trying to give it the W3C seal of approval.
That's all I'm asking for. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
From Kenneth.J.Meltsner at jci.com Thu Jan 29 19:07:22 1998
From: Kenneth.J.Meltsner at jci.com (Meltsner, Kenneth J)
Date: Mon Jun 7 17:00:01 2004
Subject: Conditional actions in XSL?
Message-ID: <8625659B.004E28C6.00@Corpnotes.JCI.Com>

My biggest concern with using *any* language with XSL would be the potential for odd things when language functions have side-effects. If a function or subroutine sets the state of an external variable, then it's possible that external operation might occur *every* time an element is redisplayed. I believe the spec (rightly) states that side-effects are not allowed and that XSL's behavior (and the state of the variables involved in the side-effecting function) will be undefined if they are used.

[I suspect this is old info for those familiar with the history of DSSSL -- the choice of a language that supports functional (no side effect) programming, such as Scheme, indicates that the spec developers were quite aware of the potential dangers of more conventional programming styles.]

Reading external variables can be another problem since you have to hand-craft some sort of polling or event handling if you expect the external state might change during a document's use.

It's too bad this is a tough problem with the current XSL (no value dependency information is maintained, I suspect, in any existing or planned implementation), since you can provide neat capabilities if you keep track of the information each layout variable depends upon.
This "constraint-based" approach makes it possible to support selective redisplays (depending on the changes that invalidate previous calculations) and dynamic displays that depend on information external to the formatter.

For more on the difference between the usual approach (Random question: does DSSSL have an event model?) and the more powerful constraint-based approach, you might want to look at Prof. Ethan Munson's Proteus papers. (http://www.cs.uwm.edu/faculty/munson/) He describes an alternative to DSSSL that uses constraints to handle out-of-order layout, dynamic behavior, etc. The biggest drawback is the memory requirement for maintaining the dependency information in large documents.

A less important concern would be the proliferation of scripting languages, which is a real annoyance from a general service developer's point of view.

From donpark at quake.net Thu Jan 29 20:36:19 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:01 2004
Subject: While on the subject of W3C submissions
Message-ID: <009001bd2cf4$ea8a3390$2ee044c6@donpark>

>Check out
>
> http://www.w3.org/TR/1998/NOTE-HTMLThreading-0105
>
>Now there's a cool XML application, and one that'll touch us all,
>if y'ask me. -Tim

I agree. I think we need a similar paper addressing generic data threading issues. I wonder if it is better to work on the problem of using XML to solve data threading or the problem of threading XML documents?

Don Park
http://www.quake.net/~donpark/index.html
From peter at ursus.demon.co.uk Thu Jan 29 21:09:05 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:01 2004
Subject: Resource Description Format and XML-Data
In-Reply-To: <3.0.32.19980129085634.00a9d510@pop.intergate.bc.ca>
Message-ID: <3.0.1.16.19980129194253.08376e64@pop3.demon.co.uk>

At 08:58 29/01/98 -0800, Tim Bray wrote:
[... snip ...]
>Inflammatory statements like this are defensible only when they're
>accurate. This one is not.

I think we'd like to avoid *inflammatory* statements on XML-DEV under any circumstances :-)

As I've mentioned, there are a number of important and complex proposals in public view and I expect that many of the members of the list have strong feelings about them. They all have their proper process for discussion, and there are ways for anyone to get their views forwarded. Implementation and clarification are, of course, always welcome here :-)

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From jlapp at webmethods.com Thu Jan 29 22:01:24 1998
From: jlapp at webmethods.com (Joe Lapp)
Date: Mon Jun 7 17:00:01 2004
Subject: Extending DTDs for Backward Compatibility
Message-ID: <3.0.32.19980129170451.00d6f100@tni.webmethods.com>

I need to create DTDs that are extensible. An application knowing the semantics behind an old DTD may be handed an XML document written with an extended DTD. The old application should be able to function with the extended document by ignoring the extensions. I need advice on the best way to go about creating the initial DTD (and subsequent DTDs, should their designs also be a factor).

There are different ways in which I might want to extend the initial DTD. One way would be to add attributes or change default attribute values. This seems easy to accomplish because XML allows one element to have multiple attribute declarations, where attribute declarations are merged.

However, it is not clear to me how to best accommodate new element types that might be added. Here are the approaches that I have so far envisioned:

(1) Place into the initial DTD an element called EXTENSIONS and give it a contentspec of ANY (is "contentspec" the right term?). Subsequent DTDs would redefine EXTENSIONS to contain their own extensions, but each of these DTDs would define still another EXTENSIONS element within the overridden EXTENSIONS element. For some reason this approach doesn't sit well with me. It seems that I would need to put an EXTENSIONS element inside *every* element of the initial DTD if I think I might ever have to extend that element. Ugly!!!!
(2) In the initial DTD, define an element to have a contentspec of ANY if I think it might ever need to be extended. Now the application that eats documents of this DTD would have to be a bit smarter, because it would have to know that it is valid to ignore elements that it does not recognize. It also moves a lot more of the document validation into the application's lap, because we aren't taking advantage of a DTD's ability to constrain document content. Ugly, again!!!! (3) Design the initial DTD to be exactly the way I want it, and design subsequent DTDs to be exactly the way I want them, except that any document conforming to the second DTD must have a subtree (in the element hierarchy) that conforms to the first DTD. The second DTD need not be an extension of the first DTD in the SGML sense of the word "extension." Instead, the ability to handle extended documents is completely the responsibility of the applications. Applications must throw out subtrees that they do not recognize. But then I have no way to leverage my initial DTD in the design of my subsequent DTD, except to copy the initial DTD and to tweak the copy. When should I be using DTD inheritance? Is it unreasonable to impose this kind of behavior on my applications? Is this ugly too? Well, those are the only options I've come up with so far for allowing old DTDs to be extended by adding new elements. What are your opinions of these approaches? What options have I missed? What other kinds of extensions might I want to plan for, and what is the best way to plan for them? Anyone know where can I find a suitably knowledgable Buddha in the high Himalayas? Thanks el mucho for whatever help you can provide. (BTW, I tried to find the relevant information by searching the xml-dev archive, but I have trouble using simple search engines to find information. Normally I use AltaVista's advanced search form, but AltaVista has not walked xml-dev for eons. 
Is there any way to invite AltaVista for a walk accross our grounds?) -- Joe Lapp, Technology Analyst | jlapp@webmethods.com webMethods, Inc. | Voice: 703-267-1726 http://www.webmethods.com | Fax: 703-352-0370 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dima at paragraph.com Thu Jan 29 22:04:06 1998 From: dima at paragraph.com (Dmitri Kondratiev) Date: Mon Jun 7 17:00:01 2004 Subject: Resource Description Format and XML-Data Message-ID: <2.2.32.19980129220206.006f51f0@dream.paragraph.com> At 19:42 29.01.98, Peter Murray-Rust wrote: [...] > >As I've mentioned, there are a number of important and complex proposals in >public view and I expect that many of the members of the list have strong >feelings about them. They all have their proper process for discussion, and >there are ways for anyone to get their views forwarded. Implementation and >clarification are, of course, always welcome here :-) > Does anybody know the *right* mail lists dedicated specifically to SGML/XML meta object frameworks and RDF/XML-Data in particular ? I couldn't find any public list of this sort on . Thanks, Dima ----------------- Dmitri Kondratiev dima@paragraph.com 102401.2457@compuserve.com http://www.geocities.com/SiliconValley/Lakes/3767/ tel: 07-095-464-9241 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ser at javalab.uoregon.edu Thu Jan 29 23:00:27 1998 From: ser at javalab.uoregon.edu (Sean Russell) Date: Mon Jun 7 17:00:01 2004 Subject: While on the subject of W3C submissions In-Reply-To: <009001bd2cf4$ea8a3390$2ee044c6@donpark> Message-ID: <199801292259.OAA23659@jersey.uoregon.edu> [Don Park, on 29 Jan] > [Tim Bray] > > http://www.w3.org/TR/1998/NOTE-HTMLThreading-0105 > > > >Now there's a cool XML application, and one that'll touch us all, > >if y'ask me. -Tim > > I agree. I think we need a similar paper addressing generic data threading > issues. I wonder if it is better to work on the problem of using XML to > solve data threading or the problem of threading XML documents? I'm of two minds about this. This kind of markup is useful for document sharing and revisions, and I can see an application for it in email. At the risk of being considered a throw-back, I'm concerned by the amount of bulk this kind of markup is going to add to your average email. Plain HTML adds a small amount of bulk, but often doesn't increase the average email size beyond 50%. I can see that HTMLThreading would often more than double the size of emails. -- |.. --------------------- Sean Russell ---------------------- <|> ser@javalab.uoregon.edu <-> http://jersey.uoregon.edu/ser /|\ ------- [ Software Engineer ] -------- /| [ PGP info available from my web site ] -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 239 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980129/0f1b82ec/attachment.bin

From terje at in-progress.com Thu Jan 29 23:54:15 1998
From: terje at in-progress.com (terje@in-progress.com)
Date: Mon Jun 7 17:00:01 2004
Subject: XPublish 1.0 beta (Mac XML Website Publishing)
Message-ID:

A new beta version of XPublish is now available for download from: http://interaction.in-progress.com/xpublish

XPublish is an XML based website publishing system for Macintosh. It allows a website to be efficiently developed and maintained in XML, then published as HTML for access by standard web browsers. The application was featured in last week's issue of MacWeek. The XPublish distribution comes with a tutorial that covers its basic features.

Feedback about XPublish is highly appreciated. Send me an email if you want to join as an official beta tester.

-- Terje | Media Design in*Progress C a s c a d e... a comprehensive Cascading Style Sheets editor for Mac XPublish - for efficient website publishing with XML Make your Web Site a Social Place with Interaction! Check out our web tools at

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Fri Jan 30 00:50:30 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:02 2004 Subject: Solution: Extending DTDs for Backward Compatibility In-Reply-To: <3.0.32.19980129170451.00d6f100@tni.webmethods.com> References: <3.0.32.19980129170451.00d6f100@tni.webmethods.com> Message-ID: <199801300045.TAA00333@unready.microstar.com> Joe Lapp writes: > I need to create DTDs that are extensible. An application knowing the > semantics behind an old DTD may be handed an XML document written with an > extended DTD. The old application should be able to function with the > extended document by ignoring the extensions. I need advice on the best > way to go about creating the initial DTD (and subsequent DTDs, should their > designs also be a factor). This falls solidly into the domain of problems for architectural forms: just use the previous version of the DTD as the base architecture, and derive any new element types from something that's already there. For example, if you have a DTD with the root element "doc", and you are creating an extended version by adding an element type named "foreign", you can derive that from the most similar element type in the old version of the DTD (say, "emphasis"). Here's how you do it: Piece o'cake. I've got tons of examples like this in my forthcoming book Structuring XML Documents, from Prentice-Hall: it should be hitting the bookstores at the end of March. There's even a whole chapter on backwards-compatibility for DTDs. Buy, BUY, BUY! All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. 
dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Fri Jan 30 03:40:37 1998 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:00:02 2004 Subject: Extending DTDs for Backward Compatibility References: <3.0.32.19980129170451.00d6f100@tni.webmethods.com> Message-ID: <34D14B94.3869A09E@allette.com.au> Joe Lapp wrote: > I need to create DTDs that are extensible. An application knowing the > semantics behind an old DTD may be handed an XML document written with an > extended DTD. The old application should be able to function with the extended > document by ignoring the extensions. I need advice on the best way to go about > creating the initial DTD (and subsequent DTDs, should their designs also be a > factor). > > There are different ways in which I might want to extend the initial DTD. One > way would be to add attributes or change default attribute values. This seems > easy to accomplish because XML allows one element to have multiple attribute > declarations, where attribute declarations are merged. However, it is not clear > to me how to best accomodate new element types that might be added. When we designed the CALS initiative in Australia, we created a core DTD and about 27 satellite DTDs. The core was riddled with parameter entities - in fact everything that you may want to remap used an entity. Variant DTDs were nothing more than parameter entities mapping out the differences between the core structure and the desired structure, then a parameter entity calling in the core file. 
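A minimal sketch of the core-plus-variant arrangement described here, assuming a hypothetical core DTD with one remappable parameter entity (`%extras;`) and a hypothetical variant that remaps it before calling in the core file. It relies on the XML rule that the first declaration of an entity is the one that counts. The DTD text, the element names `doc`, `emphasis` and `foreign`, and the file names `core.dtd` and `variant.dtd` are all invented for illustration; the parsing uses the JAXP API that ships with current JDKs, not any tool from 1998.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class VariantDtdDemo {
    // Hypothetical core DTD: %extras; is the hook that a variant may remap.
    static final String CORE_DTD =
        "<!ENTITY % extras \"\">\n"
      + "<!ELEMENT doc (#PCDATA | emphasis %extras;)*>\n"
      + "<!ELEMENT emphasis (#PCDATA)>\n";

    // Hypothetical variant DTD: declare the remapped entity first, then call
    // in the core. The core's own declaration of %extras; comes second and is
    // therefore ignored.
    static final String VARIANT_DTD =
        "<!ENTITY % extras \"| foreign\">\n"
      + "<!ELEMENT foreign (#PCDATA)>\n"
      + "<!ENTITY % core SYSTEM \"core.dtd\">\n"
      + "%core;\n";

    /** Validates the given document body against the named in-memory DTD. */
    static boolean isValid(String dtdName, String body) {
        String xml = "<!DOCTYPE doc SYSTEM \"" + dtdName + "\">\n" + body;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Serve both DTDs from strings so nothing is read from disk.
            builder.setEntityResolver((publicId, systemId) -> {
                String text = systemId.endsWith("variant.dtd") ? VARIANT_DTD : CORE_DTD;
                InputSource src = new InputSource(new StringReader(text));
                src.setSystemId(systemId);
                return src;
            });
            // Turn validity errors into exceptions; the default only reports them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String plain = "<doc>some <emphasis>text</emphasis></doc>";
        String extended = "<doc>some <foreign>other</foreign> text</doc>";
        System.out.println("plain vs core:       " + isValid("core.dtd", plain));
        System.out.println("extended vs variant: " + isValid("variant.dtd", extended));
        System.out.println("extended vs core:    " + isValid("core.dtd", extended));
    }
}
```

Under these assumptions the first two checks should accept and the third should reject, since `foreign` only becomes part of the content model once the variant's declaration of `%extras;` is seen ahead of the core's.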
Because the first parameter entity is regarded and any subsequent identically named entities are ignored, the remapped value stuck. For distribution, we wrote a simple program to expand out most of the entities, making the DTD readable as well as producing an extensively linked electronic format. This approach worked well - changes could be considered as local or global. If they were global, change the core, otherwise change the satellite. It ensured that elements meant fundamentally the same thing, but allowed the structural flexibility that disparate branches of the military require. -- Regards Marcus Carr email: mrc@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774 _______________________________________________________________ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Fri Jan 30 06:52:09 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:00:02 2004 Subject: Resource Description Format and XML-Data Message-ID: <199801300700.SAA01449@jawa.chilli.net.au> > From: Tim Bray > At 11:31 PM 29/01/98 +1100, Rick Jelliffe wrote: > >In another forum the RDF people agreed that they *could* use the standard > >DTD syntax to markup the information they wanted. > > This statement is incorrect. 
For my evidence I refer to Ramanathan Guha (Netscape rep on RDF WG) http://lists.w3.org/Archives/Member/w3c-rdf-syntax-wg/1997Dec/0040.html and Ralph Swick (W3C rep on RDF WG) http://lists.w3.org/Archives/Member/w3c-rdf-syntax-wg/1997Dec/0040.html

These two are the editors of the RDF parts. Their comments are unequivocal. I am not allowed to quote the drafts in public, under the archive rules from W3C. However, I can quote what I wrote which prompted their acknowledgement: http://lists.w3.org/Archives/Member/w3c-rdf-syntax-wg/1997Dec/0039.html

# As far as I could find so far, the only difference from
# draft RDF that is required to allow me to use a standard
# XML markup declaration is this: to shuffle the element type
# name of property elements into an attribute (here "RDF:Name"),
# and to use "RDF:Property" as the element type name instead.

> Inflammatory statements like this are defensible only when they're
> accurate. This one is not.

See above. This is the third time we have had this discussion now, Tim. Let me say it again. RDF can use standard XML declarations if they want to. They don't want to. That is their choice, and I wish them well. People should not get the idea that RDF's choice is evidence supporting the necessity of things like XML-data (to the extent that this may occur).

> It is just as easy to conclude from the evidence that XML-Data is in
> direct competition with RDF and RDF-Schema. I honestly don't know what
> the understanding of the authors of XML-Data is as regards things like
> RDF, and so far, nobody has said. -T.

Yes, this is a good point. But XML-data goes a lot further than RDF, both for better and worse, IMHO. Please don't take me to mean that I don't support XML-data and RDF 100% for what they say they are trying to do. But I think the implementation trade-offs in them are skewed, showing a bias against SGML declarations which I think is wrong-headed. In fact, the use of a regular expression syntax in content models is the single best abstraction in SGML: to replace it with a million elements somewhere in an instance, as XML-data currently does, seems to me to be getting rid of the only thing that programmers have an education to be comfortable with immediately! An idea verging on the bizarre.

XML-DEV is a forum where developers get their understanding of what the markup community feels are the strengths and weaknesses of XML, and also gauge which way the wind is blowing for technical strategy. This is why I have posted here; developers should be aware that many people have opinions which are repeatedly given but for which there is little evidence when examined. I am sure I am often in this category.

Personally, I believe that intuitions are highly credible when given by a person of great expertise, especially for these complex matters where it may be that the issues can never be articulated convincingly. Tim is such a person of great expertise. The reason why RDF cannot use standard DTDs has not been articulated convincingly. But I think we should take his intuition in this very seriously, and discuss it with the seriousness such broad and confusing charges against the standard declarations warrant.

If I am being inflammatory, I apologize to all to the extent I may have unintentionally seemed rude. But, since I was not wrong, I cannot apologize for that, much as I'd like to.

Rick Jelliffe

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From trevort at za.ibm.com Fri Jan 30 07:50:46 1998 From: trevort at za.ibm.com (Trevor Turton) Date: Mon Jun 7 17:00:02 2004 Subject: Imbedded elements Message-ID: <34D1E921.6B73@za.ibm.com> Is it legal (and surely it must be) to imbed one element within another? To take a common example from current HTML practice, to imbed a element within a element so that the graphical layout of the tags can be controlled? And if so, how does XML resolve a given tag if it occurs within element A which is imbedded within element B, and the particular tag happens to be defined and valid in both element A and element B? Unlikely today when all HTML elements and tags are controlled by a single standards body (well, almost). Inevitable in the future as independent organisations develop and distribute their own DTDs. Trevor Turton xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From carlo.garcia at mail.titn.alcatel.fr Fri Jan 30 10:37:39 1998 From: carlo.garcia at mail.titn.alcatel.fr (GARCIA) Date: Mon Jun 7 17:00:02 2004 Subject: unsubscribe Message-ID: <34D1AD4D.2011@titn.alcatel.fr> unsubscribe xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at light.demon.co.uk Fri Jan 30 11:36:47 1998 From: richard at light.demon.co.uk (Richard Light) Date: Mon Jun 7 17:00:02 2004 Subject: Imbedded elements In-Reply-To: <34D1E921.6B73@za.ibm.com> Message-ID: In message <34D1E921.6B73@za.ibm.com>, Trevor Turton writes >Is it legal (and surely it must be) to imbed one element within >another? To take a common example from current HTML practice, to imbed >a
element within a element so that the graphical layout >of the tags can be controlled? > >And if so, how does XML resolve a given tag if it occurs within element >A which is imbedded within element B, and the particular tag happens to >be defined and valid in both element A and element B? Unlikely today >when all HTML elements and tags are controlled by a single standards >body (well, almost). Inevitable in the future as independent >organisations develop and distribute their own DTDs. It's no problem. Because XML is well-formed, and doesn't allow any tags to be omitted, the context of every element is crystal clear. If an XML processor has encountered followed by , then it knows it is inside . If an then comes along, the is 'inside' , which in turn, of course, is 'inside' . The XML processor must subsequently get an , a and an in that order to 'back out' of the hierarchy it has created. This is only an issue in HTML because (a) HTML adopted some fairly sophisticated tag omission rules from SGML; and (b) every HTML parser proceeded to ignore this and treat tags as point markers rather than nodes in a well-defined tree structure. Richard Light. Richard Light SGML/XML and Museum Information Consultancy richard@light.demon.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From fasthand at bigfoot.com Fri Jan 30 14:33:13 1998 From: fasthand at bigfoot.com (fasthand@bigfoot.com) Date: Mon Jun 7 17:00:02 2004 Subject: Announcement: DTD generator/editor (free) In-Reply-To: <199801291040.KAA04831@mail.iol.ie> Message-ID: Hi all, I like to share ezDTD with you. ezDTD is a tool for editing DTD for WIN 95 and NT. 
I use ezDTD to help me do things like:

1. Jump from one element to another very quickly.
2. Complete the typing by filling in things like ANY, EMPTY, #IMPLIED, etc.
3. Finally, besides generating the DTD as a text file, it can also generate the DTD file in HTML format with internal links between elements, converting "<" to "&lt;", etc.

You can find it at http://www.geocities.com/SiliconValley/Haven/2638/ezDTD.htm

Best Regards, Duncan
___________________________________
Duncan Chen fasthand@bigfoot.com FNC, Inc.

From M.H.Kay at eng.icl.co.uk Fri Jan 30 14:34:53 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:00:02 2004
Subject: Extending DTDs for Backward Compatibility
Message-ID: <01bd2d8c$4eaeecc0$1e09e391@mhklaptop.bra01.icl.co.uk>

-----Original Message-----
From: Joe Lapp
To: xml-dev@ic.ac.uk
Date: 29 January 1998 22:07
Subject: Extending DTDs for Backward Compatibility

>I need to create DTDs that are extensible...

I've got the same problem: I essentially want a "standard" DTD for an application domain that tolerates private extensions added by individual implementors. I had thought of doing it with something like XXX; this has the advantage that it flags it up as a private extension, but it's certainly ugly in other ways. Fortunately I'm not using attributes extensively; it's only private elements I'm really concerned with.

Mike Kay

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ricko at allette.com.au Fri Jan 30 14:58:34 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:02 2004
Subject: Extending DTDs for Backward Compatibility
Message-ID: <199801301507.CAA10391@jawa.chilli.net.au>

> From: Joe Lapp
> >I need to create DTDs that are extensible...

The base DTD: ... In the document: ]> some text some other text

Parameter entities in the "internal subset" of the Doctype declaration are parsed before the "external subset" (i.e. x.dtd here). The first declaration of an entity has precedence.

So you can derive a new DTD by merely declaring the new element types you need in the internal subset, and then defining the appropriate parameter entity.

Rick Jelliffe

From M.H.Kay at eng.icl.co.uk Fri Jan 30 15:04:50 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:00:02 2004
Subject: First experiences with XSL
Message-ID: <01bd2d90$7dc5d6a0$1e09e391@mhklaptop.bra01.icl.co.uk>

I've downloaded MSXSL and used it to generate HTML for a couple of document types, successfully but with a certain amount of frustration caused by (a) lack of diagnostics when I got things wrong, and (b) limited functionality.
I've now implemented the same thing without XSL: I wrote an MSXML application in Java that does a recursive walk down the document tree and calls a registered "handler" class to process each element type. I added a number of helper methods such as isFirstOfType() to allow the handlers to get information about their context more easily.

Here is an example of one of the handlers (for the XML element tagged SPEECH):

    class SPEECHHandler extends HTMLNodeWriter {
        public void handleElement(ElemNode e) {
            if (e.isFirstOfType())
                System.out.println("[...]");
            e.walkChildren();
            System.out.println("[...]");
            if (e.isLastOfType())
                System.out.println("[...]");
        }
    }

I have to report:
- the element handlers looked very similar to the XSL rules
- the number of DTD-specific lines of code was identical (106 in each case!)
- it was far easier to debug
- you can do very many things that you can't do in XSL, like sorting the children of a node according to some attribute value, or getting information about user preferences from an external database.

I have yet to spot any disadvantages. I haven't looked at performance or footprint, but I can't see any intrinsic reason why XSL should be smaller or faster. (Currently some of the methods like isLastOfType() are very inefficient due to the limited navigation capabilities in MSXML. I could speed it up if I built my own tree!).

Any XSL enthusiasts want to prove me wrong?

Regards, Mike Kay

From ak117 at freenet.carleton.ca Fri Jan 30 15:17:30 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:02 2004
Subject: Extending DTDs for Backward Compatibility
In-Reply-To: <199801301507.CAA10391@jawa.chilli.net.au>
References: <199801301507.CAA10391@jawa.chilli.net.au>
Message-ID: <199801301512.KAA00826@unready.microstar.com>

Rick Jelliffe writes:
> The first declaration of an entity has precedence.
>
> So you can derive a new DTD by merely declaring the new
> element types you need in the internal subset, and then
> defining the appropriate parameter entity.

Absolutely right, but this doesn't solve his other problem -- that processing software written for the base version of the DTD has to be able to deal with documents written for the extended version.
All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Fri Jan 30 15:22:43 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL In-Reply-To: <01bd2d90$7dc5d6a0$1e09e391@mhklaptop.bra01.icl.co.uk> References: <01bd2d90$7dc5d6a0$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <199801301518.KAA00845@unready.microstar.com> Michael Kay writes: > I have yet to spot any disadvantages. I haven't looked at > performance or footprint, but I can't see any intrinsic reason why > XSL should be smaller or faster. (Currently some of the methods > like isLastOfType() are very inefficient due to the limited > navigation capabilities in MSXML. I could speed it up if I built my > own tree!). If you're working in Java, instead of building your own tree you might want to take a look at Don Park's SAXDOM: http://www.quake.net/~donpark/saxdom.html Neither the DOM nor SAX is finalised yet (though SAX is closer), and I haven't had a chance to test Don's package yet, but there are still a couple advantages to avoiding a proprietary tree structure: 1) You will have to make fewer changes later on. 2) Since Don's package uses SAX, your code will work with any supported parser rather than just Microsoft's (currently, there are SAX drivers for NXP, Lark, AElfred, XP, _and_ MSXML); eventually, it will work for any parser with DOM support as well. 
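A minimal sketch of the handler-per-element-type walk discussed in this thread, written against the standard `org.w3c.dom` interfaces rather than a parser-specific tree. It uses the DOM and JAXP APIs as they ship with current JDKs, not SAXDOM or the 1998 drafts. The class `HandlerWalk`, the `ElementHandler` interface, the `ul`/`li` output, and the reading of isFirstOfType() as "no earlier sibling element with the same name" are all assumptions made for illustration; none of them come from the original MSXML code.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class HandlerWalk {
    /** Hypothetical per-element-type handler. */
    interface ElementHandler {
        void handle(Element el, HandlerWalk walker, StringBuilder sb);
    }

    private final Map<String, ElementHandler> handlers = new HashMap<>();

    void register(String tagName, ElementHandler handler) {
        handlers.put(tagName, handler);
    }

    /** Dispatches on the element name; elements without a handler are walked through. */
    void walk(Element el, StringBuilder sb) {
        ElementHandler handler = handlers.get(el.getTagName());
        if (handler != null) {
            handler.handle(el, this, sb);
        } else {
            walkChildren(el, sb);
        }
    }

    /** Recurses into child elements and copies text nodes (unescaped) to the output. */
    void walkChildren(Element el, StringBuilder sb) {
        for (Node n = el.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) n, sb);
            } else if (n.getNodeType() == Node.TEXT_NODE) {
                sb.append(n.getNodeValue());
            }
        }
    }

    /** Assumed meaning: no earlier sibling element has the same tag name. */
    static boolean isFirstOfType(Element el) {
        for (Node n = el.getPreviousSibling(); n != null; n = n.getPreviousSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals(el.getTagName())) {
                return false;
            }
        }
        return true;
    }

    /** Assumed meaning: no later sibling element has the same tag name. */
    static boolean isLastOfType(Element el) {
        for (Node n = el.getNextSibling(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals(el.getTagName())) {
                return false;
            }
        }
        return true;
    }

    /** Parses the XML text and renders it with one illustrative SPEECH handler. */
    static String render(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            HandlerWalk walker = new HandlerWalk();
            walker.register("SPEECH", (el, w, sb) -> {
                if (isFirstOfType(el)) sb.append("<ul>");
                sb.append("<li>");
                w.walkChildren(el, sb);
                sb.append("</li>");
                if (isLastOfType(el)) sb.append("</ul>");
            });
            StringBuilder out = new StringBuilder();
            walker.walk(doc.getDocumentElement(), out);
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("<SCENE><SPEECH>a</SPEECH><SPEECH>b</SPEECH></SCENE>"));
    }
}
```

Because the walk only touches `Node` and `Element`, the same handlers should run over a tree built by any parser that exposes the standard DOM, which is the portability argument made above.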
All the best, David

-- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/

From sca at eps.inso.com Fri Jan 30 15:52:22 1998
From: sca at eps.inso.com (Sharon Adler)
Date: Mon Jun 7 17:00:02 2004
Subject: First experiences with XSL
Message-ID: <2.2.32.19980130155416.0085e27c@pop>

Michael,

As I write this, the XSL WG is 2/3 through its first official meeting. The Microsoft code does not represent the "Final" XSL but the strawman of some of the facilities of XSL. The lack of diagnostics/limited functionality of a partial prototype implementation is not any indication of the functionality or capability of a style language, nor any final implementation.

Of course you can accomplish what you wanted in Java. Any hacker can do anything they want in code, but what about the rest of the world's humans? Please don't use the XSL prototype if it is not suitable for you to play around with, but give us a chance to create a workable standard.

Thank you.

Sharon Adler
Co-chair, XSL WG

At 03:05 PM 1/30/98 -0000, Michael Kay wrote:
>I've downloaded MSXSL and used it to generate HTML for a couple of document
>types, successfully but with a certain amount of frustration caused by (a)
>lack of diagnostics when I got things wrong, and (b) limited functionality.
>
>I've now implemented the same thing without XSL: I wrote an MSXML
>application in Java that does a recursive walk down the document tree and
>calls a registered "handler" class to process each element type. I added a
>number of helper methods such as isFirstOfType() to allow the handlers to
>get information about their context more easily.
>
>Here is an example of one of the handlers (for the XML element tagged
>SPEECH):
>
>class SPEECHHandler extends HTMLNodeWriter {
>    public void handleElement(ElemNode e) {
>        if (e.isFirstOfType())
>            System.out.println("[...]");
>        e.walkChildren();
>        System.out.println("[...]");
>        if (e.isLastOfType())
>            System.out.println("[...]");
>    }
>}
>
>I have to report:
>- the element handlers looked very similar to the XSL rules
>- the number of DTD-specific lines of code was identical (106 in each case!)
>- it was far easier to debug
>- you can do very many things that you can't do in XSL, like sorting the
>children of a node according to some attribute value, or getting information
>about user preferences from an external database.
>
>I have yet to spot any disadvantages. I haven't looked at performance or
>footprint, but I can't see any intrinsic reason why XSL should be smaller or
>faster. (Currently some of the methods like isLastOfType() are very
>inefficient due to the limited navigation capabilities in MSXML. I could
>speed it up if I built my own tree!).
>
>Any XSL enthusiasts want to prove me wrong?
>
>Regards, Mike Kay
>

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bwb at concentra.com Fri Jan 30 16:03:02 1998 From: bwb at concentra.com (Brent Benson) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL Message-ID: <01bd2d98$3eecb0e0$385110ac@lunenburg.concentra.com> >I've downloaded MSXSL and used it to generate HTML for a couple of document >types, successfully but with a certain amount of frustration caused by (a) >lack of diagnostics when I got things wrong, and (b) limited functionality. > >I've now implemented the same thing without XSL: I wrote an MSXML >application in Java that does a recursive walk down the document tree and >calls a registered "handler" class to process each element type [...] >I have to report: >- the element handlers looked very similar to the XSL rules >- the number of DTD-specific lines of code was identical (106 in each case!) >- it was far easier to debug >- you can do very many things that you can't do in XSL, like sorting the >children of a node according to some attribute value, or getting information >about user preferences from an external database. > >I have yet to spot any disadvantages [...] Of course, writing a program to solve the problem is more flexible than the declarative approach of writing XSL rules. This additional flexibility, though, comes at the cost of making it less accessible to non-programmers and not fitting as well into a generalized framework of declarative document description languages. Many of your complaints about msxsl seem to be with the tool, rather than the language itself. There is no reason why such a tool should not be able to give good diagnostics. 
I'm a real fan of domain-specific, declarative languages like XSL, but I haven't done enough XSL rule writing to see if the designers got it right. -Brent xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From istvanc at microsoft.com Fri Jan 30 16:45:07 1998 From: istvanc at microsoft.com (Istvan Cseri) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL Message-ID: <5BF896CAFE8DD111812400805F1991F7222012@red-msg-08.dns.microsoft.com> Of course it is easy for any programmer to write an XML parser, implement an XML object model, implement the XSL algorithm, print out HTML, etc. This is actually great: it shows that the design of these languages and algorithms is simple enough! The goal is of course to create something reusable, instead of everybody writing their own favorite and incompatible processors. Another goal is to let non-programmers take advantage of a style sheet language. Another goal is to keep the language as declarative as possible, so that if you want to efficiently re-generate PART of the output because the source XML document or XSL style sheet changed, you can do this. No argument here that the current XSL processor is limited. We are working on it (as, I assume, are a couple of other companies). Does this answer your question? Istvan > -----Original Message----- > From: Sharon Adler [SMTP:sca@eps.inso.com] > Sent: Friday, January 30, 1998 7:54 AM > To: Michael Kay; xml-dev@ic.ac.uk > Subject: Re: First experiences with XSL > > Michael, > > As I write this, the XSL WG is 2/3 through its first official meeting.
> The > Microsoft code does not represent the "Final" XSL but the srawman of some > of > the facilities of XSL. The lack of diagnostics/limited functionality of a > partial prototype implementation is not any indication of the > functionality > or capability of a style language, nor any final implementation. Of course > you can accomplish what you wanted in Java. Any hacker can do anything > they > want in code, but what about the rest of the world's humans. > > Please don't use the XSL prototype if it is not suitable for you to play > around with, but give us a chance to create a workable standard. > > Thank you. > > Sharon Adler > Co-chair, XSL WG > > > > > At 03:05 PM 1/30/98 -0000, Michael Kay wrote: > >I've downloaded MSXSL and used it to generate HTML for a couple of > document > >types, successfully but with a certain amount of frustration caused by > (a) > >lack of diagnostics when I got things wrong, and (b) limited > functionality. > > > >I've now implemented the same thing without XSL: I wrote an MSXML > >application in Java that does a recursive walk down the document tree and > >calls a registered "handler" class to process each element type. I added > a > >number of helper methods such as isFirstOfType() to allow the handlers to > >get information about their context more easily. > > > >Here is an example of one of the handlers (for the XML element tagged > >SPEECH): > > > >class SPEECHHandler extends HTMLNodeWriter { > > public void handleElement(ElemNode e) { > > if (e.isFirstOfType()) > > System.out.println("
"); > > e.walkChildren(); > > System.out.println("
"); > > if (e.isLastOfType()) > > System.out.println("
"); > > } > >} > > > >I have to report: > >- the element handlers looked very similar to the XSL rules > >- the number of DTD-specific lines of code was identical (106 in each > case!) > >- it was far easier to debug > >- you can do very many things that you can't do in XSL, like sorting the > >children of a node according to some attribute value, or getting > information > >about user preferences from an external database. > > > >I have yet to spot any disadvantages. I haven't looked at performance or > >footprint, but I can't see any intrinsic reason why XSL should be smaller > or > >faster. (Currently some of the methods like isLastOfType() are very > >inefficient due to the limited navigation capabilities in MSXML. I could > >speed it up if I built my own tree!). > > > >Any XSL enthusiasts want to prove me wrong? > > > >Regards, Mike Kay > > > > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > >(un)subscribe xml-dev > >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > >subscribe xml-dev-digest > >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Fri Jan 30 17:24:56 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL Message-ID: <01bd2da4$0e1fe660$1e09e391@mhklaptop.bra01.icl.co.uk> >The goal is of course to create something reusable, instead of everybody >writing their own favorite and incompatible processors. An other goal is to >let non-programmers take advantage of a style sheet language. An other goal >is to keep the language as declarative as possible so if you want to >effeciently re-generate PART of the output again because the source XML >document or XSL style sheet changed you can do this. > >No argument here that the current XSL processor is limited. We are working >on it (as I assume a couple of other companies). Does this answer your >question ? > My posting wasn't intended to be negative, merely to report early experience to provide feedback, which is surely necessary if these goals are to be accomplished. My expectation was that XSL would be rather like a report writer: much quicker than programming to achieve simple tasks, but limited in capability. Report writers have always had this "brick wall" problem: when a simple report gets more and more complex, you have to switch technology and start again. But my experience, which I wanted to report, was that I could define a general and reusable framework in which programming simple reports in Java was just as easy as programming them in XSL, without the brick wall problem. If XSL can be turned into a tool for non-programmers, then it will certainly serve a useful purpose. 
I don't think it's there yet, but I wish the endeavour well. regards, Mike Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From davidsch at microsoft.com Fri Jan 30 17:26:48 1998 From: davidsch at microsoft.com (David Schach) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL Message-ID: <5CEA8663F24DD111A96100805FFE658703736940@red-msg-51.dns.microsoft.com> > I've downloaded MSXSL and used it to generate HTML for a couple of > document > types, successfully but with a certain amount of frustration caused by (a) > lack of diagnostics when I got things wrong, > Agreed. The diagnostics need to be improved. Remember this is a prototype. The best way to debug right now is to use the command line tool, msxsl.exe, and use the println function. > and (b) limited functionality. > Agreed. The functionality needs to be improved. msxsl doesn't implement the full XSL specification (which is far from being complete or final). > I've now implemented the same thing without XSL: I wrote an MSXML > application in Java that does a recursive walk down the document tree and > calls a registered "handler" class to process each element type. I added a > number of helper methods such as isFirstOfType() to allow the handlers to > get information about their context more easily. > Agreed. You've essentially reinvented and reimplemented an alternative XSL. The purpose of XSL is to be a declarative XML based syntax that is easy to use. Writing JAVA code doesn't meet these requirements. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From istvanc at microsoft.com Fri Jan 30 17:57:55 1998 From: istvanc at microsoft.com (Istvan Cseri) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL Message-ID: <5BF896CAFE8DD111812400805F1991F7222015@red-msg-08.dns.microsoft.com> Thanks for the feedback, I hope we don't hit this brick wall you mention with XSL. First of all it is an extendable language so we can add more functionality to the language itself easily. Second, we can extend the set of flow objects to provide more functionality. Third, we have the escape to JavaScript where you can program advanced functionality yourself. JavaScript is of course not as fast as Java so this is a limited option. Istvan > -----Original Message----- > From: Michael Kay [SMTP:M.H.Kay@eng.icl.co.uk] > Sent: Friday, January 30, 1998 9:25 AM > To: xml-dev@ic.ac.uk > Subject: Re: First experiences with XSL > > > > >The goal is of course to create something reusable, instead of everybody > >writing their own favorite and incompatible processors. An other goal is > to > >let non-programmers take advantage of a style sheet language. An other > goal > >is to keep the language as declarative as possible so if you want to > >effeciently re-generate PART of the output again because the source XML > >document or XSL style sheet changed you can do this. > > > >No argument here that the current XSL processor is limited. We are > working > >on it (as I assume a couple of other companies). Does this answer your > >question ? 
> > > My posting wasn't intended to be negative, merely to report early > experience > to provide feedback, which is surely necessary if these goals are to be > accomplished. > > My expectation was that XSL would be rather like a report writer: much > quicker > than programming to achieve simple tasks, but limited in capability. > Report > writers have always had this "brick wall" problem: when a simple report > gets > more and more complex, you have to switch technology and start again. But > my experience, which I wanted to report, was that I could define a general > and reusable framework in which programming simple reports in Java was > just as easy as programming them in XSL, without the brick wall problem. > > If XSL can be turned into a tool for non-programmers, then it will > certainly > serve a useful purpose. I don't think it's there yet, but I wish the > endeavour > well. > > regards, Mike Kay > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Fri Jan 30 18:21:30 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL Message-ID: <199801301821.SAA24785@mail.iol.ie> [Michael Kay] >I've now implemented the same thing without XSL: I wrote an MSXML >application in Java that does a recursive walk down the document tree and >calls a registered "handler" class to process each element type. Over the years oodles of languages have been used/invented to munge SGML/XML in this fashion. Off the top of my head: Perl Python C C++ Balise Omnimark Adept Softquad Sculptor Metamorphosis tcl Scheme And now, of course, Java. Because they are all fully fledged programming languages you can do essentially *anything* with them. You could, for example, adjust the point size of your HTML headings based on the Netscape share price pulled live from www.qoute.com divided by the average seek time of your hard disk:-) However, a fully blown programming language is overkill for a lot of rendering applications. You can do a lot with FOSI. You can do a lot with the Panorama Stylesheet language. You can do a lot with XSL-Strawman. The people designing it know full well that there are limits to any declarative syntax (that is why DSSSL has Scheme built in) As Larry Wall put it - there is a niche for technologies that make it easy to go from 0 to 60 but some people need to go from 60 to 100. Sean Mc Grath sean at digitome dot com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at arbortext.com Fri Jan 30 18:56:13 1998 From: paul at arbortext.com (Paul Grosso) Date: Mon Jun 7 17:00:02 2004 Subject: ArborText Announces XML Styler Message-ID: <3.0.32.19980130095947.006c8c20@pophost.arbortext.com> ArborText Announces XML Styler Leverages Extensible Style Language (XSL) Support in Microsoft Internet Explorer 4.0 PALM SPRINGS, Calif. (Jan. 27, 1998) - ArborText, the world's leading provider of content creation and management software for Enterprise XML applications, today announced the availability of XML Styler, a stylesheet editor for the Extensible Markup Language (XML). Concurrent with the announcement, ArborText and Microsoft Corporation will be offering demonstrations of Extensible Style Language (XSL) support in Internet Explorer 4.0 at the Web Tech Ed Conference in Palm Springs. Additional demonstrations will be provided by ArborText at Internet Showcase in San Diego immediately following Web Tech Ed. XSL is the style specification language being developed in conjunction with the XML initiative. In September, a proposal for the XSL specification was submitted to the World Wide Web Consortium (W3C) by ArborText, Inso and Microsoft. ArborText's XML Styler is a tool for creating and modifying XSL stylesheets that offers a graphical user interface designed to enable Web content providers to work with XSL stylesheets without requiring understanding of the many syntactic and structural details of XSL. 
Extensible Style Language allows for the seamless Web presentation of documents based on XML and Hypertext Markup Language (HTML), telling a Web browser how to present media-independent information through a separation of form and content. "XML Styler is a major breakthrough for XML to deliver on its vast application potential," said Jim Sterken, president, CEO and founder of ArborText. "There are literally millions of existing documents that can be translated on-the-fly into XML, and our demonstrations with Internet Explorer 4.0 show that browser support for XML stylesheets is not just 'proof-of-concept' - it is real." XML Styler streamlines the process of developing and altering these stylesheets. To achieve maximum portability, XML Styler is based on Java code and runs on Windows 95 and Windows NT. XML Styler is immediately available for free download from the ArborText Web site . "XML offers content publishers and application developers the potential to deliver data in new and exciting ways," comments Mary Laplante, director in the Document Software Strategies Group at CAP Ventures Inc., the strategic consulting and research firm that tracks document technologies markets. "The application floodgates will open when the browser vendors deliver robust support for XML later this year, generating demand for XML tools. With products like XML Styler, ArborText is ready to meet that demand. The company is clearly staking a claim on a leadership position in the market for XML-aware software." "We are proud of our efforts along with ArborText and Microsoft in developing the XSL specification," said Sharon Adler, senior product manager, Inso Corporation, and a catalyst of the XSL initiative. "A standard stylesheet language is crucial to enable sophisticated XML documents to be effectively processed and displayed by XML-compliant browsers such as IE 4. 
In addition, products that make it easy to work with XML and XSL, such as XML Styler and Inso's DynaTag, are important new developments that will ensure widespread adoption of these important pending standards." Tod Nielsen, general manager of developer relations at Microsoft, said, "XML and XSL will be significant Internet technologies for 1998. We are excited that our work with ArborText has yielded demonstrable support for XSL in Internet Explorer 4.0." About the ADEPT Software Series ArborText's ADEPT family of adaptable standards-based software allows users to create and maintain textual and graphic information as reusable elements independent of formatting, media, and computer software or hardware. Reusable document elements make document preparation and publication more efficient and effective in a wide variety of applications ranging from technical publications to web site management. The ADEPT family's authoring, editing and publishing software is tightly integrated with third party document management software to enable high performance, enterprise-wide knowledge processing solutions. About ArborText Based in Ann Arbor, Michigan, ArborText develops and supports software that makes the process of capturing and delivering knowledge more effective. Global 5,000 organizations use the company's products to author, catalog, and reuse information stored in document databases. In production use since 1991, ArborText software is the keystone of high-volume document assembly systems at companies such as Boeing, Digital Equipment Corporation, Ford, Grolier's Encyclopedia, Lockheed Martin, National Semiconductor, and Sun Microsystems. For more information about ArborText's products, consulting services and training programs, contact ArborText at +1 313.997.0200, send e-mail to info@arbortext.com, or visit the ArborText Web site located at http://www.arbortext.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ser at javalab.uoregon.edu Fri Jan 30 21:38:31 1998 From: ser at javalab.uoregon.edu (Sean Russell) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL References: <01bd2d90$7dc5d6a0$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <34D2471F.75AAC02D@javalab.uoregon.edu> Michael Kay wrote: > I've now implemented the same thing without XSL: I wrote an MSXML > application in Java that does a recursive walk down the document tree and > calls a registered "handler" class to process each element type. I added a > number of helper methods such as isFirstOfType() to allow the handlers to > get information about their context more easily. > > Here is an example of one of the handlers (for the XML element tagged > SPEECH): I'm surprised that in a "developer" mailing list, no one has pointed out the obvious: authoring a translator for every DTD is a one-to-one solution, with a complexity of O(n*m). This is fine if all you ever want to do is output to HTML, and the only people who will want to change your layout are programmers (rather than layout-designers). What happens when you want to output to LaTeX, or Postscript, or troff? For every DTD you have to write a new translator. What happens when the company graphic artists want to change the layout of the web site? They have to involve the software engineering department. XSL (as pertains to the flow-layout characteristics thereof, and style sheet solutions in general) are general solutions. For every output format, you must write only one translator. For every DTD, you must write only one style sheet. 
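To make the counting concrete, here is a small sketch in Java. The class and method names are hypothetical, and the figures (10 DTDs, 4 output formats) are made up for illustration; it only counts artifacts, on the assumption that one style sheet costs roughly as much to write as one translator.

```java
public class TranslatorCount {
    // One hand-written translator per (DTD, output format) pair.
    public static int perPairTranslators(int dtds, int formats) {
        return dtds * formats;
    }

    // One style sheet per DTD plus one back-end translator per output format.
    public static int styleSheetArtifacts(int dtds, int formats) {
        return dtds + formats;
    }

    public static void main(String[] args) {
        int dtds = 10;    // hypothetical number of document types
        int formats = 4;  // hypothetical: HTML, LaTeX, Postscript, troff
        System.out.println("per-pair translators: " + perPairTranslators(dtds, formats));
        System.out.println("style sheets + back ends: " + styleSheetArtifacts(dtds, formats));
    }
}
```

With these made-up figures the first approach needs 40 translators and the second needs 14 pieces in total, and the gap widens as either number grows.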
The complexity, therefore, is O(n+m), and you have the additional advantage that your graphic designers can lay out pages, rather than your programmers. -- |.. --------------------- Sean Russell ---------------------- <|> ser@javalab.uoregon.edu <-> http://jersey.uoregon.edu/ser /|\ ------- [ Software Engineer ] -------- /| [ PGP info available from my web site ] xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Fri Jan 30 22:21:42 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL Message-ID: <000401bd2dcc$ce70f710$2ee044c6@donpark> >XSL (as pertains to the flow-layout characteristics thereof, and style sheet >solutions in general) are general solutions. For every output format, you must >write only one translator. For every DTD, you must write only one style sheet. >The complexity, therefore, is O(n+m), and you have the additional advantage that >your graphic designers can layout pages, rather than your programmers. XSL can be broken up into reusable pieces, style patterns with parameters, which can be recombined using a simple (from the user's POV, and definitely not from that of implementors! ;-) GUI tool. Parameters in the style patterns can be made late binding so that the final XSL is put together by the web server after determining the client configuration, or by the client itself. Certain style patterns can be defined to have an attraction profile attached so that they can adapt to changes in the DTD. Just a tool developer's point of view, Don Park http://www.quake.net/~donpark/index.html xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Sat Jan 31 00:17:10 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:00:02 2004 Subject: XML-Data Questions Message-ID: <000001bd2ddc$ec7641b0$2ee044c6@donpark> I have some questions about the XML-Data spec that affect implementation: 1. How are the schemas referenced from XML documents? For DTDs, declaration or reference is within the tag. Are we supposed to use the tag also for XML-Data schemas? Examples in the spec uses lines had nothing except some namespace tags. Will there be a MIME type (text/x-xml-data?) and file extension (foo.xdl?) for XML-Data schema? 2. How does one validate XML documents which use an XML-Data schema rather than a DTD? Am I supposed to validate the XML-Data file first and then validate the document? What about the DTD file for the XML-Data file? 3. Current XML-Data does not allow, or rather does not make it easy, for enumerated attribute values to contain spaces, because space is used as the delimiter. Why not use the following structure to define enumerated attribute values? children adult adult I would appreciate insights and comments from those more familiar with XML-Data than I am. Regards, Don Park http://www.quake.net/~donpark/index.html xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark_Harmison at wb.xerox.com Sat Jan 31 01:13:26 1998 From: Mark_Harmison at wb.xerox.com (Harmison,Mark) Date: Mon Jun 7 17:00:02 2004 Subject: First experiences with XSL Message-ID: <2F4ED23481B1677C2F4ED23481B1677C#064#X-WB-0845-MS2.xerox@SMF> Michael Kay wrote: > I've downloaded MSXSL and used it to generate HTML for a couple of document > types, successfully but with a certain amount of frustration caused by (a) > lack of diagnostics when I got things wrong, and (b) limited functionality. > > I've now implemented the same thing without XSL: I wrote an MSXML > application in Java that does a recursive walk down the document tree and >calls a registered "handler" class to process each element type. I added a > number of helper methods such as isFirstOfType() to allow the handlers to > get information about their context more easily. > ... >Any XSL enthusiasts want to prove me wrong? -------------------- These are interesting points and should spark some good discussion. I have had a long-standing interest in a notion of "compiled" stylesheets. Compiled stylesheets would be generated from interactive tools, similar to today's stylesheet editors. The editor would output formatting source which could be compiled with a standard compiler (such as Java or C++). The rendering engine would then simply call the compiled functions for each element. Both speed and richness in the run-time environment would be improved over run-time stylesheet interpretation. However, the stylesheet editor must have a way of storing the designer's original intent. XSL provides a good declarative syntax and metaphor for doing this.
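As a rough illustration of what such generated code might look like, here is a minimal sketch in Java. It assumes the standard JDK DOM parser rather than MSXML, and the class name, the Rule interface, and the SPEECH and SPEAKER rules are all hypothetical, invented for this example; it is not the output of any real stylesheet editor.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CompiledStylesheetSketch {
    /** One "compiled rule": emits output for an element and decides when to recurse. */
    interface Rule {
        void apply(Element e, StringBuilder out);
    }

    // Hypothetical rules, keyed by element name, as an editor might generate them.
    private static final Map<String, Rule> RULES = new HashMap<>();

    static {
        RULES.put("SPEECH", (e, out) -> {
            out.append("<p>");
            walkChildren(e, out);
            out.append("</p>");
        });
        RULES.put("SPEAKER", (e, out) -> {
            out.append("<b>");
            walkChildren(e, out);
            out.append("</b> ");
        });
    }

    // The "rendering engine": call the compiled rule for each element.
    static void walk(Node n, StringBuilder out) {
        if (n.getNodeType() == Node.TEXT_NODE) {
            out.append(n.getNodeValue()); // no HTML escaping in this sketch
        } else if (n.getNodeType() == Node.ELEMENT_NODE) {
            Element e = (Element) n;
            Rule rule = RULES.get(e.getTagName());
            if (rule != null) {
                rule.apply(e, out);
            } else {
                walkChildren(e, out); // default: just process the content
            }
        }
    }

    static void walkChildren(Node parent, StringBuilder out) {
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            walk(kids.item(i), out);
        }
    }

    public static String render(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        walk(doc.getDocumentElement(), out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render(
            "<PLAY><SPEECH><SPEAKER>HAMLET</SPEAKER><LINE>To be, or not to be</LINE></SPEECH></PLAY>"));
    }
}
```

The rule table is the only part that would differ from one stylesheet to the next; the walker stays the same, which is what would let an editor regenerate the rules without touching the engine.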
Non-programmers can understand "rules", "patterns", "actions", and "flow objects" pretty well. A stylesheet editor can read and write XSL easily and reliably. Because XSL is declarative, it is much easier for a program to read it than a procedural specification of formatting would be. Here's a suggestion: If you want faster formatting than a given XSL processor can give you, or you want access to the system in a way not possible through the declarative language, you write a program which turns a declarative XSL stylesheet into a compilable program which can then be extended using the compiled language. The best of both worlds. Having said all that, I've been doing stylesheet junk for about 5 years and used a bunch of different systems, and performance of the stylesheet engine is not usually the problem. Access and extensibility are more commonly the trouble. Where a good escape to native code or a good "scripting language" mechanism has existed, these have been plenty. I am both a programmer and sometime-stylesheet person. I can tell from much experience that programmers get really bored doing stylesheet type programming (just like we got bored writing reports and forms). So, we invent tools to let "lesser mortals" do that work ;-), hence stylesheet languages. If you really want to write stylesheets in Java, have fun, but I bet the fun won't last more than a few weeks. Mark Harmison Xerox Corporation Mark_Harmison@wb.xerox.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dcarlson at ontogenics.com Sat Jan 31 03:38:29 1998 From: dcarlson at ontogenics.com (Dave Carlson) Date: Mon Jun 7 17:00:02 2004 Subject: problems with emacs xml-mode Message-ID: <2.2.32.19980131033229.00e3a6c8@pop.dimensional.com> I have the xml-mode installed for emacs (running on WinNT 4.0), using the most current versions of each. Although I can edit XML docs reasonably well, I have a couple of significant problems. I'm not a regular emacs user, so maybe the solution lies in configuring emacs for NT. 1. The right mouse button doesn't bring up the context menu, but shift-rightMouse does bring up the menu. Right mouse button alone does nothing. 2. The DTD is parsed, but all element names are folded into all lower case. Does the current version of xml-mode support mixed-case element names? If so, what am I doing wrong? 3. The attributes defined for an element are added into the document tag in reverse order compared to their definition in the DTD. I'd prefer consistent ordering. For example: Produces the XML document: 4. Font highlighting has some problems. I've configured my _emacs file according to earlier posts in this list, but the text highlighting only appears after I've used the context menu to insert a new tag. Then, the text is only highlighted from that point *backward* in the document. When I first load a document, no text is highlighted. I'd be very grateful for any suggestions to fix these problems! Cheers, Dave Carlson xml-dev: A list for W3C XML Developers.
From Jon.Bosak at eng.Sun.COM Sat Jan 31 06:18:46 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:00:02 2004 Subject: Revised XML document collections Message-ID: <199801310617.WAA17218@boethius.eng.sun.com> Revised versions of my XML-tagged Religion and Shakespeare sets are now available at http://sunsite.unc.edu/pub/sun-info/xml/eg/religion.1.10.xml.zip http://sunsite.unc.edu/pub/sun-info/xml/eg/shakespeare.1.10.xml.zip As usual, I note that the documents in these collections do not exercise most of the features of XML, but they are real documents of fairly considerable size that are useful in trying out certain kinds of XML tools. They are also fun to read. I have taken advantage of this revision to incorporate some corrections that have been accumulating since these collections were first made publicly available in 1994. I would like to thank everyone who has contributed to this effort, especially the anonymous workers who created the ASCII texts upon which the marked-up versions of the religious works were based; Moby Lexical Tools, for putting the ASCII versions of Shakespeare's plays into the public domain; Eve Maler, for her help in getting my old SGML DTDs into XML; Simon St. Laurent, for finding a patch of bad markup in Macbeth; and Yuichi Tanaka, for a number of small corrections to the Old Testament and in particular for drawing my attention to the missing subdivisions in Psalm 119, which have been restored and are now reflected in new div and divtitle elements in the tstmt DTD. These files may be freely distributed as long as the integrity of the sets is maintained.
Jon ---------------------------------------------------------------------- Jon Bosak, Online Information Technology Architect, Sun Microsystems ---------------------------------------------------------------------- 901 San Antonio Road, MPK17-101 | Best is he that inuents, Palo Alto, California 94303 | the next he that followes ISO/IEC JTC1/WG4::NCITS V1::SGML Open | forth and eekes out a good Davenport Group::W3C XML WG and SIG | inuention. ---------------------------------------------------------------------- From Jon.Bosak at eng.Sun.COM Sat Jan 31 07:11:57 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:00:02 2004 Subject: XSL workshop at WWW7 Message-ID: <199801310710.XAA17449@boethius.eng.sun.com> I'm starting to accept proposals for presentations at a workshop on XSL to be held April 14, 1998, at the Seventh International World Wide Web Conference in Brisbane, Australia. See http://www7.conf.au/ for information on the WWW7 conference and http://sunsite.unc.edu/pub/sun-info/xsl/www7/wks/ for the beginnings of a workshop web page. Jon
From peter at ursus.demon.co.uk Sat Jan 31 10:34:04 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:02 2004 Subject: Revised XML document collections In-Reply-To: <199801310617.WAA17218@boethius.eng.sun.com> Message-ID: <3.0.1.16.19980131102252.09d77a24@pop3.demon.co.uk> At 22:17 30/01/98 -0800, Jon Bosak wrote: [...] > >These files may be freely distributed as long as the integrity of the >sets is maintained. > >Jon I owe Jon an apology. Because of memory and space limitations I included *part* of a Shakespeare play with the JUMBO distribution and I overlooked this (totally laudable) constraint. I shall remedy this in future versions. I hope however to distribute software which specifically processes Shakespeare files. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From billsand at ix.netcom.com Sat Jan 31 17:36:38 1998 From: billsand at ix.netcom.com (Bill Sanstrom) Date: Mon Jun 7 17:00:03 2004 Subject: MSXML C++ Parser (IE3.0 vs IE4.0) Message-ID: <199801311735.LAA21935@dfw-ix12.ix.netcom.com> I have searched high and low for a resolution to this dilemma: The MSXML C++ parser only works if IE4.0 is installed. I am trying to use the parser for data exchange between applications all residing on a local machine. I am not concerned about browsing at this time, just the parsing and creation of the XML files. The IXMLDocument::put method fails if IE3.0 is installed. I have installed and registered the msxml.dll component and am assuming the problem is caused because it depends on other components. Has anyone been able to parse a local XML file using the msxml.dll via VC++ without having IE4.0 installed? If so, what other components are required? (Note: the simple XML VC++ program I am developing works great when IE4.0 is installed). Any response will be greatly appreciated. Bill Sanstrom billsand@ix.netcom.com From tony.stewart at rivcom.com Sat Jan 31 18:16:18 1998 From: tony.stewart at rivcom.com (Tony Stewart) Date: Mon Jun 7 17:00:03 2004 Subject: XSL/XML/XLL and VRML (was: Re: Conditional actions in XSL?)
Message-ID: <4955E202FE46D11195C500609712EB6B05C193@FLPS-NTSERVER1> Len Bullard wrote: >>It can do what DTDs do well: provide a precise description of the presentation style of the interface as a set of routed behaviors. I would have thought that a good DTD doesn't do this at all. The DTD should define the information content, leaving both style and (IMO) behavior to be specified in a stylesheet that is tailored to this specific usage of the information. Thus, it is the style sheet that describes the presentation style, not the DTD. Otherwise, how are you going to reuse the information in other formats? You're not going to want to change the DTD. And you may not have permission to do so in any case. Since this is all pretty basic religious thinking, perhaps I misunderstood you. Regards, Tony Stewart RivCom "Publishing Structured Information" tony.stewart@rivcom.com From ak117 at freenet.carleton.ca Sat Jan 31 23:02:06 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:03 2004 Subject: problems with emacs xml-mode In-Reply-To: <2.2.32.19980131033229.00e3a6c8@pop.dimensional.com> References: <2.2.32.19980131033229.00e3a6c8@pop.dimensional.com> Message-ID: <199801312257.RAA00868@unready.microstar.com> Dave Carlson writes: > I have the xml-mode installed for emacs (running on WinNT 4.0), using the > most current versions of each. Although I can edit XML docs reasonably > well, I have a couple of significant problems. I'm not a regular emacs > user, so maybe the solution lies in configuring emacs for NT. > > 1.
The right mouse button doesn't bring up the context menu, but > shift-rightMouse does bring up the menu. Right mouse button alone does nothing. It depends on which Emacs you're using. With XEmacs, the right mouse button brings up the dialog; with vanilla GNU Emacs (the source of the NT port), it seems to be shift-right-mouse. Nothing in my patches should affect the mouse buttons -- have you seen the same behaviour in PSGML without my XML patches? > 2. The DTD is parsed, but all element names are folded into all lower case. > Does the current version of xml-mode support mixed-case element names? If > so, what am I doing wrong? Are you certain that you're using the latest version of the patches (from Fall 1997) and that you're actually in XML rather than SGML mode? Does it read 'XML' or 'SGML' in the mode bar at the bottom? > 3. The attributes defined for an element are added into the document tag in > reverse order compared to their definition in the DTD. I'd prefer > consistent ordering. > For example: > > > > > Produces the XML document: > > This is another general PSGML issue -- it behaves the same way without my XML patches. > 4. Font highlighting has some problems. I've configured my _emacs file > according to earlier posts in this list, but the text highlighting only > appears after I've used the context menu to insert a new tag. Then, the > text is only highlighted from that point *backward* in the document. When I > first load a document, no text is highlighted. Again, this is not directly related to the XML patches. PSGML will highlight only the parts of the document that it has already parsed. In Unix, at least, it will eventually parse ahead and highlight the whole thing. Thanks for the note, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/